jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0
7.31k stars, 426 forks

how to determine reasonable max steps? #138

Open ScottishFold007 opened 5 months ago

ScottishFold007 commented 5 months ago

Hello, you have a great project! It has been very beneficial. One question I would like to ask: how do you determine a reasonable max steps value based on the amount of data available (e.g. token count and model parameter count)? Or do you have any good ideas in this regard?

ChaosCodes commented 5 months ago

Hi, when using a cosine LR schedule you can determine the max steps from how many tokens you want to train on: divide the total token budget by the number of tokens consumed per optimizer step.
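To make the arithmetic concrete, here is a minimal sketch of that calculation, assuming one optimizer step consumes micro batch size × gradient accumulation steps × device count × sequence length tokens. The function names, parameters, and example numbers below are illustrative assumptions, not TinyLlama's actual code or configuration. The cosine function shows why max steps matters: the learning rate only reaches its floor at that step.

```python
import math


def max_steps_for_token_budget(total_tokens, micro_batch_size,
                               grad_accum_steps, num_devices, seq_len):
    """Optimizer steps needed to consume `total_tokens` (hypothetical helper)."""
    # Tokens seen per optimizer step across all devices.
    tokens_per_step = micro_batch_size * grad_accum_steps * num_devices * seq_len
    # Round up so the whole token budget is covered.
    return math.ceil(total_tokens / tokens_per_step)


def cosine_lr(step, max_steps, peak_lr, min_lr, warmup_steps=0):
    """Cosine decay from peak_lr to min_lr, with optional linear warmup."""
    if warmup_steps > 0 and step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    if step >= max_steps:
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Illustrative numbers only: a 1B-token budget on 8 devices.
    steps = max_steps_for_token_budget(
        total_tokens=1_000_000_000,
        micro_batch_size=8,
        grad_accum_steps=4,
        num_devices=8,
        seq_len=2048,
    )
    print("max_steps:", steps)
    print("lr at halfway:", cosine_lr(steps // 2, steps, peak_lr=4e-4, min_lr=4e-5))
```

If max steps is set larger than the token budget allows, training stops before the learning rate has decayed to its minimum; if it is set smaller, the schedule flattens at the minimum before the data runs out.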