allenai / OLMo

Modeling, training, eval, and inference code for OLMo
https://allenai.org/olmo
Apache License 2.0
4.75k stars 485 forks

Difference between 0724 and 0424 7B models #746

Open jiahai-feng opened 2 weeks ago

jiahai-feng commented 2 weeks ago

📚 The doc issue

What is the difference between the 0724 and 0424 7B models? I can't find documentation anywhere. The official config files appear to be identical. Judging by the intermediate checkpoints, 0724 looks like a continuation of 0424 that resumes from the pre-annealing checkpoint. If so, what is the LR schedule for the continuation, and what is the additional dataset?
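
A minimal sketch of one way to check whether two training configs are identical, assuming both have already been loaded into nested dicts (for example with `yaml.safe_load` from PyYAML). The helper name `diff_configs` and all keys and values below are hypothetical placeholders, not the real OLMo configs, and this is not necessarily how the comparison described above was done.

```python
from typing import Any

_MISSING = "<missing>"  # marker for a key that is present in only one config


def diff_configs(a: dict, b: dict, prefix: str = "") -> list[tuple[str, Any, Any]]:
    """Return (dotted_path, value_in_a, value_in_b) for every differing leaf."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}.{key}" if prefix else str(key)
        va = a.get(key, _MISSING)
        vb = b.get(key, _MISSING)
        if isinstance(va, dict) and isinstance(vb, dict):
            # Both sides are sections: recurse and report only differing leaves.
            diffs.extend(diff_configs(va, vb, path))
        elif va != vb:
            diffs.append((path, va, vb))
    return diffs


# Hypothetical stand-ins for two loaded config files (values are made up).
cfg_a = {"model": {"d_model": 4096, "n_layers": 32},
         "optimizer": {"learning_rate": 3e-4}}
cfg_b = {"model": {"d_model": 4096, "n_layers": 32},
         "optimizer": {"learning_rate": 3e-4}}
# A variant with a changed value and an extra section, to show a non-empty diff.
cfg_c = {"model": {"d_model": 4096, "n_layers": 32},
         "optimizer": {"learning_rate": 1e-4},
         "scheduler": {"name": "linear"}}

identical = diff_configs(cfg_a, cfg_b)
changed = diff_configs(cfg_a, cfg_c)

for path, old, new in changed:
    print(f"{path}: {old!r} -> {new!r}")
```

An empty result means the two configs match key for key. That would suggest any difference between the models lies outside the config, for instance in the checkpoint that training resumed from or in the data that was used.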
