Open mathfinder opened 3 months ago
The paper and the readme both say that 2.5 T tokens were trained. However, the corresponding config says 2 T tokens. ReadMe: https://github.com/allenai/OLMo/blob/26392798cbc4d9ac3898bd2949e77042220bf3f8/README.md?plain=1#L49 Config:
https://github.com/allenai/OLMo/blob/26392798cbc4d9ac3898bd2949e77042220bf3f8/configs/official/OLMo-7B.yaml#L74C1-L74C13
We had to make configuration tweaks mid-run in order to do training for more than 1 epoch (https://github.com/allenai/OLMo/issues/584). The 2.5T token count is accurate.
The paper and the readme both say that 2.5 T tokens were trained. However, the corresponding config says 2 T tokens. ReadMe: https://github.com/allenai/OLMo/blob/26392798cbc4d9ac3898bd2949e77042220bf3f8/README.md?plain=1#L49 Config:
https://github.com/allenai/OLMo/blob/26392798cbc4d9ac3898bd2949e77042220bf3f8/configs/official/OLMo-7B.yaml#L74C1-L74C13