danbraunai / simple_stories_train

Trains small LMs. Designed for training on SimpleStories

Add checkpoints throughout training #3

Closed by danbraunai 2 months ago

danbraunai commented 2 months ago

Checkpoints should be saved on a log-spaced schedule, so that they are dense early in training and sparser later.

We can use a structure similar to the one used in Pythia:

To promote research on the learning dynamics of LLMs we make 154 checkpoints available for each model, representing steps 0 (initialization), 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1000, and then every 1,000 subsequent steps.
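
A minimal sketch of how such a schedule could be generated, in case it helps with the implementation. The function name `checkpoint_steps` and its parameters `total_steps` and `linear_interval` are hypothetical and not part of this repo. It assumes powers of two are used below the linear interval and multiples of the interval afterwards, which matches the steps listed in the quote.

```python
def checkpoint_steps(total_steps: int, linear_interval: int = 1000) -> list[int]:
    """Return the sorted list of training steps at which to save a checkpoint.

    Hypothetical helper: saves at step 0, at powers of two below
    `linear_interval`, and then at every multiple of `linear_interval`
    up to and including `total_steps`.
    """
    steps = {0}  # step 0 is the initialization checkpoint

    # Log-spaced phase: 1, 2, 4, ... while below the linear interval.
    step = 1
    while step < linear_interval and step <= total_steps:
        steps.add(step)
        step *= 2

    # Linear phase: every `linear_interval` steps until the end of training.
    steps.update(range(linear_interval, total_steps + 1, linear_interval))

    return sorted(steps)


if __name__ == "__main__":
    # total_steps=143_000 is an assumption chosen so that the count
    # matches the 154 checkpoints mentioned in the quote.
    schedule = checkpoint_steps(143_000)
    print(len(schedule), schedule[:13])
```

In a training loop this could be used as `if step in save_steps:` with `save_steps = set(checkpoint_steps(num_steps))`, so the membership check stays cheap. For short runs the linear phase is simply empty and only the log-spaced steps remain.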