itsnamgyu closed this issue 6 months ago
Thanks. @oskarvanderwal would it be possible to share the pre-training loss curves of the smaller Pythia models?
Hi @itsnamgyu, we are actually collecting all the loss curves for the smaller Pythia models (14m, 31m, 70m, 160m, 410m) for different seeds, and we'll share them on the Pythia github once finished. Note: these are for the non-deduped training corpus.
In the meantime, you can find some of these curves here (not the original ones from the paper): https://wandb.ai/eleutherai/pythia-extra-seeds/reports/Some-loss-curves-for-smaller-Pythia-models--Vmlldzo2NTkxNDIw
Be aware that if we had to stop and continue the training of a particular model (e.g., because of run priority), WandB logs these as separate runs!
The original loss curves are in this WandB project: https://wandb.ai/eleutherai/pythia, but these logs are much harder to navigate. Again, we are collecting these for the smaller models as well and will share them in a CSV file on this GitHub repo.
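Since an interrupted-and-resumed training run shows up as separate WandB runs, anyone downloading these curves will need to stitch the segments back together. Here is a minimal sketch of one way to do that, assuming each segment has been exported to a DataFrame with `step` and `loss` columns (the column names and the `stitch_runs` helper are illustrative, not part of our actual export pipeline):

```python
import pandas as pd

def stitch_runs(segments):
    """Combine loss-curve segments from a training run that was stopped
    and resumed (and therefore logged as separate WandB runs).

    `segments` is a list of DataFrames, each with 'step' and 'loss'
    columns, ordered from earliest segment to latest.
    """
    combined = pd.concat(segments, ignore_index=True)
    # If a restart re-logged some steps, keep the latest value per step.
    combined = combined.drop_duplicates(subset="step", keep="last")
    return combined.sort_values("step").reset_index(drop=True)

# Example: two segments whose step ranges overlap at the restart point.
seg1 = pd.DataFrame({"step": [0, 1, 2], "loss": [3.0, 2.5, 2.2]})
seg2 = pd.DataFrame({"step": [2, 3], "loss": [2.1, 2.0]})
curve = stitch_runs([seg1, seg2])
print(curve)
```

The same idea applies whether the segments come from exported CSVs or from `wandb.Api` run histories; the key point is de-duplicating overlapping steps in favor of the resumed run.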
Thanks! This will be a huge help.
We are working on collecting them. Unfortunately, we didn't have WandB configured in a way that makes that easy, and we are struggling to clean it up after the fact.
If you're okay with only having the ones for small models, @oskarvanderwal is retraining some of them with different random seeds and (I think) has better logging set up.