EleutherAI / pythia

The hub for EleutherAI's work on interpretability and learning dynamics

Would it be possible to share training loss curves on the original Pythia models? #145

Closed itsnamgyu closed 6 months ago

StellaAthena commented 6 months ago

We are working on collecting them. Unfortunately, we didn't have WandB configured in a way that makes that easy, and we are struggling to clean it up after the fact.

If you're okay with only having the ones for small models, @oskarvanderwal is retraining some of them with different random seeds and (I think) has better logging set up.

itsnamgyu commented 6 months ago

Thanks. @oskarvanderwal, would it be possible to share the pre-training loss curves of the smaller Pythia models?

oskarvanderwal commented 6 months ago

Hi @itsnamgyu, we are actually collecting all the loss curves for the smaller Pythia models (14m, 31m, 70m, 160m, 410m) across different seeds, and we'll share them in the Pythia GitHub repo once finished. Note: these are for the non-deduped training corpus.

In the meantime, you can find some of these curves here (not the original ones from the paper): https://wandb.ai/eleutherai/pythia-extra-seeds/reports/Some-loss-curves-for-smaller-Pythia-models--Vmlldzo2NTkxNDIw

Be aware that if we had to stop and resume training of a particular model (e.g., because of run priority), WandB logs the segments as separate runs!
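If it helps while the CSVs are being prepared, the WandB public API can pull these logs and stitch resumed segments back into one curve. A minimal sketch, assuming the `wandb` and `pandas` packages are installed; the metric key `train/loss` is an assumption and may differ in these runs:

```python
import pandas as pd
import wandb

api = wandb.Api()

# Pull every run in the seeds project; filter by run name as needed.
runs = api.runs("eleutherai/pythia-extra-seeds")

frames = []
for run in runs:
    # run.history() returns a sampled pandas DataFrame of logged metrics.
    # "train/loss" is an assumed key; check the WandB UI for the actual one.
    hist = run.history(keys=["train/loss"])
    hist["run_name"] = run.name
    frames.append(hist)

# Sorting by the global step merges stopped-and-resumed segments,
# which WandB records as separate runs, into a single curve.
curve = pd.concat(frames, ignore_index=True).sort_values("_step")
```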

The original loss curves are in this WandB project: https://wandb.ai/eleutherai/pythia

These logs are much harder to navigate, however. Again, we are collecting these for the smaller models as well and will share them in a CSV file in this GitHub repo.
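Once the CSV is published, loading and plotting it should be straightforward. A hypothetical sketch only, since the file name and column layout aren't fixed yet (`pythia-70m-seed1.csv`, `step`, and `loss` below are all placeholders):

```python
import pandas as pd
import matplotlib.pyplot as plt

# File name and column names are placeholders; check the published CSV.
df = pd.read_csv("pythia-70m-seed1.csv")
plt.plot(df["step"], df["loss"])
plt.xlabel("Training step")
plt.ylabel("Loss")
plt.show()
```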

itsnamgyu commented 6 months ago

Thanks! This will be a huge help.