EleutherAI / pythia

The hub for EleutherAI's work on interpretability and learning dynamics
Apache License 2.0

The loss of Pythia training #97

Closed Wangpeiyi9979 closed 1 year ago

Wangpeiyi9979 commented 1 year ago

Hi, is there any information available about the loss curve of the Pythia pre-training process, like the one published for LLaMA?

[image attached]
haileyschoelkopf commented 1 year ago

https://wandb.ai/eleutherai/pythia?workspace=user-schoelkopf

Hi! We have a public wandb board (linked above) for our training runs. If you filter out runs with "crashed" status and runs that lasted less than 1 hour, the runs named "v2-MODELSIZE" should be what you want. (I can help point out specific runs if needed; it is on my to-do list to clean this up.)
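If you would rather pull the curves programmatically than browse the web UI, here is a minimal sketch using the wandb public API that applies the same filtering described above (skip crashed runs and runs under one hour, keep the "v2-*" runs). The metric key `train/lm_loss` and the runtime/name checks are assumptions, not something guaranteed by the board, so adjust them to whatever the runs actually log.

```python
# Minimal sketch (not an official EleutherAI script) for fetching Pythia
# training-loss histories from the public wandb project linked above.
import wandb

api = wandb.Api()

# All runs in the public project; path is "entity/project".
runs = api.runs("eleutherai/pythia")

for run in runs:
    # Skip crashed runs and runs shorter than one hour (runtime is in seconds).
    too_short = (run.summary.get("_runtime", 0) or 0) < 3600
    if run.state == "crashed" or too_short:
        continue
    # Keep only the "v2-MODELSIZE" runs mentioned above (name pattern assumed).
    if not run.name.startswith("v2-"):
        continue

    # Pull the logged loss history; "train/lm_loss" is an assumed metric key.
    history = run.history(keys=["train/lm_loss"], samples=2000)
    print(run.name, history.tail(1))
```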

Wangpeiyi9979 commented 1 year ago

Thanks

xiaoda99 commented 11 months ago

Hi, I can only find runs for the 160M model in the wandb link above, and loading the many pages is very slow. Could you provide a cleaned-up version of the runs for the v2 6.9B and 12B models (without dedup)? I'm experimenting with some new transformer architectures and want to compare my training loss against Pythia's.