Closed: TobiasLee closed this issue 1 year ago.
Hi, thanks for your interest! Can you say more about what you're looking for?
We currently make all checkpoints (every 1k steps) available in Hugging Face format in repositories on the HF Hub: for example, https://huggingface.co/EleutherAI/pythia-70m/tree/step103000 stores the 70m model at step 103000 (and likewise for Pythia-12b and Pythia-7b).
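Since each step lives on its own Hub branch, a checkpoint can be loaded by passing the branch name via the `revision` argument. A minimal sketch, assuming the `transformers` library is installed (the weights are downloaded on first call):

```python
from transformers import GPTNeoXForCausalLM

def load_pythia_checkpoint(size: str, step: int):
    """Load e.g. EleutherAI/pythia-70m at Hub branch 'step103000'."""
    repo = f"EleutherAI/pythia-{size}"
    return GPTNeoXForCausalLM.from_pretrained(repo, revision=f"step{step}")

# Fetch the 70m model at training step 103000.
model = load_pythia_checkpoint("70m", 103000)
```

The same pattern works for the tokenizer (`AutoTokenizer.from_pretrained(..., revision=...)`) and for the larger models, substituting the model size in the repo name.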
If there is some analysis you want to do that requires either retraining or examining the optimizer states, we're more than happy to upload specific checkpoints upon request! However, we likely would not be able to upload every NeoX checkpoint + optimizer state to Hugging Face due to storage constraints (optimizer states increase storage requirements by at least 6x). Does what you intend to do require the non-HF models?
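The multiplier above can be sanity-checked with back-of-the-envelope arithmetic. A minimal sketch, assuming mixed-precision training with Adam (fp16 weights on disk, plus an fp32 master copy and two fp32 Adam moments in the optimizer state); the exact layout of NeoX checkpoints may differ:

```python
def checkpoint_bytes(n_params: int, with_optim: bool) -> int:
    """Rough bytes on disk per checkpoint under the assumptions above."""
    weights = 2 * n_params           # fp16 model weights
    if not with_optim:
        return weights
    master = 4 * n_params            # fp32 master copy of the weights
    moments = 2 * 4 * n_params       # fp32 Adam first and second moments
    return weights + master + moments

n = 12_000_000_000  # roughly Pythia-12b
plain = checkpoint_bytes(n, with_optim=False)
full = checkpoint_bytes(n, with_optim=True)
print(full / plain)  # 7.0 under these assumptions, consistent with "at least 6x"
```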
Closing because we have HF-format models already public! Please don't hesitate to reopen if you need optimizer states or anything else. :)
Hi Hailey,
Not the OP, but I am in fact wondering if I could have access to the optimizer states. The optimizer state for the last checkpoint would be the most helpful, and optimizer states for ~10 evenly spaced checkpoints throughout training would be more than sufficient for me.
Hi @jiahai-feng , yes, I can get these uploaded for you this week!
Hi @jiahai-feng, I've uploaded all optim states to neox-ckpt-pythia-160m-v1, and likewise for 160m-deduped! Will continue uploading more models' checkpoints.
Hi, thanks for your great work!
I am investigating the emergent abilities that arise during LLM training, and I'd like to request the intermediate checkpoints of Pythia-12b or Pythia-7b, i.e., checkpoints saved every 1k steps. Could you kindly upload these checkpoints to Hugging Face for me?