Closed: TobiasLee closed this issue 1 year ago.
Hi, thanks for your interest! Can you say more about what you're looking for?
We currently make all checkpoints (every 1k steps) available in Hugging Face format in repositories on the HF Hub: for example, https://huggingface.co/EleutherAI/pythia-70m/tree/step103000 stores the 70m model at step 103000 (and likewise for Pythia-12b and Pythia-7b).
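Since each step lives on its own Hub branch, a checkpoint can be loaded by passing the branch name via the `revision` argument. A minimal sketch, assuming the `transformers` library is installed (the weights are downloaded on first call):

```python
from transformers import GPTNeoXForCausalLM

def load_pythia_checkpoint(size: str, step: int):
    """Load e.g. EleutherAI/pythia-70m at Hub branch 'step103000'."""
    repo = f"EleutherAI/pythia-{size}"
    return GPTNeoXForCausalLM.from_pretrained(repo, revision=f"step{step}")

# Fetch the 70m model at training step 103000.
model = load_pythia_checkpoint("70m", 103000)
```

The same pattern works for the tokenizer (`AutoTokenizer.from_pretrained(..., revision=...)`) and for the larger models, substituting the model size in the repo name.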
If there is some analysis you want to do that requires either retraining or examining the optimizer states, we're more than happy to upload specific checkpoints upon request! However, we likely would not be able to upload every NeoX checkpoint + optimizer state to Hugging Face due to storage constraints (optimizer states increase storage requirements by at least 6x). Does what you intend to do require the non-HF models?
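The multiplier above can be sanity-checked with back-of-the-envelope arithmetic. A minimal sketch, assuming mixed-precision training with Adam (fp16 weights on disk, plus an fp32 master copy and two fp32 Adam moments in the optimizer state); the exact layout of NeoX checkpoints may differ:

```python
def checkpoint_bytes(n_params: int, with_optim: bool) -> int:
    """Rough bytes on disk per checkpoint under the assumptions above."""
    weights = 2 * n_params           # fp16 model weights
    if not with_optim:
        return weights
    master = 4 * n_params            # fp32 master copy of the weights
    moments = 2 * 4 * n_params       # fp32 Adam first and second moments
    return weights + master + moments

n = 12_000_000_000  # roughly Pythia-12b
plain = checkpoint_bytes(n, with_optim=False)
full = checkpoint_bytes(n, with_optim=True)
print(full / plain)  # 7.0 under these assumptions, consistent with "at least 6x"
```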
Closing because we have HF-format models already public! Please don't hesitate to reopen if you need optimizer states or anything else. :)
Hi Hailey,
Not the OP, but I am in fact wondering if I could have access to the optimizer states. The optimizer state for the last checkpoint would be the most helpful, and optimizer states for ~10 evenly spaced checkpoints throughout training would be more than sufficient for me.
Hi @jiahai-feng , yes, I can get these uploaded for you this week!
Hi @jiahai-feng, I've uploaded all optim states to neox-ckpt-pythia-160m-v1, and likewise for 160m-deduped! Will continue uploading more models' checkpoints.
Hi, thanks for your great work!
I am investigating the emergent abilities that arise during LLM training, and I'd like to request the intermediate checkpoints of Pythia-12b or Pythia-7b, i.e., checkpoints saved every 1k steps. Could you kindly upload these checkpoints to Hugging Face for me?