jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Apache License 2.0
7.61k stars, 444 forks

Release format + artefact #39

PierreColombo closed this issue 11 months ago

PierreColombo commented 11 months ago

Dear Authors, thanks so much for your amazing project. Would it be possible for you to release the following:

  1. the optimizer states
  2. the scheduler
  3. a checkpoint just before cooling down the model

These would be highly valuable artefacts for continuing to train the model!
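
To illustrate why the scheduler state is requested alongside the weights, here is a minimal, framework-free sketch. It assumes a linear-warmup, cosine-decay schedule; the thread does not say which schedule TinyLlama uses, and the class name `CosineSchedule` and all hyperparameter values below are hypothetical. The point is only that the learning rate depends on the step counter, so resuming without that state restarts from the wrong learning rate.

```python
import math
from dataclasses import dataclass, asdict


@dataclass
class CosineSchedule:
    """Hypothetical linear-warmup + cosine-decay schedule (illustration only)."""

    peak_lr: float
    min_lr: float
    warmup_steps: int
    total_steps: int
    step: int = 0  # the only mutable state; it must be saved to resume correctly

    def lr(self) -> float:
        """Learning rate at the current step."""
        if self.step < self.warmup_steps:
            # Linear warmup from peak_lr / warmup_steps up to peak_lr.
            return self.peak_lr * (self.step + 1) / self.warmup_steps
        decay_steps = max(1, self.total_steps - self.warmup_steps)
        progress = min((self.step - self.warmup_steps) / decay_steps, 1.0)
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        return self.min_lr + (self.peak_lr - self.min_lr) * cosine

    def state_dict(self) -> dict:
        """Everything needed to rebuild the schedule at the same point."""
        return asdict(self)

    @classmethod
    def from_state_dict(cls, state: dict) -> "CosineSchedule":
        return cls(**state)


# Hypothetical values: a run that is 60 steps into a 110-step schedule.
saved = CosineSchedule(peak_lr=4e-4, min_lr=4e-5, warmup_steps=10,
                       total_steps=110, step=60)
restored = CosineSchedule.from_state_dict(saved.state_dict())
fresh = CosineSchedule(peak_lr=4e-4, min_lr=4e-5, warmup_steps=10,
                       total_steps=110)
```

Here `restored` continues at the same learning rate as `saved`, while `fresh` starts again from the warmup value. A checkpoint taken before the cool-down phase sits at a point on the schedule where the learning rate is still well above its final value, which is what makes it attractive for continued training.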

Thanks so much, and congratulations on your work! Pierre

jzhang38 commented 11 months ago

We will upload all intermediate checkpoints (one every 10B tokens) to HuggingFace and ModelScope soon.

Once we finish uploading the model weights we will consider uploading the optimizer states.

PierreColombo commented 11 months ago

Thanks

jzhang38 commented 11 months ago

All intermediate checkpoints are here: https://huggingface.co/TinyLlama/tinyLlama-intermediate-checkpoints/tree/step-480k-token-1007B
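
Each intermediate checkpoint appears to live on its own branch of that repository, as the `tree/step-480k-token-1007B` part of the link suggests. A minimal sketch of fetching one, assuming the `huggingface_hub` package is installed: the helper names `parse_checkpoint_url` and `download_checkpoint` are hypothetical, while `snapshot_download` with its `repo_id` and `revision` parameters is the library's own function.

```python
from typing import Optional
from urllib.parse import urlparse


def parse_checkpoint_url(url: str) -> tuple[str, str]:
    """Split a Hugging Face 'tree' URL into (repo_id, revision).

    Assumes the layout <owner>/<repo>/tree/<revision> and a revision
    name without slashes.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 4 or parts[2] != "tree":
        raise ValueError(f"not a Hugging Face tree URL: {url}")
    return "/".join(parts[:2]), parts[3]


def download_checkpoint(url: str, local_dir: Optional[str] = None) -> str:
    """Download every file of the given revision; returns the local path."""
    # Imported lazily so the parsing helper works without the dependency.
    from huggingface_hub import snapshot_download

    repo_id, revision = parse_checkpoint_url(url)
    return snapshot_download(repo_id=repo_id, revision=revision,
                             local_dir=local_dir)


URL = ("https://huggingface.co/TinyLlama/"
       "tinyLlama-intermediate-checkpoints/tree/step-480k-token-1007B")
repo_id, revision = parse_checkpoint_url(URL)
```

Calling `download_checkpoint(URL)` would then fetch that revision over the network. Other checkpoints would be selected by changing the revision name, assuming they follow the same branch naming pattern.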