openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

Are there plans to train it up to 1 trillion tokens? #30

Closed ninjasaid2k closed 1 year ago

ninjasaid2k commented 1 year ago

I am curious about the training scale of the OpenLLaMA language model. Are there any plans to train it up to 1 trillion tokens?

LachlanGibson commented 1 year ago

According to the 05/22/2023 update in the readme: "We expect the full 1T token training run to finish at the end of this week."

In #26 someone said they were planning to train beyond 1T tokens but "We are still figuring out the right data mixture for that."

ninjasaid2k commented 1 year ago

Is the training complete, or have there been any issues with it?

AhmedBytesBits commented 1 year ago

Is the training complete, or have there been any issues with it?

@ninjasaid2k As stated elsewhere in this repo, both the 7B and 3B models are expected to be out this coming Monday.

young-geng commented 1 year ago

Yes. We are planning to release both the 7B and 3B models on Monday.

mhenrichsen commented 1 year ago

@young-geng any updates? Super stoked for the release :)

young-geng commented 1 year ago

We've just released the 1T token final version for the 3B and 7B models. We've also released a 600B token preview for the 13B model.
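For anyone picking up the released checkpoints, here is a minimal sketch of loading one of them with Hugging Face transformers. It assumes the weights are published under the openlm-research organization on the Hugging Face Hub (the exact model id for the 13B preview may differ), and it uses plain greedy decoding just to sanity-check the download.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

# Assumed Hub id for the released 3B checkpoint; swap in the 7B
# (or 13B preview) id as appropriate.
model_path = "openlm-research/open_llama_3b"

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",
)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

# Greedy generation of a short continuation.
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```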