openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

Are there plans to train it up to 1 trillion tokens? #30

Closed ninjasaid2k closed 1 year ago

ninjasaid2k commented 1 year ago

I am curious about the training scale of the OpenLLaMA language model. Are there any plans to train it up to 1 trillion tokens?

LachlanGibson commented 1 year ago

According to the 05/22/2023 update in the readme: "We expect the full 1T token training run to finish at the end of this week."

In #26 someone said they were planning to train beyond 1T tokens but "We are still figuring out the right data mixture for that."

ninjasaid2k commented 1 year ago

Is the training complete, or have there been any issues with it?

AhmedBytesBits commented 1 year ago

Is the training complete, or have there been any issues with it?

@ninjasaid2k As stated elsewhere in this repo, both the 7B and 3B models are expected to be out this coming Monday.

young-geng commented 1 year ago

Yes. We are planning to release both the 7B and 3B models on Monday.

mhenrichsen commented 1 year ago

@young-geng any updates? Super stoked for the release :)

young-geng commented 1 year ago

We've just released the 1T token final version for the 3B and 7B models. We've also released a 600B token preview for the 13B model.
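For anyone picking up the released checkpoints, here is a minimal sketch of loading one of them with Hugging Face transformers. It assumes the weights are published under the openlm-research organization on the Hugging Face Hub (the exact model id for the 13B preview may differ), and it uses plain greedy decoding just to sanity-check the download.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

# Assumed Hub id for the released 3B checkpoint; swap in the 7B
# (or 13B preview) id as appropriate.
model_path = "openlm-research/open_llama_3b"

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",
)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

# Greedy generation of a short continuation.
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```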