openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

Any plan for OpenLLaMA-65B? #71

Closed · JingxinLee closed this 1 year ago

JingxinLee commented 1 year ago

Do you have a plan for releasing OpenLLaMA-65B? Or have you perhaps hit a bottleneck in the number of GPUs available? Thanks

young-geng commented 1 year ago

Unfortunately we don't have the compute resources to do that.

JingxinLee commented 1 year ago

> Unfortunately we don't have the compute resources to do that.

So how many H100/H800 GPUs would you need to train an OpenLLaMA-65B model? May I ask what the minimum number is?

young-geng commented 1 year ago

I believe you will need at least 1000 of them to finish the training within a month.

JingxinLee commented 1 year ago

If I don't mind how long the training takes, for example if I have 5 months to train a 65B model, can I say that I need at least 1000/5 = 200 GPUs? And if I have 10 months, that I need at least 1000/10 = 100 GPUs? Thanks

young-geng commented 1 year ago

That sounds about right.
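
The inverse scaling discussed above amounts to treating the training run as a fixed compute budget of roughly 1000 GPU-months (young-geng's estimate) and dividing it by the calendar time available. A minimal sketch of that arithmetic follows; the budget constant and the assumption of perfect scaling efficiency are taken from this thread, while in practice larger clusters lose some efficiency to communication overhead, so the real minimum at short timelines would be somewhat higher.

```python
import math

# Rough budget from the thread: ~1000 H100-class GPUs for about one month,
# i.e. roughly 1000 GPU-months total. This is an estimate, not a measurement.
GPU_MONTH_BUDGET = 1000


def gpus_needed(months: float, budget: float = GPU_MONTH_BUDGET) -> int:
    """Minimum GPU count if the same total compute is spread over `months`,
    assuming (optimistically) perfect scaling efficiency."""
    return math.ceil(budget / months)


if __name__ == "__main__":
    for months in (1, 5, 10):
        print(f"{months:>2} months -> at least {gpus_needed(months)} GPUs")
    # Expected output:
    #  1 months -> at least 1000 GPUs
    #  5 months -> at least 200 GPUs
    # 10 months -> at least 100 GPUs
```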