openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset

What precision is used for model pre-training? #67

Closed: haozhouamzn closed this issue 1 year ago

haozhouamzn commented 1 year ago

Hello.

I am wondering what precision strategy was applied during pre-training.

Is it fp32, fp16, bf16, or mixed precision?

Thank you in advance.

gjmulder commented 1 year ago

What was the answer, please?

haozhouamzn commented 1 year ago

> What was the answer, please?

Full fp32. See https://github.com/young-geng/EasyLM/issues/71.
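For readers unfamiliar with the distinction being asked about, here is a minimal JAX sketch contrasting full-fp32 compute with a bf16 mixed-precision forward pass. This is not the actual EasyLM training code; the parameter layout and function names are illustrative only.

```python
import jax.numpy as jnp

# Full fp32: parameters, activations, and gradients all kept in float32,
# as reported for OpenLLaMA's pre-training.
params = {"w": jnp.ones((4, 4), dtype=jnp.float32)}

# A common mixed-precision alternative (NOT what OpenLLaMA used):
# keep master weights in fp32 but cast to bf16 for the forward pass.
def forward(params, x, compute_dtype=jnp.float32):
    w = params["w"].astype(compute_dtype)
    return x.astype(compute_dtype) @ w

x = jnp.ones((2, 4))
y_fp32 = forward(params, x)                    # full fp32 compute
y_bf16 = forward(params, x, jnp.bfloat16)      # bf16 compute, fp32 weights
print(y_fp32.dtype, y_bf16.dtype)              # float32 bfloat16
```

Full fp32 avoids the loss-scaling and numerical-range issues that fp16 training can introduce, at the cost of roughly double the memory and compute of a bf16 setup.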