haozhouamzn closed this issue 1 year ago
Hello.
I am wondering what precision strategy was applied during the pre-training of OpenLLaMA. Was it fp32, fp16, bf16, or mixed precision?
Thank you in advance.
We used dtype='fp32', since for small models using fp16 does not give much of a speed improvement.
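For context, here is a minimal JAX sketch of how a config string like 'fp32' typically propagates into parameter and activation dtypes. This is not the actual OpenLLaMA/EasyLM training code; the `DTYPES` mapping and the function names are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

# Hypothetical mapping from a config string to a JAX dtype
DTYPES = {"fp32": jnp.float32, "fp16": jnp.float16, "bf16": jnp.bfloat16}

def init_params(key, d_in, d_out, dtype=jnp.float32):
    # Parameters are created directly in the chosen precision
    w = jax.random.normal(key, (d_in, d_out), dtype=dtype)
    b = jnp.zeros((d_out,), dtype=dtype)
    return w, b

def forward(params, x):
    w, b = params
    # Activations are cast to match the parameter dtype
    return x.astype(w.dtype) @ w + b

key = jax.random.PRNGKey(0)
params = init_params(key, 512, 512, dtype=DTYPES["fp32"])
x = jnp.ones((4, 512))
print(forward(params, x).dtype)  # float32
```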
Thanks, Young. Just to confirm: fp32 was used for all three models (3B, 7B, and 13B)?
It was fp32 for all of them.