Open yuvalkirstain opened 2 years ago
Thank you for this issue! We are currently working on adding support for bf16 and hope to have it done very soon :)
Assuming you meant supporting bf16 with FSDP? Or were you thinking of another API?
Exactly, bf16 with FSDP!
@anj-s please let me know if there is anything we can do to help, having support for bf16 with FSDP in Fairseq will really really help us! :)
Hi, has there been any progress with resolving this issue? @anj-s Thank you so much
Hi @yuvalkirstain, I think this should work without any issues. Can you try using bfloat16 by passing the right compute_dtype argument when using FSDP? Unfortunately I haven't had a chance to add a unit test, but perhaps someone else on the team has looked into this. cc @anupambhatnagar @min-xu-ai
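For anyone wanting to try the suggestion above, here is a minimal sketch of bf16 mixed-precision compute in plain PyTorch. Note the FSDP call in the comment is an assumption about fairscale's API (compute_dtype alongside mixed_precision) rather than a tested invocation; the runnable part below only demonstrates the underlying idea of keeping fp32 master weights while computing in bfloat16:

```python
import torch
import torch.nn as nn

# Not the FSDP code path itself: with fairscale's FSDP the analogous knob
# would be something like (assumed, untested here):
#   FSDP(model, mixed_precision=True, compute_dtype=torch.bfloat16)
model = nn.Linear(8, 4)   # parameters stay in fp32
x = torch.randn(2, 8)

# autocast runs the matmul in bfloat16 while master weights remain fp32
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype)             # activations are bfloat16
print(model.weight.dtype)  # master weights are still float32
```

The same pattern is what FSDP's mixed-precision mode automates across shards: low-precision compute, full-precision parameter storage.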
bfloat16 support with PyTorch Lightning would be even better. Have you considered this?
Is there any progress on this issue? I'm also wondering whether it would work if I simply applied the branch mentioned above.
There has been no progress on this so far.
Feature Request
Please support BF16 mixed-precision
Additional context
Training with BF16 is usually more stable than with fp16, which matters a great deal when training large models. Additionally, many models (e.g., T5) were pretrained in BF16; if we want to continue training them with mixed precision, using fp16 results in NaNs.
Thank you!
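To make the stability point above concrete: fp16 has only 5 exponent bits, so its largest finite value is 65504, while bf16 keeps fp32's 8 exponent bits and reaches roughly 3.4e38. Activations that are routine for a bf16-pretrained model like T5 can therefore overflow to inf (and then NaN) under fp16. A quick arithmetic check of the two formats' ranges, derived from their bit layouts:

```python
# Largest finite values, from each format's bit layout:
# fp16: 5 exponent bits (bias 15), 10 mantissa bits
# bf16: 8 exponent bits (bias 127), 7 mantissa bits (same exponent range as fp32)
fp16_max = (2 - 2**-10) * 2**15   # = 65504.0
bf16_max = (2 - 2**-7) * 2**127   # ~ 3.39e38

print(fp16_max)
print(bf16_max)
print(bf16_max / fp16_max)  # bf16's range exceeds fp16's by ~33 orders of magnitude
```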