microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

BEATs model produces NaN when using mixed precision with pytorch lightning #1569

Open tcourat opened 3 weeks ago

tcourat commented 3 weeks ago

I do not know if this is expected behavior, but I was not able to train BEATs with PyTorch Lightning in mixed precision: the extract_features method produces NaN values.

Is there a fix for this? Or am I constrained to full float32 training?
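
For context, here is a minimal sketch of one common way half precision can produce NaN. It is not confirmed to be what happens inside BEATs. The largest finite float16 value is 65504, so an intermediate result above that overflows to inf, and a later inf / inf yields NaN. The sketch uses only the Python standard library to emulate float16 rounding; the helpers `to_fp16`, `naive_softmax_fp16`, and `stable_softmax_fp16` are hypothetical names made up for illustration and are not part of BEATs, PyTorch, or Lightning.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision (hypothetical helper).

    Values too large for float16 saturate to +/-inf, mimicking fp16 overflow.
    """
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        # The "e" struct format is IEEE 754 binary16.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def naive_softmax_fp16(logits):
    """Softmax with every intermediate rounded to fp16 and no max subtraction."""
    exps = [to_fp16(math.exp(v)) for v in logits]  # exp(12) > 65504 -> inf
    total = to_fp16(sum(exps))
    return [to_fp16(e / total) for e in exps]      # inf / inf -> nan


def stable_softmax_fp16(logits):
    """Same computation, but shifted by the max so every exponent is <= 0."""
    m = max(logits)
    exps = [to_fp16(math.exp(v - m)) for v in logits]  # all values in (0, 1]
    total = to_fp16(sum(exps))
    return [to_fp16(e / total) for e in exps]


if __name__ == "__main__":
    logits = [1.0, 12.0]
    print("naive :", naive_softmax_fp16(logits))
    print("stable:", stable_softmax_fp16(logits))
```

If overflow of this kind turns out to be the cause, two options that are often tried are bfloat16 mixed precision (`precision="bf16-mixed"` in Lightning 2.x) and running the affected computation in float32 inside `torch.autocast(device_type="cuda", enabled=False)`. Neither has been verified against BEATs in this thread.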