mk-minchul / AdaFace

MIT License

ValueError("Attempting to unscale FP16 gradients.") #127

Closed (ayush0x00 closed this issue 1 year ago)

ayush0x00 commented 1 year ago

I was trying to train AdaFace on a custom dataset with 10k classes. When the model started to train, I got ValueError("Attempting to unscale FP16 gradients."). I understand that FP16 gradients can't be unscaled and that scaling/unscaling is handled internally by AMP, but I am not able to find the root cause of the error. I have attached a screenshot of the error.

[Screenshot 2023-09-24 at 6 19 33 PM]

ayush0x00 commented 1 year ago

Got resolved. Closing the issue.

afm215 commented 1 year ago

Can you please explain what was happening, to help people who may run into this issue in the future?

ayush0x00 commented 1 year ago

I accidentally modified the PReLU layer in the original code to use the torch.float16 datatype, which was causing the issue. The original code didn't have any issues.
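
For anyone hitting the same error: PyTorch's GradScaler raises this ValueError when it is asked to unscale a gradient that is itself FP16, which happens when the parameter is stored in FP16 rather than FP32. Below is a minimal sketch of how one might look for such parameters. The helper name find_fp16_params and the toy model are hypothetical and are not taken from the AdaFace code; the sketch only assumes a standard torch.nn.Module.

```python
import torch
import torch.nn as nn


def find_fp16_params(model: nn.Module) -> list[str]:
    """Return the names of trainable parameters stored in float16."""
    return [
        name
        for name, p in model.named_parameters()
        if p.requires_grad and p.dtype == torch.float16
    ]


# Toy stand-in for a backbone block (hypothetical, not the AdaFace network).
model = nn.Sequential(nn.Linear(8, 8), nn.PReLU(8))

# Mimic the accidental change: only the PReLU layer is cast to FP16.
model[1].half()
bad = find_fp16_params(model)
print(bad)  # names of the offending parameters

# With AMP, parameters are expected to stay in FP32; autocast takes care
# of running eligible ops in FP16. Casting back removes the offenders.
model.float()
fixed = find_fp16_params(model)
print(fixed)
```

Running a check like this before the first optimizer step should point at the layer whose parameters were cast, which in this thread was the PReLU layer.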