Training with mixed precision is returning NaN

plemeri / InSPyReNet

Official PyTorch implementation of Revisiting Image Pyramid Structure for High Resolution Salient Object Detection (ACCV 2022)

MIT License


Training with mixed precision is returning NaN #10

Closed: darkbobin closed this issue 1 year ago

darkbobin commented 1 year ago

Thank you for your great work.

When training with mixed_precision: True (https://github.com/plemeri/InSPyReNet/blob/bfe08191800dc0d7212a931052e8ef4b679d7db7/configs/InSPyReNet_SwinB.yaml#L39), the model returns NaN.

Do you have any idea how to fix this issue? Thanks.

plemeri commented 1 year ago

Hi, thank you for your interest in our work. We tried mixed_precision to speed up training, but ended up not using it because it resulted in much worse performance than the default setting. It was only used in the early stage of our research, so we recommend not enabling it.

Thank you.
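For readers wondering where NaN values can come from in half precision at all, here is a minimal sketch of one common mechanism. It is not taken from the InSPyReNet code and it does not diagnose this specific issue. fp16 cannot represent finite values above 65504, so a large intermediate result overflows to infinity, and a later operation such as inf - inf yields NaN. The helper `to_fp16` and the value of `activation` are made up for illustration, and the sketch uses only the Python standard library so it runs without a GPU.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to the nearest fp16 value; overflow saturates to +/-inf."""
    try:
        # The "e" format code packs a float as IEEE 754 binary16.
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack out-of-range finite values, whereas fp16
        # arithmetic would produce infinity here, so we mimic that.
        return math.copysign(math.inf, x)


activation = 300.0                          # hypothetical intermediate value
squared = to_fp16(activation * activation)  # 90000 > FP16_MAX, becomes inf
difference = squared - squared              # inf - inf is NaN

print(squared, difference)
```

Mixed-precision training setups usually guard against this with loss scaling and by keeping sensitive operations in fp32, but as noted above, the authors recommend leaving mixed_precision disabled for this repository.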

darkbobin commented 1 year ago

Thank you for your answer. Do you have any indication of how the model performs when it is converted to fp16 after training, though?
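As a rough illustration of what post-training conversion to fp16 does to individual weights, the sketch below rounds a few made-up values to half precision and compares a toy dot product before and after. The weights, inputs, and the helper `fp16_round` are assumptions for illustration only. This says nothing about InSPyReNet's actual accuracy in fp16, which would have to be measured by comparing model outputs on real validation images.

```python
import struct


def fp16_round(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value (in-range inputs only)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Made-up values, all inside the fp16 normal range.
weights = [0.1, -0.75, 1e-3, 3.14159, 250.0]
inputs = [1.0, 2.0, -3.0, 0.5, 0.01]

# Toy "layer output" with the original weights and with fp16-rounded weights.
full = sum(w * x for w, x in zip(weights, inputs))
half = sum(fp16_round(w) * x for w, x in zip(weights, inputs))

# Largest relative rounding error over the weights.
max_rel_err = max(abs(fp16_round(w) - w) / abs(w) for w in weights)

print(f"output with original weights: {full:.6f}")
print(f"output with fp16 weights:     {half:.6f}")
print(f"max relative weight error:    {max_rel_err:.2e}")
```

For values in the normal fp16 range, rounding changes each weight by a relative error of at most 2^-11 (about 0.05%). Whether that is acceptable for a given network depends on how errors accumulate through its layers and on whether any activations exceed the fp16 range at inference time.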

plemeri commented 1 year ago

We're planning to release an ONNX runtime for transparent-background, our command-line tool and Python API powered by InSPyReNet. Stay tuned for updates!