Closed rehulisw closed 1 year ago
Thanks for your work! I am trying to reproduce the base model for Autoformer, but the loss sometimes becomes NaN between epochs 200 and 300. Do you have any idea how to solve this problem?
Hi @rehulisw, thanks for your interest in our work!
You can try disabling AMP by adding the argument --no-amp, then resume training from the checkpoint saved at around epoch 200.
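For anyone hitting the same issue, here is a rough sketch of how such a flag and a NaN guard could be wired up. It is not the repository's actual code: only --no-amp comes from this thread, while the --resume option, the checkpoint filename, and the helper names build_parser and check_loss are assumptions made for illustration.

```python
import argparse
import math


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only --no-amp is named in the thread;
    # --amp and --resume are assumed for this sketch.
    parser = argparse.ArgumentParser(description="training flags (sketch)")
    parser.add_argument("--amp", dest="amp", action="store_true",
                        help="enable automatic mixed precision")
    parser.add_argument("--no-amp", dest="amp", action="store_false",
                        help="train in full precision")
    parser.set_defaults(amp=True)
    parser.add_argument("--resume", default="",
                        help="path of the checkpoint to resume from")
    return parser


def check_loss(loss_value: float, epoch: int) -> None:
    # Stop as soon as the loss is NaN or infinite, so it is clear
    # which earlier checkpoint is still safe to resume from.
    if not math.isfinite(loss_value):
        raise FloatingPointError(
            f"non-finite loss {loss_value} at epoch {epoch}")


# "checkpoint_200.pth" is a made-up filename standing in for the
# checkpoint saved at around epoch 200.
args = build_parser().parse_args(
    ["--no-amp", "--resume", "checkpoint_200.pth"])
print(args.amp, args.resume)
```

Calling check_loss on the scalar loss after each training step makes the run fail at the first bad value instead of continuing with NaN weights.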