MichailChatzianastasis opened this issue 2 years ago
A simple solution is to restart the job and load from the saved GHN checkpoint. I created a pull request https://github.com/facebookresearch/ppuda/pull/5, where I added the code to load the existing GHN checkpoint and resume training.
Let me know if this does not help. Otherwise, feel free to close the issue.
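In case it helps before the PR is merged, here is a minimal sketch of what resuming looks like. It assumes the checkpoint is a dict with 'state_dict', 'optimizer' and 'epoch' keys and lives at a hypothetical path; the actual layout and path used in the PR may differ.

```python
import torch

def resume_from_checkpoint(ghn, optimizer, ckpt_path='checkpoints/ghn.pt'):
    """Restore GHN and optimizer state from a saved checkpoint.

    Assumed checkpoint layout: {'state_dict': ..., 'optimizer': ..., 'epoch': ...}.
    Returns the epoch to continue training from.
    """
    checkpoint = torch.load(ckpt_path, map_location='cpu')
    ghn.load_state_dict(checkpoint['state_dict'])        # restore GHN weights
    optimizer.load_state_dict(checkpoint['optimizer'])   # restore optimizer state
    return checkpoint['epoch'] + 1                       # next epoch to run

# Usage inside the training script, after ghn and optimizer are built:
# start_epoch = resume_from_checkpoint(ghn, optimizer)
```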
Hey, while I was training the GHN and MLP models, at around epoch 220 I got the following error: RuntimeError: the loss is nan, unable to proceed. Do you have any solution for this?
Error message:

error <class 'RuntimeError'> the loss is nan, unable to proceed
error <class 'RuntimeError'> the loss is nan, unable to proceed
error <class 'RuntimeError'> the loss is nan, unable to proceed
error <class 'RuntimeError'> the loss is nan, unable to proceed
error <class 'RuntimeError'> the loss is nan, unable to proceed
error <class 'RuntimeError'> the loss is nan, unable to proceed
Out of patience (after 15 attempts to continue), please restart the job with another seed !!!
Traceback (most recent call last):
  File "/ppuda/experiments/train_ghn.py", line 168, in <module>
    main()
  File "/ppuda/experiments/train_ghn.py", line 105, in main
    loss = trainer.update(nets_torch, images, targets, ghn=ghn, graphs=graphs)
  File "/ppuda/../ppuda/ppuda/utils/trainer.py", line 101, in update
    raise RuntimeError('the loss is {}, unable to proceed'.format(loss))
RuntimeError: the loss is nan, unable to proceed
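For context, the behaviour visible in the log (tolerate a few NaN steps, then give up after 15 attempts) follows roughly the pattern sketched below. This is only an illustration of that pattern, not the actual code in ppuda/utils/trainer.py; the function name and counter variable are made up.

```python
import math

MAX_ATTEMPTS = 15  # matches "Out of patience (after 15 attempts to continue)" in the log

def guard_loss(loss_value, bad_steps):
    """Skip a limited number of consecutive NaN losses, then abort.

    loss_value: scalar loss for the current step.
    bad_steps:  count of consecutive steps with a non-finite loss so far.
    """
    if math.isnan(loss_value):
        bad_steps += 1
        if bad_steps > MAX_ATTEMPTS:
            raise RuntimeError('the loss is {}, unable to proceed'.format(loss_value))
    else:
        bad_steps = 0  # reset the counter once a valid loss is seen again
    return bad_steps
```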