Getting NaN loss after few epochs ~30

openai / deeptype

Code for the paper "DeepType: Multilingual Entity Linking by Neural Type System Evolution"

https://arxiv.org/abs/1802.01021


Getting NaN loss after few epochs ~30 #41

Closed heisenbugfix closed 6 years ago

heisenbugfix commented 6 years ago

Has anyone else run into this problem? After roughly 30 epochs, training stops with the message "loss is NaN". Any leads on how to resolve this?

loss is NaN.
Exception ignored in: <generator object prefetch_generator at 0x7f8d1b884780>
Traceback (most recent call last):
  File "/mnt/research-6f/aranjan/dtype/learning/generator.py", line 29, in prefetch_generator
    t.join()
  File "/usr/lib/python3.5/threading.py", line 1051, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

JonathanRaiman commented 6 years ago

I've run into this a couple of times. The usual fix was to lower the learning rate or to reorder the data: some examples produce a high KL loss that leads to divergence, so reseeding and restarting can also fix things.
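For anyone who wants to automate that advice, here is a minimal sketch of a restart loop. It is not code from deeptype: `train_with_restarts`, `train_once`, `lr_decay`, and `toy_train_once` are hypothetical names, and `train_once` stands in for whatever function runs a full training job and returns its final loss. The loop assumes that a diverged run returns a non-finite loss rather than raising.

```python
import math


def train_with_restarts(train_once, learning_rate, seed, max_restarts=3, lr_decay=0.5):
    """Call train_once(learning_rate, seed) until it returns a finite loss.

    train_once is a hypothetical callable standing in for a full training
    run; it is not part of deeptype.
    """
    for attempt in range(max_restarts + 1):
        loss = train_once(learning_rate, seed)
        if math.isfinite(loss):
            return loss, learning_rate, seed
        # The run diverged: lower the learning rate and change the seed so
        # the next attempt can shuffle the data in a different order.
        learning_rate *= lr_decay
        seed += 1
    raise RuntimeError("loss was still not finite after %d restarts" % max_restarts)


def toy_train_once(learning_rate, seed):
    # Toy stand-in: pretend training diverges whenever the learning rate
    # is above 0.003, and converges to a fixed loss otherwise.
    return float("nan") if learning_rate > 0.003 else 0.25


loss, lr, seed = train_with_restarts(toy_train_once, learning_rate=0.01, seed=0)
print(loss, lr, seed)
```

In a real setup the seed would have to be wired into the data shuffling for the reseed to change the example order; the toy function above ignores it.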