bingoohe opened this issue 5 years ago
The loss

loss = -tf.reduce_sum(tf.log(hit_prob))

should add a small constant to keep the log away from zero, for example:

loss = -tf.reduce_sum(tf.log(hit_prob + 1e-8))

The log function is one cause: when hit_prob contains a 0, tf.log returns -inf and the loss turns into nan. There is also a case where I get 0 when calculating the norm, which leads to a division by zero in the cosine similarity.
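For reference, here is a minimal sketch of both fixes in TensorFlow 1.x style (the same API as tf.log above). The tensor names, shapes, and the NEG constant are assumptions for illustration, not the actual code in dssm_rnn.py:

```python
import tensorflow as tf

eps = 1e-8  # small constant keeping log() and the norm division away from 0
NEG = 4     # assumed number of negative docs per query

# hypothetical stand-ins for the query/doc vectors produced by the encoders;
# rows are grouped so each query meets 1 positive doc followed by NEG negatives
query_vec = tf.placeholder(tf.float32, [None, 128])
doc_vec = tf.placeholder(tf.float32, [None, 128])

# clamp the norms so the cosine similarity never divides by zero
query_norm = tf.maximum(tf.norm(query_vec, axis=1), eps)
doc_norm = tf.maximum(tf.norm(doc_vec, axis=1), eps)
cos_sim = tf.reduce_sum(query_vec * doc_vec, axis=1) / (query_norm * doc_norm)

# softmax over each group of (1 + NEG) similarities; column 0 is the positive doc
prob = tf.nn.softmax(tf.reshape(cos_sim, [-1, NEG + 1]))
hit_prob = tf.slice(prob, [0, 0], [-1, 1])

# add eps inside the log so log(0) can never produce -inf / nan
loss = -tf.reduce_sum(tf.log(hit_prob + eps))
```

With both the eps inside the log and the clamped norms, the graph has no log(0) or zero-valued denominators left, which is usually enough to stop the nan from propagating back into the embedding weights.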
Hi, when I run dssm_rnn.py the training loss always shows nan, no matter how I change the learning rate. I printed the variables in the model, and the embedding variable in word_embeddings_layer is the first one to become nan. How should I deal with this? Thanks!