wenguanwang / DHF1K

Revisiting Video Saliency: A Large-scale Benchmark and a New Model (CVPR18, PAMI19)

The loss is nan. #10

Closed yufanLIU closed 5 years ago

yufanLIU commented 5 years ago

Hi, I'm really interested in your work. I used your training code ('ACL_full') to train on my own data, but during training the loss always becomes NaN after several iterations:

53/100 [==============>...............] - ETA: 59s - loss: nan - time_distributed_15_loss: nan - time_distributed_16_loss: nan

I have tuned the base learning rate from 1e-4 down to 1e-12, but the result is the same.

Do you know of any solutions?

And what does the 'imgs_path' ('staticimages') in config.py mean?

Thanks very much!

wenguanwang commented 5 years ago

You can add the following commands at the beginning of each loss function:

```python
y_true = K.clip(y_true, 0, 1)
y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
```

BTW, if you use your own training data, you should also check the values of all the training samples.
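A minimal sketch of that sanity check (the helper name and the [0, 1] range are illustrative; adapt it to however your frames and ground-truth maps are actually loaded):

```python
import numpy as np

def find_bad_samples(samples, lo=0.0, hi=1.0):
    """Flag training samples that would break the loss.

    Returns (index, reason) pairs for arrays containing NaN/Inf or
    values outside [lo, hi]. Hypothetical helper -- saliency maps are
    assumed to be normalized to [0, 1] before training.
    """
    bad = []
    for i, sample in enumerate(samples):
        arr = np.asarray(sample, dtype=np.float64)
        if not np.isfinite(arr).all():
            bad.append((i, "non-finite"))
        elif arr.min() < lo or arr.max() > hi:
            bad.append((i, "out-of-range"))
    return bad

issues = find_bad_samples([
    np.array([0.2, 0.8]),     # fine
    np.array([np.nan, 0.5]),  # NaN would propagate straight into the loss
    np.array([-1.0, 2.0]),    # outside [0, 1]
])
```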

I think these two commands should solve your problem, but I'm very busy and have no time to run the code myself. I'm almost 100% sure this is caused by negative values of variables in the loss functions. If the clipping doesn't work, refine the loss functions to ensure all variable values lie in the valid range.
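To make the mechanism concrete, here is a NumPy sketch of how the clipping guard keeps a KL-style saliency loss finite (the function body and normalization are illustrative, not the repo's exact loss; the real code uses the Keras backend):

```python
import numpy as np

EPSILON = 1e-7  # mirrors K.epsilon()

def kl_divergence(y_true, y_pred):
    """KL-divergence between ground-truth and predicted saliency maps,
    with the clipping guard applied first.

    Clipping keeps log() away from zero or negative inputs, which is
    exactly what produces NaN losses during training.
    """
    y_true = np.clip(y_true, 0.0, 1.0)
    y_pred = np.clip(y_pred, EPSILON, 1.0 - EPSILON)
    # normalize both maps into probability distributions
    y_true = y_true / (y_true.sum() + EPSILON)
    y_pred = y_pred / (y_pred.sum() + EPSILON)
    return float(np.sum(y_true * np.log(EPSILON + y_true / (y_pred + EPSILON))))

# Without clipping, a predicted value of exactly 0 would send log() to -inf.
gt = np.array([0.0, 1.0, 1.0])
bad_pred = np.array([0.0, 0.5, 1.0])
loss = kl_divergence(gt, bad_pred)
```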