How to execute eagerly to find nan-loss cause?

experiencor / keras-yolo3

Training and Detecting Objects with YOLO3

MIT License

1.61k stars 861 forks source link

Hi! I've been using this repo on my own dataset and I have encountered the problem with the loss suddenly hitting nan, even though it was converging nicely before (as in #198 ) After printing some things in the tensorflow graph I'm quite sure the error comes from weird values on box width and height, but I haven't managed to pinpoint it.

To check it I thought I'd try running the program eagerly with tf.compat.v1.enable_eager_execution() but it results in the error 'get_session' is not available when TensorFlow is executing eagerly.

Is it either possible to run it eagerly in some way or has anyone figured out the reason for the sudden nan-loss?

experiencor / keras-yolo3

How to execute eagerly to find nan-loss cause? #306