NaN loss on CamVid Binary Examples

qubvel / segmentation_models

Segmentation models with pretrained backbones. Keras and TensorFlow Keras.

MIT License

NaN loss on CamVid Binary Examples #446

Closed 1412kaito closed 2 years ago

1412kaito commented 3 years ago

Hello,

I am trying out the example provided here on two different local machines; for simplicity's sake, let's call them A and B.

On machine A, the code runs (I converted it to a script and removed the debugging output) and I get a reasonable loss and IoU. On machine B, the code also runs, but in the middle of the first epoch the loss and IoU turn NaN.
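One way to narrow this down is to find the exact batch at which the loss first stops being finite, so that the inputs of that batch can be inspected. The snippet below is only a minimal sketch of that idea in plain Python: `first_non_finite` and the `batch_losses` values are hypothetical and not taken from the example. In a Keras run the per-batch losses could be collected in a custom callback's `on_train_batch_end`, and `tf.keras.callbacks.TerminateOnNaN` can stop training as soon as a NaN loss appears.

```python
import math


def first_non_finite(values):
    """Return the index of the first NaN or infinite entry, or None if all are finite."""
    for index, value in enumerate(values):
        if not math.isfinite(value):
            return index
    return None


# Hypothetical per-batch losses; real values would come from the training loop.
batch_losses = [0.91, 0.84, 0.79, float("nan"), float("nan")]

bad_batch = first_non_finite(batch_losses)
if bad_batch is None:
    print("all batch losses are finite")
else:
    print("loss first became non-finite at batch index", bad_batch)
```

Knowing the batch index makes it possible to check whether that batch contains unusual images or masks, or whether the failure happens at a different batch on every run, which would point away from the data.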

Does anyone have any idea why this is happening, or how to debug it? Both machines run Windows 10 and Python 3.7, with tensorflow-gpu installed using conda.
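Since both machines are nominally set up the same way, it may help to compare the installed versions directly. The following is a rough sketch, not a definitive diagnostic: `environment_report` and `diff_reports` are hypothetical helpers, and the package names in the default list are my assumption about what is relevant here. It relies on `importlib.metadata`, which is in the standard library only from Python 3.8; on Python 3.7 it tries the `importlib_metadata` backport and otherwise reports `None` for every package.

```python
import platform
import sys

try:
    from importlib import metadata  # standard library from Python 3.8
except ImportError:
    try:
        import importlib_metadata as metadata  # backport, if installed
    except ImportError:
        metadata = None


def environment_report(packages=("tensorflow", "tensorflow-gpu", "keras",
                                 "segmentation-models", "numpy")):
    """Collect the Python version, platform string, and selected package versions."""
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        version = None
        if metadata is not None:
            try:
                version = metadata.version(name)
            except metadata.PackageNotFoundError:
                version = None  # package is not installed in this environment
        report[name] = version
    return report


def diff_reports(report_a, report_b):
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    keys = sorted(set(report_a) | set(report_b))
    return {
        key: (report_a.get(key), report_b.get(key))
        for key in keys
        if report_a.get(key) != report_b.get(key)
    }


if __name__ == "__main__":
    # Run on each machine, save the output, and compare the two results.
    print(environment_report())
```

Running this on machine A and machine B and passing the two dictionaries to `diff_reports` lists only the entries that differ. It does not cover the GPU driver, CUDA, or cuDNN versions, which would have to be compared separately.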

1412kaito commented 2 years ago

Closing the issue. I could not find the root cause, but running the samples on Linux gave consistent results, so I worked in a Linux environment instead.