Loss becomes nan after 1k training steps

MahmudulAlam / Automatic-Identification-and-Counting-of-Blood-Cells

Machine learning approach of automatic identification and counting of blood cells (RBC, WBC, and Platelet) with KNN and IOU based verification.

GNU General Public License v3.0

133 stars, 53 forks

Hi, thank you for your great work!

I ran the experiments step by step, following the wiki. I did not change any hyperparameters, except that I set gpu=0 to train on the CPU. However, the loss became nan shortly after 1K training steps and stayed that way:

...
step 1655 - loss        nan - moving ave loss        nan
step 1656 - loss        nan - moving ave loss        nan
Finish 92 epoch(es)
step 1657 - loss        nan - moving ave loss        nan
step 1658 - loss        nan - moving ave loss        nan
...
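
To pin down exactly where the loss stops being finite, a small script along these lines could scan the training output. This is only a minimal sketch, assuming every log line follows the `step N - loss X - moving ave loss Y` format shown above; the function name `first_nan_step` is hypothetical and not part of the repository.

```python
import math
import re

# Matches lines such as: "step 1655 - loss        nan - moving ave loss        nan"
# (format assumed from the log excerpt above)
LINE_RE = re.compile(
    r"step\s+(\d+)\s+-\s+loss\s+(\S+)\s+-\s+moving ave loss\s+(\S+)"
)


def first_nan_step(lines):
    """Return the first step whose loss is NaN or infinite, or None if none is."""
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines like "Finish 92 epoch(es)" or "..."
        step = int(match.group(1))
        loss = float(match.group(2))  # float("nan") and float("inf") both parse
        if not math.isfinite(loss):
            return step
    return None


# The lines quoted in this report
sample = [
    "step 1655 - loss        nan - moving ave loss        nan",
    "step 1656 - loss        nan - moving ave loss        nan",
    "Finish 92 epoch(es)",
    "step 1657 - loss        nan - moving ave loss        nan",
    "step 1658 - loss        nan - moving ave loss        nan",
]
print(first_nan_step(sample))
```

Run over the full log rather than this excerpt, it would report the first step at which the loss turned into nan, which makes it easier to check whether the loss was already growing in the steps just before.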

Do you have any idea what might cause this, or have you observed this behavior before? Thank you in advance for your help!

Loss becomes nan after 1k training steps #15