DrSleep / tensorflow-deeplab-resnet

DeepLab-ResNet rebuilt in TensorFlow
MIT License

Loss skyrockets until NaN even with almost-zero learning rate. #180

Closed ghost closed 6 years ago

ghost commented 6 years ago

I am using TensorFlow 0.12 on the augmented dataset. The loss always skyrockets to Inf and then NaN, even when I set the learning rate to almost zero (1e-15). I even tested with the "debug.txt" file and the images in the misc folder, and got the following output for the first 5 steps of train.py, using deeplab_resnet.ckpt:

step 0 loss = 1.664, (3.926 sec/step)
step 1 loss = 3494010119258112.000, (0.555 sec/step)
step 2 loss = 7812047707534524416.000, (0.256 sec/step)
step 3 loss = 22405511500261228544.000, (0.255 sec/step)
step 4 loss = 35339726484766982144.000, (0.256 sec/step)

Running fine_tune.py returns similar results, and deeplab_resnet_init.ckpt behaves exactly the same:

step 0 loss = 4.739, (3.675 sec/step)
step 1 loss = 14367365719148986368.000, (0.560 sec/step)
step 2 loss = 43311187985166237696.000, (0.256 sec/step)
step 3 loss = 72587321148004368384.000, (0.260 sec/step)
step 4 loss = 94695676436719599616.000, (0.262 sec/step)

My batch size is 1 to avoid OOM errors.

Any ideas what is causing this? I am sure there is nothing wrong with the dataset, since the same thing happens with the debug images.
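Whatever the root cause, a small guard in the training loop makes this kind of divergence fail fast instead of silently printing Inf/NaN losses for thousands of steps. This is a generic sketch, not part of the repo's train.py; the function name is made up for illustration:

```python
import math

def check_finite(step, loss):
    """Raise as soon as the loss stops being a finite number,
    so a diverging run aborts instead of training to NaN."""
    if not math.isfinite(loss):
        raise ValueError("loss diverged at step %d: %r" % (step, loss))
    return loss

# check_finite(0, 1.664) passes; check_finite(1, float("inf")) raises.
```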

ghost commented 6 years ago

Solved this by setting `os.environ["CUDA_VISIBLE_DEVICES"] = "0"`.
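For reference, a minimal sketch of that fix. CUDA_VISIBLE_DEVICES is read when TensorFlow first initialises CUDA, so the assignment has to run before `import tensorflow`; placing it at the very top of train.py is an assumption here, not the reporter's exact patch:

```python
import os

# Pin the process to GPU 0. This must be set before the first
# `import tensorflow as tf`, because the CUDA runtime reads the
# variable once at initialisation.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# import tensorflow as tf  # only import TensorFlow after this point
```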