mrharicot / monodepth

Unsupervised single image depth prediction with CNNs

Trying to Fine-Tune on the Kitti dataset, but the error values are not converging #172

Closed mridulk97 closed 6 years ago

mridulk97 commented 6 years ago

Hello, I was trying to fine-tune the model on the KITTI dataset in a supervised fashion: I train using only the left image, get the disparities from your model, convert them into depth, and compute the L2 loss with respect to the ground truth. Currently I'm using batch_size = 1 on a small dataset of 100 images. Training for 1000 epochs just to check whether the model is learning at all, the loss keeps decreasing, but the errors computed with the provided error metric start increasing after about 500 epochs.

Here is my command line:

python kitti/monodepth/exp_fine_tune_/monodepth_main.py --mode train --model_name check_epoch_1000_ --data_path /home/krishna/datasets/ --filenames_file /home/krishna/kitti/file_lists/kitti_train.txt --log_directory /home/krishna/kitti/temp --checkpoint_path /home/krishna/kitti/weights/model_city2kitti_resnet. --encoder resnet50 --retrain --full_summary --gt_filenames /home/krishna/kitti/file_lists/kitti_gt.txt

Code:

    model = MonodepthModel(params, 'test', left, right)

    with tf.variable_scope(tf.get_variable_scope()):
        # Finest-scale left disparity estimate, resized to the ground-truth resolution.
        pred_after_tf = model.disp_left_est[0]
        pred_after_tf_resize = tf.image.resize_bilinear(pred_after_tf, (height, width))

        # gt_depth is the ground-truth depth image; only pixels with valid depth count.
        mask = gt_depth > 0
        mask = tf.to_float(mask)

        # Scale the normalized disparity back to pixels, then invert it to depth.
        pred_disp = tf.scalar_mul(tf.to_float(width), tf.squeeze(pred_after_tf_resize[0]))
        pred_depth = tf.divide(width_to_focal[width], pred_disp)

        # Mask out pixels without ground truth and rescale.
        pred_depth_masked = tf.multiply(pred_depth, mask)
        pred_depth_masked = tf.divide(pred_depth_masked, 256.0)

        # L2 loss against the ground truth, with a square root on top.
        loss = tf.nn.l2_loss(tf.subtract(gt_depth, pred_depth_masked))
        loss = tf.sqrt(loss)
        grads = opt_step.compute_gradients(loss)

    apply_gradient_op = opt_step.apply_gradients(grads, global_step=global_step)
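
For reference, the repo's own disparity-to-depth conversion in evaluation_utils.py also multiplies by KITTI's 0.54 m stereo baseline, which the snippet above leaves out. A minimal NumPy sketch of that conversion, assuming the focal lengths hard-coded in evaluation_utils.py (disp_to_depth is a hypothetical helper name):

    import numpy as np

    # Focal length per KITTI image width, as in evaluation_utils.py.
    width_to_focal = {1242: 721.5377, 1241: 718.856,
                      1224: 707.0493, 1238: 718.3351}

    def disp_to_depth(pred_disp, width):
        # The network outputs disparity normalized by image width;
        # scale it back to pixels first.
        disp_pixels = width * np.asarray(pred_disp, dtype=np.float64)
        # depth = focal * baseline / disparity, with the 0.54 m KITTI baseline.
        return width_to_focal[width] * 0.54 / disp_pixels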

batch 2    | examples/s: 57.08 | loss: 3.06256 | time elapsed: 0.00h | time left: 0.14h
batch 4    | examples/s: 57.14 | loss: 2.53113 | time elapsed: 0.00h | time left: 0.13h
batch 6    | examples/s: 49.54 | loss: 2.20273 | time elapsed: 0.00h | time left: 0.13h

batch 500  | examples/s: 57.22 | loss: 0.64991 | time elapsed: 0.06h | time left: 0.07h
batch 502  | examples/s: 56.05 | loss: 0.64934 | time elapsed: 0.06h | time left: 0.07h
batch 504  | examples/s: 56.83 | loss: 0.64880 | time elapsed: 0.06h | time left: 0.07h

batch 996  | examples/s: 56.81 | loss: 0.56434 | time elapsed: 0.13h | time left: 0.01h
batch 998  | examples/s: 57.42 | loss: 0.56336 | time elapsed: 0.13h | time left: 0.01h
batch 1000 | examples/s: 57.33 | loss: 0.56274 | time elapsed: 0.13h | time left: 0.00h

On evaluating the model on the training images with the provided error metric:

                 abs_rel   sq_rel   rms     log_rms   d1_all   a1      a2      a3
after 500 ep:    0.0513    0.1279   2.057   0.086     6.512    0.979   0.993   0.998
after 1000 ep:   0.0541    0.1564   2.374   0.091     6.703    0.972   0.993   0.998
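
For context, these are the standard KITTI depth metrics. A sketch of how they are computed, following compute_errors in the repo's evaluation_utils.py (d1_all is a disparity outlier rate computed separately and omitted here; gt and pred are NumPy arrays of valid ground-truth and predicted depths):

    import numpy as np

    def compute_errors(gt, pred):
        # Threshold accuracy: fraction of pixels with max(gt/pred, pred/gt) < 1.25^k.
        thresh = np.maximum(gt / pred, pred / gt)
        a1 = (thresh < 1.25).mean()
        a2 = (thresh < 1.25 ** 2).mean()
        a3 = (thresh < 1.25 ** 3).mean()

        # RMS error in depth and in log depth.
        rmse = np.sqrt(((gt - pred) ** 2).mean())
        rmse_log = np.sqrt(((np.log(gt) - np.log(pred)) ** 2).mean())

        # Relative errors, normalized by the ground-truth depth.
        abs_rel = np.mean(np.abs(gt - pred) / gt)
        sq_rel = np.mean(((gt - pred) ** 2) / gt)

        return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3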

Do you have any idea why I am encountering this problem? Thank you, Mridul.

mrharicot commented 6 years ago

Hi, the reprojected lidar points are known to have issues around occluders, which could explain why your results get worse after a while. Also, why did you take a square root of the L2 loss?
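
As an aside on the square root: per the TensorFlow docs, tf.nn.l2_loss(t) computes sum(t ** 2) / 2, so tf.sqrt(tf.nn.l2_loss(r)) equals the RMS of the residuals scaled by sqrt(N / 2); the minimizer is unchanged and only the gradient magnitudes differ. A minimal TF 1.x check:

    import tensorflow as tf

    r = tf.constant([3.0, 4.0])       # residuals gt - pred
    sum_sq_half = tf.nn.l2_loss(r)    # (9 + 16) / 2 = 12.5
    rms_like = tf.sqrt(sum_sq_half)   # sqrt(12.5) ~= 3.5355

    with tf.Session() as sess:
        print(sess.run([sum_sq_half, rms_like]))  # [12.5, 3.5355339]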

mridulk97 commented 6 years ago

So does that mean I would not be able to perfectly overfit even a single image and get zero error? As for the square root on the L2 loss, I was just trying to see if it might help, but it didn't make any difference.

mrharicot commented 6 years ago

I am not quite sure why the training error gets worse after a while. The surprising part is that all scores get worse.

firasomran01 commented 6 years ago

Hello Mridulkhurana, I am planning to apply your idea of training the model in a supervised fashion on the KITTI dataset. Did you manage to solve your problem, and could you please share the changes required in the code to perform this task? I have already tried, but for some reason I am not getting there. Thanks in advance.