Hi, the reprojected lidar points are known to have issues around occluders, which could explain why your results get worse after a while. Also, why did you take a square root of the L2 loss?
So does that mean I would not be able to perfectly overfit even a single image and reach zero error? As for the square root on the L2 loss, I was just trying it to see if it might help, but it made no difference.
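A quick way to see why the square root cannot change anything: the square root is monotone, so minimizing the RMSE is equivalent to minimizing the MSE; only the gradient magnitude differs. A minimal TensorFlow 1.x sketch of the two variants (`depth_pred` and `depth_gt` are hypothetical placeholders, not identifiers from the repo):

```python
import tensorflow as tf  # TF 1.x, as used by the monodepth repo

# Hypothetical placeholders for predicted and ground-truth depth maps.
depth_pred = tf.placeholder(tf.float32, [None, None, None, 1])
depth_gt   = tf.placeholder(tf.float32, [None, None, None, 1])

# Lidar ground truth is sparse: only supervise pixels that have a value.
valid   = tf.cast(depth_gt > 0, tf.float32)
n_valid = tf.reduce_sum(valid) + 1e-7

# Plain L2 (mean squared error) over valid pixels.
l2_loss = tf.reduce_sum(valid * tf.square(depth_pred - depth_gt)) / n_valid

# "Square root of the L2 loss" = RMSE. It is monotone in l2_loss, so it
# has the same minimizer; only the gradient scale differs, hence no
# visible difference in training behavior.
rmse_loss = tf.sqrt(l2_loss)
```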
I am not quite sure why the training error gets worse after a while. The surprising part is that all scores get worse.
Hello Mridulkhurana, I am planning to apply your idea of training the model in a supervised fashion on the KITTI dataset. Did you manage to solve your problem, and could you please share the code changes required for this task? I have already tried, but for some reason I am not getting there. Thanks in advance.
Hello, I was trying to fine-tune the model on the KITTI dataset in a supervised fashion. I train it using only the left image, get the disparities from your model, convert them to depth, and compute the L2 loss against the ground truth. Currently I'm using batch_size = 1 on a small dataset of 100 images. While training for 1000 epochs just to check whether the model is learning at all, the loss keeps decreasing, but the errors computed with the provided evaluation metrics start increasing after some 500 epochs. Here is my command line:
python kitti/monodepth/exp_fine_tune_/monodepth_main.py --mode train --model_name check_epoch_1000_ --data_path /home/krishna/datasets/ --filenames_file /home/krishna/kitti/file_lists/kitti_train.txt --log_directory /home/krishna/kitti/temp --checkpoint_path /home/krishna/kitti/weights/model_city2kitti_resnet. --encoder resnet50 --retrain --full_summary --gt_filenames /home/krishna/kitti/file_lists/kitti_gt.txt
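For reference, a minimal sketch of the disparity-to-depth conversion described above, following the convention in monodepth's evaluation code: the network outputs disparity as a fraction of the image width, KITTI's stereo baseline is 0.54 m, and 721.5377 px is the focal length for 1242-pixel-wide KITTI images (other widths need their own calibration values). The function name is illustrative, not from the repo:

```python
import numpy as np

KITTI_BASELINE = 0.54   # stereo baseline in meters
FOCAL_1242 = 721.5377   # focal length (px) for 1242-px-wide KITTI images

def disp_to_depth(disp_frac, width=1242, focal=FOCAL_1242):
    """Convert a width-normalized disparity map to metric depth."""
    disp_px = disp_frac * width                      # disparity in pixels
    return (KITTI_BASELINE * focal) / np.maximum(disp_px, 1e-6)
```

For example, a predicted disparity of 0.05 (5% of the image width) maps to roughly 0.54 * 721.5377 / (0.05 * 1242) ≈ 6.3 m.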
Training log:
batch 2 | examples/s: 57.08 | loss: 3.06256 | time elapsed: 0.00h | time left: 0.14h
batch 4 | examples/s: 57.14 | loss: 2.53113 | time elapsed: 0.00h | time left: 0.13h
batch 6 | examples/s: 49.54 | loss: 2.20273 | time elapsed: 0.00h | time left: 0.13h
...
batch 500 | examples/s: 57.22 | loss: 0.64991 | time elapsed: 0.06h | time left: 0.07h
batch 502 | examples/s: 56.05 | loss: 0.64934 | time elapsed: 0.06h | time left: 0.07h
batch 504 | examples/s: 56.83 | loss: 0.64880 | time elapsed: 0.06h | time left: 0.07h
...
batch 996 | examples/s: 56.81 | loss: 0.56434 | time elapsed: 0.13h | time left: 0.01h
batch 998 | examples/s: 57.42 | loss: 0.56336 | time elapsed: 0.13h | time left: 0.01h
batch 1000 | examples/s: 57.33 | loss: 0.56274 | time elapsed: 0.13h | time left: 0.00h
Evaluating the model on the training images with the provided error metrics:

                    abs_rel   sq_rel   rms     log_rms   d1_all   a1      a2      a3
after 500 epochs    0.0513    0.1279   2.057   0.086     6.512    0.979   0.993   0.998
after 1000 epochs   0.0541    0.1564   2.374   0.091     6.703    0.972   0.993   0.998
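For context, the columns in this table are the standard depth metrics from Eigen et al., while d1_all is the KITTI stereo disparity error and is computed separately on the disparity maps. A minimal numpy sketch of the depth metrics, assuming `gt` and `pred` are 1-D arrays already restricted to valid pixels and capped at 80 m as in the monodepth evaluation:

```python
import numpy as np

def compute_errors(gt, pred):
    """Standard depth metrics; gt, pred: 1-D arrays of valid depths (m)."""
    # Accuracy under thresholds: fraction of pixels whose ratio to GT
    # is within 1.25, 1.25^2, 1.25^3.
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()

    abs_rel = np.mean(np.abs(gt - pred) / gt)                       # relative error
    sq_rel  = np.mean(((gt - pred) ** 2) / gt)                      # squared relative
    rms     = np.sqrt(np.mean((gt - pred) ** 2))                    # RMSE (m)
    log_rms = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))    # log-scale RMSE
    return abs_rel, sq_rel, rms, log_rms, a1, a2, a3
```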
Do you have any idea why I encounter this problem? Thank you. Mridul