ClementPinard / SfmLearner-Pytorch

Pytorch version of SfmLearner from Tinghui Zhou et al.
MIT License

Reproduce Depth Results? #129

Closed Rashfu closed 2 years ago

Rashfu commented 3 years ago

Hi @ClementPinard: I wonder how you got the depth results in README.md, because when I use the pretrained model I get relatively worse results. The command is as below:

```
python test_disp.py --pretrained-dispnet ./weights/pretrained/dispnet_model_best.pth.tar --dataset-dir /home/dell/dataset_raw/kitti_raw --dataset-list ./kitti_eval/test_files_eigen.txt
```

And I get these bad depth results:

```
abs_diff   abs_rel   sq_rel   rms       log_rms   abs_log   a1       a2       a3
7.4835     0.4603    4.8280   11.9583   0.5955    0.4745    0.2922   0.5444   0.7510
```
Rashfu commented 3 years ago

Now I have figured it out: some code needs to be fixed following issue #117.

Reproduced depth results using the pretrained model:

```
abs_diff   abs_rel   sq_rel   rms      log_rms   abs_log   a1       a2       a3
3.3295     0.1811    1.3400   6.2215   0.2613    0.1838    0.7328   0.9078   0.9640
```

Depth results using my own trained model, trained with this command:

```
python train.py --data /path/to/my_kitti_data/ -b8 -m0 -s2.0 --epoch-size 1000 --sequence-length 5 --log-output --with-gt
```

```
abs_diff   abs_rel   sq_rel   rms      log_rms   abs_log   a1       a2       a3
3.6801     0.1908    1.4855   7.0101   0.2777    0.1985    0.6943   0.8879   0.9576
```

How did you get such a good result? Did you train the network many times and test every saved model to select the best one? After all, I only trained the network once, so I don't know whether my results are acceptable. Could you please give me some advice?

ClementPinard commented 3 years ago

Hello. You are right, I need to work on that issue ASAP to update the code.

As for the best technique to get good results: it was mostly trial and error. The batch size I used was 4. You might also need to play with the smooth loss; I had to tweak it a lot, see the comment in the README: https://github.com/ClementPinard/SfmLearner-Pytorch#differences-with-official-implementation (a generic version of that loss is sketched below).
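For context, the smooth loss being tuned here (weighted by the `-s` flag, e.g. `-s2.0` in the command above) is a smoothness penalty on the predicted depth. Below is a minimal generic sketch of a first-order version; the repo's actual `smooth_loss` in loss_functions.py is multi-scale and may use higher-order terms, so treat this only as an illustration:

```python
import torch

def first_order_smooth_loss(depth):
    # Penalize depth gradients so neighboring pixels keep similar depth.
    # depth: predicted depth, shape [B, 1, H, W].
    dx = (depth[:, :, :, 1:] - depth[:, :, :, :-1]).abs()
    dy = (depth[:, :, 1:, :] - depth[:, :, :-1, :]).abs()
    return dx.mean() + dy.mean()

# Example weighting: total_loss = photometric_loss + 2.0 * first_order_smooth_loss(depth)
```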

What you can also try is a gradient diffusion loss, which is theoretically the most consistent with what a smooth depth map should look like. See it here: https://github.com/ClementPinard/unsupervised-depthnet/blob/master/loss_functions.py#L88 (the code from that repo is very similar to SfmLearner, so it shouldn't be too hard to adapt the loss for SfmLearner).
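As a rough, hypothetical sketch of the idea (the linked loss_functions.py is the authoritative implementation and may differ, for instance in multi-scale weighting), a second-order penalty does not punish planar surfaces such as roads, unlike the first-order term above:

```python
import torch

def gradient_diffusion_loss(depth):
    # Penalize second derivatives of depth: the minimizer of this term is
    # a harmonic (diffused) depth field, so sloped planes cost nothing.
    # depth: predicted depth, shape [B, 1, H, W].
    dx = depth[:, :, :, 1:] - depth[:, :, :, :-1]   # first derivative in x
    dy = depth[:, :, 1:, :] - depth[:, :, :-1, :]   # first derivative in y
    dxx = dx[:, :, :, 1:] - dx[:, :, :, :-1]        # second derivative in x
    dyy = dy[:, :, 1:, :] - dy[:, :, :-1, :]        # second derivative in y
    return dxx.abs().mean() + dyy.abs().mean()
```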

Last but not least, you can "cheat" by keeping different network checkpoints and then testing each of them on the test set. Practically, it means that you replace the validation set with the test set. That is what was done in the original repo, where it was observed that performance worsened after X epochs. A sketch of such a checkpoint sweep is given below.
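For illustration, such a sweep could look like the following sketch. The checkpoint glob pattern and dataset path are assumptions to adapt to your runs; the `test_disp.py` flags are the ones shown earlier in this thread:

```python
import glob
import subprocess

# Evaluate every saved dispnet checkpoint on the Eigen test split and
# inspect the printed metrics to pick the best one ("validating on test").
for ckpt in sorted(glob.glob("checkpoints/*/dispnet_*.pth.tar")):
    print("Evaluating", ckpt)
    subprocess.run([
        "python", "test_disp.py",
        "--pretrained-dispnet", ckpt,
        "--dataset-dir", "/path/to/kitti_raw",
        "--dataset-list", "./kitti_eval/test_files_eigen.txt",
    ])
```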

Rashfu commented 3 years ago

OK, I get it. I will test the gradient diffusion loss afterwards.

Another question that has puzzled me for a long time is about data preparation for training. It is often written in papers that models are trained and tested on the KITTI raw dataset using Eigen's split.

But from some papers and their related official repos, I found that there are several different splits of KITTI in use.

In your SfmLearner repo and in SC-SfmLearner, however, you choose 50 scenes and randomly select 43 for training and 7 for validation. So it seems that this KITTI dataset split doesn't match the methods above. What are the connections between all these methods? Thanks a lot!

ClementPinard commented 3 years ago

We still use the Eigen split for testing. The difference is that we use a validation split instead of evaluating every model checkpoint on the test set to know when the network is beginning to worsen. As such, we have a train / val / test split, with the test set always being the same.

You can see how we store it in this repo. The Eigen test files are stored in this list: https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/kitti_eval/test_files_eigen.txt

However, we need a way to dismiss all the scenes that contain images of the Eigen split (otherwise some training images could be too close to testing images), hence this file: https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/data/test_scenes.txt

This second file can be constructed very easily from the first one, since it's only the list of folders that appear in the Eigen split file paths, as in the sketch below.
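A minimal sketch of that construction, assuming the Eigen list entries start with paths of the form `2011_09_26/2011_09_26_drive_0002_sync/...` (check the actual file format before relying on this):

```python
# Collect the drive folders that appear in the Eigen test file paths,
# so those scenes can be excluded from training and validation.
scenes = set()
with open("kitti_eval/test_files_eigen.txt") as f:
    for line in f:
        parts = line.strip().split("/")
        if len(parts) >= 2:
            scenes.add(parts[1])  # e.g. 2011_09_26_drive_0002_sync

with open("data/test_scenes.txt", "w") as f:
    f.writelines(scene + "\n" for scene in sorted(scenes))
```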

Once the test set is isolated, we construct the validation set by random selection at an 80%/20% ratio, but it's up to you as long as you don't use test scenes for training or validation (see the sketch below). An important detail: the bigger your validation set is, the closer you are to a hypothetical real use case where you have a bunch of LiDAR-enabled images available, and your workflow would be to optimize the hyperparameters on the validation set while hoping for the best at test time, on real incoming data without any ground truth available.
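For illustration, a random 80/20 scene split could be done as follows. The dataset path is a placeholder, and the sketch assumes the raw KITTI layout of date folders containing drive folders:

```python
import random
from pathlib import Path

dataset_dir = Path("/path/to/kitti_raw")  # placeholder: adapt to your layout
test_scenes = set(Path("data/test_scenes.txt").read_text().split())

# Every drive folder (grouped by date) that is not an Eigen test scene.
candidates = sorted(
    drive.name
    for date in dataset_dir.iterdir() if date.is_dir()
    for drive in date.iterdir() if drive.is_dir() and drive.name not in test_scenes
)

random.seed(0)  # fixed seed so the split is reproducible
random.shuffle(candidates)
n_train = int(0.8 * len(candidates))
train_scenes, val_scenes = candidates[:n_train], candidates[n_train:]
```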

Hope the last part was clear. The thing is, most splits still use the test set as a validation set, which is not the right method, because then your hyperparameters will be biased toward the test set. If you really want to know how good your algorithm is on KITTI, you need to use the split provided by KITTI depth and see your score on the leaderboard: http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_prediction