nianticlabs / monodepth2

[ICCV 2019] Monocular depth estimation from a single image

finetune on custom dataset with provided model #474

Open adam99goat opened 1 year ago

adam99goat commented 1 year ago

Hi, thanks for the excellent work! I am having trouble with a custom dataset. I trained on my endoscopic monocular dataset and obtained a decent result. I am now wondering whether the result would improve if I loaded a better pretrained model than the ImageNet-1K one. To test this, I loaded the provided KITTI mono+stereo_640x192 pth files and finetuned the model on the endoscopic monocular dataset. However, the model degraded: its relative error is worse than that of the model finetuned from the ImageNet-1K pth files. I tried two weight-loading modes, encoder plus decoder and encoder only. Perhaps a KITTI-trained model is not suitable for endoscopic scenes, but I am also not sure whether I loaded the weights correctly. If I did not, could you give me some empirical advice? Looking forward to your reply~

daniyar-niantic commented 1 year ago

Hi @adam99goat, I agree that a KITTI-trained model is not necessarily suitable for endoscopic images. In most experiments, Monodepth2 uses ImageNet-pretrained weights to initialize the encoder. The main reason is that this lets the network converge faster and, as a result, reach better scores. So, if you can, try training a randomly initialized (non-pretrained) network and compare the results. If the pretrained runs differ clearly from the randomly initialized one, that would confirm that the weight loading is working.
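
A complementary check is to inspect which checkpoint entries actually match the model before loading them. The snippet below is a minimal sketch of that idea, not code from the repository: `filter_checkpoint` and `FakeTensor` are hypothetical names, the layer names are made up, and the extra non-weight `height` entry is only an assumption about what a checkpoint file may contain. It uses plain dictionaries so it runs without any deep learning library.

```python
from typing import Any, Dict, List, Mapping, Tuple


def filter_checkpoint(
    checkpoint: Mapping[str, Any],
    model_state: Mapping[str, Any],
) -> Tuple[Dict[str, Any], List[str], List[str]]:
    """Split a checkpoint into entries the model can accept and those it cannot.

    Returns (usable, skipped, missing):
      usable  - checkpoint entries whose name and shape match the model
      skipped - checkpoint entries with no matching name or a different shape
      missing - model entries that receive no value from the checkpoint
    """
    usable: Dict[str, Any] = {}
    skipped: List[str] = []
    for name, value in checkpoint.items():
        target = model_state.get(name)
        same_shape = getattr(value, "shape", None) == getattr(target, "shape", None)
        if name in model_state and same_shape:
            usable[name] = value
        else:
            skipped.append(name)
    missing = [name for name in model_state if name not in usable]
    return usable, skipped, missing


class FakeTensor:
    """Hypothetical stand-in for a tensor: only carries a shape."""

    def __init__(self, *shape: int) -> None:
        self.shape = tuple(shape)


# Made-up layer names and shapes, for illustration only.
model_state = {
    "conv1.weight": FakeTensor(64, 3, 7, 7),
    "head.weight": FakeTensor(10, 512),
}
checkpoint = {
    "conv1.weight": FakeTensor(64, 3, 7, 7),  # matches the model
    "head.weight": FakeTensor(1000, 512),     # same name, different shape
    "height": 192,                            # assumed non-weight metadata entry
}

usable, skipped, missing = filter_checkpoint(checkpoint, model_state)
print("loaded:", sorted(usable))
print("skipped:", skipped)
print("left at initial values:", missing)
```

With PyTorch, the same function could be given the dictionary returned by `torch.load(path, map_location="cpu")` and the model's `state_dict()`, followed by `model.load_state_dict(usable, strict=False)`. If `usable` is empty or `missing` covers most of the encoder, the network is effectively training from its initial weights, whichever checkpoint was specified.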