Open adam99goat opened 1 year ago
Hi @adam99goat, I agree that a KITTI-trained model is not necessarily suitable for endoscopic images. In most of its experiments, Monodepth2 initializes the encoder with ImageNet-pretrained weights, mainly because this lets the network converge faster and, as a result, reach better scores. As a sanity check, you could train a randomly initialized (non-pretrained) network and compare the results. If the runs initialized from pretrained weights behave clearly differently from that baseline, this would confirm that weight loading is working.
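A cheaper check than a full training run is to compare the parameter names and shapes in the checkpoint with those the model expects. The sketch below is only an illustration and not code from the Monodepth2 repository: `report_weight_overlap` is a hypothetical helper, and the toy parameter names and shapes are made up. It assumes the checkpoint and the model have each been reduced to a dict mapping parameter name to shape tuple, with `None` standing in for non-tensor entries such as metadata.

```python
def report_weight_overlap(checkpoint_shapes, model_shapes):
    """Compare parameter names and shapes between a checkpoint and a model.

    Both arguments map a parameter name to its shape tuple (or None for a
    non-tensor entry). Returns four sorted lists of names.
    """
    loadable, shape_mismatch, unexpected = [], [], []
    for name, shape in checkpoint_shapes.items():
        if name not in model_shapes:
            # Present in the checkpoint but unknown to the model.
            unexpected.append(name)
        elif model_shapes[name] == shape:
            # Same name and same shape: this entry can be copied over.
            loadable.append(name)
        else:
            # Same name but a different shape: loading would fail or be skipped.
            shape_mismatch.append(name)
    # Expected by the model but absent from the checkpoint: stays at its
    # initial (for example random) value.
    missing = [name for name in model_shapes if name not in checkpoint_shapes]
    return {
        "loadable": sorted(loadable),
        "shape_mismatch": sorted(shape_mismatch),
        "unexpected": sorted(unexpected),
        "missing": sorted(missing),
    }


# Toy example with made-up names and shapes (assumptions, not real layers).
checkpoint_shapes = {
    "conv1.weight": (64, 3, 7, 7),
    "fc.weight": (10, 512),
    "height": None,  # stand-in for a non-tensor metadata entry
}
model_shapes = {
    "conv1.weight": (64, 3, 7, 7),
    "fc.weight": (5, 512),
    "bn1.weight": (64,),
}

report = report_weight_overlap(checkpoint_shapes, model_shapes)
for category, names in report.items():
    print(f"{category}: {names}")
```

With PyTorch, the model side can be built as `{k: tuple(v.shape) for k, v in model.state_dict().items()}`, and the checkpoint side in the same way from the dict returned by `torch.load`. If `loadable` is empty or much shorter than expected, the run effectively started from the model's own initialization, which could look like a degraded pretrained model.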
Hi, thanks for the excellent work! I am having trouble with a custom dataset. I trained on my endoscopic monocular dataset and obtained a decent result. I am now wondering whether the result would improve if a better pretrained model than the ImageNet-1K one were loaded. I have loaded the provided KITTI mono+stereo_640x192 pth files to finetune the model on the endoscopic monocular dataset. However, the model degraded: its relative error is worse than that of the model finetuned from the ImageNet-1K pth files. I tried two weight-loading modes, encoder plus decoder and encoder only. Perhaps a KITTI-trained model is not suitable for endoscopic scenes, but I am not sure whether I loaded the weights incorrectly. If that is the case, could you give me some empirical advice? Looking forward to your reply.