TRI-ML / packnet-sfm

TRI-ML Monocular Depth Estimation Repository
https://tri-ml.github.io/packnet-sfm/
MIT License

Query regarding Self Supervised learning model #31

Closed Kartik17 closed 4 years ago

Kartik17 commented 4 years ago

Hi, thanks for uploading the code. I had a few queries:

  1. Since we specify the camera intrinsics, I believe the pre-trained model won't give good results on datasets other than the ones it was trained on. Is that right?
  2. Will the model learn depth in scenes with very few distinct objects, where the environment looks like a repeating texture (e.g. deserts, farms)? Are these the scenarios where the model can benefit from some form of supervision (semi-supervised)?
  3. For the custom-image model, since we specify the image resolution in the config YAML file, do we need to manually account for the new resolution in the dummy camera matrix, or is there a function somewhere that premultiplies the camera matrix with the scale matrix?

VitorGuizilini-TRI commented 4 years ago

1) Yes, usually we need some fine-tuning on new datasets, so the pre-trained features can align with the new camera configurations. Creating camera-agnostic depth networks is a very interesting topic, though!

2) Depth accuracy degrades with distance, so faraway objects like those found in the scenes you mentioned will suffer more. Repeating patterns are also an issue: the photometric loss becomes ambiguous because several different depths can explain the same image equally well, so semi-supervision will help more in these cases.

3) We assume that the input camera matrix corresponds to the original image resolution, and it's automatically scaled during resizing.
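For anyone wondering what the scaling in the third answer amounts to, here is a minimal sketch of premultiplying a camera matrix by a scale matrix. It is not the repository's own code: the function name `scale_intrinsics`, the example intrinsics, and the resolutions are all hypothetical, and it assumes the simple pixel convention with no half-pixel offset on the principal point.

```python
import numpy as np

def scale_intrinsics(K, orig_hw, new_hw):
    """Rescale a 3x3 intrinsics matrix K from orig_hw to new_hw, both (height, width).

    Hypothetical helper for illustration only, not part of packnet-sfm.
    """
    sy = new_hw[0] / orig_hw[0]  # vertical scale factor
    sx = new_hw[1] / orig_hw[1]  # horizontal scale factor
    # Premultiplying by diag(sx, sy, 1) scales fx and cx by sx, fy and cy by sy.
    S = np.diag([sx, sy, 1.0])
    return S @ np.asarray(K, dtype=float)

# Illustrative intrinsics for a 384x1280 image (made-up values).
K = np.array([[720.0,   0.0, 640.0],
              [  0.0, 720.0, 192.0],
              [  0.0,   0.0,   1.0]])

# Intrinsics that match the same image resized to 192x640.
K_small = scale_intrinsics(K, orig_hw=(384, 1280), new_hw=(192, 640))
print(K_small)
```

If the matrix passed in already matches the resized resolution, applying a scale like this again would shrink the focal length twice, which is why the original-resolution matrix is the one to supply.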

Kartik17 commented 4 years ago

Thanks for your quick reply.

tjdahlke commented 3 years ago

Great questions @Kartik17.