Wrong predictions?

While the model is robust, it doesn't produce correct predictions in every circumstance. Doing so remains an open research problem.

The sample that you provided here is very challenging: there are few depth cues and no context in the image, and the scene is uncommon or even non-existent in the datasets that the model was trained on. I wouldn't expect MiDaS (or any existing monocular depth estimation model) to work well on this image. MiDaS typically works well on images such as an average vacation photo, a photo of a room or street scene, or a photo showing people or animals. We also discuss some common failure cases in the supplementary material of our paper.

isl-org / MiDaS

Wrong predictions? #38