isl-org / MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
MIT License

Wrong predictions? #38

Closed aizuon closed 4 years ago

aizuon commented 4 years ago

I did a test run with a book on a flat surface. In the output, the background depth isn't uniform. Is there something I'm missing? [image attached]
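For reference, here is a minimal sketch of a typical single-image MiDaS run via torch.hub, following the pattern from the repository's example; the exact commands the reporter used are not shown in the issue, so the input file name below is hypothetical and the run is kept on CPU for simplicity. Note that the output is relative inverse depth (defined only up to scale and shift).

```python
import cv2
import torch

# Load the original MiDaS model and its matching preprocessing via torch.hub.
# (The hub transforms also expose small_transform / dpt_transform for the other model types.)
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS")
midas.eval()
midas_transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = midas_transforms.default_transform

# Hypothetical input: a photo of a book lying on a flat surface.
img = cv2.cvtColor(cv2.imread("book_on_table.jpg"), cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img))  # relative inverse depth at reduced resolution
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

# Normalize purely for visualization; the prediction is relative, not metric.
depth = prediction.cpu().numpy()
depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
cv2.imwrite("depth_vis.png", (depth * 255).astype("uint8"))
```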

ranftlr commented 4 years ago

While the model is robust, it does not produce correct predictions in all conceivable circumstances; this is an open research problem.

The sample that you provided here is very challenging: there are few depth cues and little context in the image, and this kind of scene is uncommon or even absent from the datasets the model was trained on. I wouldn't expect MiDaS (or any existing monocular depth estimation model) to work well on it. Typical images where MiDaS works well in most cases are average vacation photos, photos of rooms or street scenes, photos showing people or animals, etc. We also discuss some common failure cases in the supplementary material of our paper.