isl-org / MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
MIT License

How to transform raw labeled data to training data? #232

Open VillardX opened 11 months ago

VillardX commented 11 months ago

Hi, thanks for your great work! I came here from your follow-up work VPT, and I have read the issues about how to estimate the scale and shift needed to transform the model's predicted relative depth into metric depth in a zero-shot setting.

I just wonder how to transform raw labeled data into training data. More precisely, when training the DPT model from scratch, how is the raw labeled metric depth transformed into the relative depth that DPT is trained to predict?

For example, I have an image $img$ with $M$ pixels and a corresponding metric depth map. Pixel $pix_i$ has depth $depth_i$ (in meters), which is an absolute depth. How can I transform $depth_i$ into the relative depth that the DPT model actually predicts? And how can I obtain the corresponding scale and shift? For the training data, do the scale and shift change between images?

Would you kindly give me some hints? Thanks.
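For context, the MiDaS paper trains in disparity (inverse-depth) space and aligns each image individually with a per-image shift (median) and scale (mean absolute deviation), so yes, the scale and shift differ per image. A minimal sketch of that per-image normalization; the function name and `eps` guard are illustrative, and invalid pixels are assumed to be masked out beforehand:

```python
import numpy as np

def metric_depth_to_ssi_disparity(depth, eps=1e-6):
    """Convert a per-image metric depth map (meters) into the
    scale-and-shift-invariant disparity target described in the
    MiDaS paper. `depth` is an array of valid (positive) depths.
    """
    # MiDaS trains in disparity space, i.e. inverse depth.
    disp = 1.0 / np.maximum(depth, eps)
    # Per-image shift t(d) and scale s(d): median and mean absolute
    # deviation over the valid pixels (robust to outliers).
    t = np.median(disp)
    s = np.mean(np.abs(disp - t))
    # After alignment, the target has zero median and unit scale,
    # so absolute metric scale and shift are factored out.
    return (disp - t) / max(s, eps)
```

Because `t` and `s` are recomputed from each image's own disparity statistics, two images of the same scene at different metric scales map to the same normalized target.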