lpiccinelli-eth / UniDepth

Universal Monocular Metric Depth Estimation

How to effectively obtain the depth of the output with scale #39

Open mligg23 opened 4 months ago

mligg23 commented 4 months ago

I am trying to estimate an object's distance from the camera using the depth map predicted by the model and the object's pixel region in the RGB image. However, I could not find what unit the predicted depth is in, or whether it needs to be rescaled. I hope the author can clarify this. Thank you.

lpiccinelli-eth commented 4 months ago

The depth is metric, which means that the output values are in meters.

As with any metric depth estimator, the scale may not be perfect (as it is reciprocally related to the camera focal length). Moreover, out-of-domain data (for instance, images of landscapes) are not represented in the training set, so the model will fail to capture the depth correctly, i.e. it treats the image as a miniature scene. This is because the training data is mostly in the range 0-10 m for indoor scenes and 5-100 m for outdoor scenes.
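
For the use case in the question (distance of an object given its pixel region), a minimal sketch might look like the following. It is not part of UniDepth: `object_distance` is a hypothetical helper, and it assumes the predicted depth has already been converted to an `H x W` NumPy array in meters that stores depth along the optical axis, that `mask` is a boolean array of the same shape marking the object's pixels, and that `K` is a 3x3 pinhole intrinsics matrix (either known or estimated).

```python
import numpy as np


def object_distance(depth_m, mask, K=None):
    """Estimate how far an object is from the camera (hypothetical helper).

    depth_m: (H, W) depth map in meters, assumed to be depth along the optical axis.
    mask:    (H, W) boolean array, True on the object's pixels.
    K:       optional 3x3 pinhole intrinsics; if given, the Euclidean range
             to the object's centroid is returned instead of the axial depth.
    """
    depth_m = np.asarray(depth_m, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)

    # Ignore invalid predictions (NaN, inf, non-positive depth).
    valid = mask & np.isfinite(depth_m) & (depth_m > 0)
    if not valid.any():
        raise ValueError("no valid depth pixels inside the mask")

    # The median is less sensitive than the mean to background pixels
    # that leak into the mask at object boundaries.
    z = float(np.median(depth_m[valid]))
    if K is None:
        return z

    # Back-project the mask centroid with the pinhole model.
    K = np.asarray(K, dtype=np.float64)
    vs, us = np.nonzero(valid)
    u, v = us.mean(), vs.mean()
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (u - cx) / fx * z
    y = (v - cy) / fy * z
    return float(np.sqrt(x * x + y * y + z * z))
```

Without `K` the function returns the median depth along the optical axis; with `K` it returns the straight-line distance to the object, which is larger for objects away from the image center. Either value inherits whatever scale error the prediction has, so results on out-of-domain scenes should be treated with caution.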