Why use max_depth in metric depth prediction?

DepthAnything / Depth-Anything-V2

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

https://depth-anything-v2.github.io

Apache License 2.0


Why use max_depth in metric depth prediction? #119

Open 1punch3coins opened 3 months ago

1punch3coins commented 3 months ago

https://github.com/DepthAnything/Depth-Anything-V2/blob/31dc97708961675ce6b3a8d8ffa729170a4aa273/metric_depth/depth_anything_v2/dpt.py#L183 Wouldn't this line turn the model into a relative depth predictor, since max_depth is nothing more than a scale factor? Also, when fine-tuning on my own dataset, how do I choose the correct value? If I set max_depth per sample by computing each sample's maximum depth on the fly, what value do I use at inference time?

LiheYoung commented 3 months ago

You should pre-define a global max_depth rather than dynamically adjusting it for each sample. Our output in metric depth estimation is in meters, which has a physical meaning. If you adjust max_depth for each sample, the output of self.depth_head(features, patch_h, patch_w) becomes a relative depth within that specific sample.
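To make the point above concrete, here is a minimal sketch in plain Python, not the repository's code. It assumes the depth head emits values in [0, 1] (for example through a sigmoid) that are multiplied by max_depth to get meters. The helper names to_target and to_meters, the value 20.0, and the two toy scenes are all hypothetical.

```python
GLOBAL_MAX_DEPTH = 20.0  # meters; hypothetical value fixed once for the whole dataset


def to_target(depth_m, max_depth):
    """Map metric depths to the [0, 1] range an (assumed) sigmoid-style head regresses."""
    return [min(d, max_depth) / max_depth for d in depth_m]


def to_meters(head_out, max_depth):
    """Invert to_target: scale the head output back to meters."""
    return [h * max_depth for h in head_out]


# Two toy scenes with the same layout but different absolute scale (meters).
near_scene = [1.0, 2.0, 4.0]
far_scene = [5.0, 10.0, 20.0]

# Per-sample max_depth: both scenes collapse to the same targets, so the
# network can no longer tell them apart. This is relative depth.
per_sample_near = to_target(near_scene, max(near_scene))
per_sample_far = to_target(far_scene, max(far_scene))

# Global max_depth: the targets differ, so absolute scale is preserved and
# the same constant can be reused unchanged at inference time.
global_near = to_target(near_scene, GLOBAL_MAX_DEPTH)
global_far = to_target(far_scene, GLOBAL_MAX_DEPTH)
recovered_near = to_meters(global_near, GLOBAL_MAX_DEPTH)

print(per_sample_near == per_sample_far)  # identical targets
print(global_near != global_far)          # distinct targets
```

With a per-sample scale, the factor needed to get back to meters is only known when ground truth is available, which is exactly the inference-time problem raised in the question. A single global constant avoids it.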

shilpaullas97 commented 3 months ago

Hi @LiheYoung,

I’m looking to get more information about the max_depth parameter. Additionally, I have a question about interpreting depth prediction values from relative versus metric depth models. Specifically, how can I verify the depth values from a metric depth model if I don’t have ground truth data?

Could you also provide guidance on what steps to follow if I want to train a model on a custom dataset (outdoor data)?

1punch3coins commented 3 months ago

I found this guide useful: https://huggingface.co/blog/Isayoften/monocular-depth-estimation-guide. Maybe it can help you.