DepthAnything / Depth-Anything-V2

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
https://depth-anything-v2.github.io
Apache License 2.0

There is a large additive bias in the metric depth outputs #127

Open jbrownkramer opened 3 months ago

jbrownkramer commented 3 months ago

Hello,

I have been experimenting with the metric depth models that were fine-tuned on the Hypersim dataset.

I am running on data coming from an Azure Kinect. I have reprojected the true depth into the color camera's perspective and undistorted both the depth and RGB images to the Hypersim intrinsics (intWidth, intHeight, fltFocal = 1024, 768, 886.81). Comparing true depth to estimated metric depth, I notice that the closer the object, the larger its relative depth error.
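For reference, here is roughly how I resample onto the Hypersim pinhole model. This is a minimal sketch: `K_src` and `dist_src` stand in for the Azure Kinect color camera calibration, and I assume a centered principal point for the target intrinsics.

```python
import cv2
import numpy as np

# Hypersim pinhole target used for the metric fine-tuning.
intWidth, intHeight, fltFocal = 1024, 768, 886.81
K_dst = np.array([[fltFocal, 0.0, intWidth / 2.0],
                  [0.0, fltFocal, intHeight / 2.0],
                  [0.0, 0.0, 1.0]])

def to_hypersim_intrinsics(img, K_src, dist_src, interpolation):
    """Undistort img and resample it onto the Hypersim pinhole camera.

    K_src / dist_src are placeholders for the Kinect color calibration.
    Use cv2.INTER_NEAREST for depth so values are not blended across
    object boundaries, and cv2.INTER_LINEAR for RGB.
    """
    map1, map2 = cv2.initUndistortRectifyMap(
        K_src, dist_src, None, K_dst, (intWidth, intHeight), cv2.CV_32FC1)
    return cv2.remap(img, map1, map2, interpolation=interpolation)
```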

In the picture below, white indicates correct depth, red indicates that the estimate is too large, and blue indicates that it is too small.

[Image: depth error map (white = correct, red = estimate too large, blue = estimate too small)]

One way this can happen is if there is a constant additive bias in the estimated depth. Indeed, here is a histogram of the difference (true depth - estimated depth):

[Image: histogram of (true depth - estimated depth)]

The median difference was -866 mm. When I subtract 866 mm from the estimated depth, I get this:

[Image: depth error map after subtracting the 866 mm bias from the estimate]
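For completeness, here is a minimal sketch of how I measured and removed the offset. `true_depth` and `est_depth` are assumed to be aligned HxW arrays in millimeters, with invalid Kinect pixels reported as 0.

```python
import numpy as np
import matplotlib.pyplot as plt

valid = true_depth > 0  # Kinect reports 0 where there is no depth return
diff = true_depth[valid].astype(np.float64) - est_depth[valid]

# Histogram of (true depth - estimated depth), as shown above.
plt.hist(diff, bins=200)
plt.xlabel("true depth - estimated depth (mm)")
plt.ylabel("pixel count")
plt.show()

bias = np.median(diff)        # roughly -866 mm on my captures
corrected = est_depth + bias  # i.e. subtract 866 mm from the estimate
```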

Furthermore, this constant correction continues to hold as I point the camera at different scenes at different distances.
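Concretely, I check this by computing the per-frame median offset over several captures and confirming the values barely move; `frames` here is an assumed list of aligned (true, estimated) depth pairs.

```python
import numpy as np

offsets = []
for true_depth, est_depth in frames:  # assumed list of aligned depth pairs
    valid = true_depth > 0
    offsets.append(np.median(
        true_depth[valid].astype(np.float64) - est_depth[valid]))
# The per-frame medians stay close to -866 mm across scenes and distances.
print("per-frame median offsets (mm):", np.round(offsets, 1))
```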

I cannot find the source of this bias in the training or inference code. I thought I would let you know in case you have any insight.