LiheYoung / Depth-Anything

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
https://depth-anything.github.io
Apache License 2.0

Metric depth estimation for "cropped" KITTI #85

Open hgolestaniii opened 8 months ago

hgolestaniii commented 8 months ago

Hi @LiheYoung,

Thanks again for your great work. I have an issue with using the pre-trained outdoor metric-depth model at a resolution different from the original KITTI resolution, i.e., 1216x352: the produced depth appears scaled/remapped relative to the ground truth. I explain in more detail below.

1- I verified that I can run the pipeline on the KITTI evaluation dataset (available here), comprising 1000 images plus ground-truth depth at 1216x352 resolution. Through evaluate.py, it generates reasonable depth images with pretty good RMSE (~2 m).

2- I cropped the same dataset to obtain smaller 512x288 images (crop only, no resize); see the sketch after this list. I then defined a new dataset, called kitti_512x288, in config.py, and modified data_mono.py to skip do_kb_crop. The output depth still looks reasonable; however, its values differ considerably from the 16-bit GT depth. Cropping seems to cause this issue.
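For concreteness, here is a minimal sketch of the two crops involved. The kb_crop logic follows the standard KITTI benchmark crop used in ZoeDepth-style data loaders (bottom-aligned 352 rows, horizontally centered 1216 columns); the 512x288 crop and its offsets are my assumptions about the experiment, since the exact crop window shouldn't matter for reproducing the effect:

```python
import numpy as np

def kb_crop(image: np.ndarray) -> np.ndarray:
    """Standard KITTI benchmark crop used by ZoeDepth-style loaders:
    keep the bottom 352 rows and the horizontally centered 1216 columns."""
    height, width = image.shape[:2]
    top = height - 352
    left = (width - 1216) // 2
    return image[top:top + 352, left:left + 1216]

def crop_512x288(image: np.ndarray, top: int = 0, left: int = 0) -> np.ndarray:
    """Plain 512x288 crop, no resize (the offsets are hypothetical)."""
    return image[top:top + 288, left:left + 512]
```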

I have attached my input RGB, its corresponding GT depth, and my estimated depth. Could you please take a look and tell me why I get wrong estimated depth values for this crop test?

[Attached images: input RGB `2011_10_03_drive_0047_sync_image_0000000005_image_03`, GT depth `2011_10_03_drive_0047_sync_groundtruth_depth_0000000005_image_03`, prediction `0_pred`]

hgolestaniii commented 7 months ago

Hi @LiheYoung, I ran more experiments; here is my conclusion: if you crop the KITTI dataset and feed it into metric Depth Anything (outdoor model), you get depth values that are wrong compared to the GT. Why is that the case? How can it be fixed? And is this true for all depth estimation algorithms based on ZoeDepth?
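A quick way to quantify the effect is to run the model on the full frame and on the crop, then compare the two predictions over the shared pixels. The sketch below is only a diagnostic idea: `predict_depth` is a hypothetical wrapper around whatever inference path you use (e.g. the model loaded by evaluate.py), not an API from this repo.

```python
import numpy as np

def crop_depth_ratio(image: np.ndarray, predict_depth,
                     top: int, left: int, h: int = 288, w: int = 512) -> float:
    """Median ratio between the depth predicted on a crop and the depth
    predicted on the full frame at the same pixels. A crop-invariant
    model would give ~1.0; the mismatch described above shows up as a
    ratio clearly different from 1."""
    depth_full = predict_depth(image)                              # H x W, meters
    depth_crop = predict_depth(image[top:top + h, left:left + w])  # h x w, meters
    shared = depth_full[top:top + h, left:left + w]
    valid = (shared > 0) & (depth_crop > 0)
    return float(np.median(depth_crop[valid] / shared[valid]))
```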

EricLee0224 commented 1 month ago

@LiheYoung same question.