dnflslwlq opened 1 month ago
Hi, the results reported in the paper are from models trained on the NYU or KITTI training sets, but our released metric depth models are trained on the Hypersim or Virtual KITTI training sets.
If you want NYU- or KITTI-trained metric depth models, you can refer to the models released in Depth Anything V1: https://github.com/LiheYoung/Depth-Anything/tree/main/metric_depth
@LiheYoung Thank you for your response.
Hi @LiheYoung, I also have a question about the Virtual KITTI training setup. In the training code, why is the validation set for Virtual KITTI 2 the real KITTI dataset instead of the Virtual KITTI 2 test set?
Hi everyone,
I attempted to measure the metric depth performance on the KITTI dataset (δ1, δ2, δ3, AbsRel, RMSE, log10) using the pretrained models provided in this project, but my results are far from the values reported in the paper.
My results:
After trying vitl, vitb, and vits, the δ1 value was 0.8xx, which falls far short of the values in the paper. Similarly, the RMSE was in the 4.xxx range compared to 3.xxx.
I used the following methods:
Method 1:
Test dataset: a sequence selected from the KITTI dataset.
1. Obtained the depth_pred values and the ground-truth image converted to meters.
2. Clipped the GT values to the range [0, 80].
3. Set valid_mask = gt > 0 (the devkit marks pixel value 0 as invalid).
4. Applied the valid mask to both depth_pred and gt.
5. Converted both to tensors.
6. Used eval_depth from metric_depth/util/metric.py to calculate d1, d2, d3, abs_rel, rmse, etc.
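For reference, here is a minimal NumPy sketch of the Method 1 steps above (the repo's eval_depth in metric_depth/util/metric.py works on torch tensors; eval_depth_kitti is a hypothetical helper name, and the /256 conversion follows the KITTI devkit's uint16 depth encoding):

```python
import numpy as np

def eval_depth_kitti(pred, gt_raw, min_depth=1e-3, max_depth=80.0):
    """Evaluate a predicted depth map against a raw KITTI GT depth image.

    Per the KITTI devkit, raw uint16 depth values are divided by 256 to
    obtain meters, and pixel value 0 marks invalid GT.
    """
    gt = gt_raw.astype(np.float32) / 256.0   # raw uint16 -> meters
    gt = np.clip(gt, 0.0, max_depth)         # cap GT at 80 m
    valid = gt > min_depth                   # drop invalid (zero) pixels
    p, g = pred[valid], gt[valid]
    thresh = np.maximum(p / g, g / p)
    return {
        'd1': float((thresh < 1.25).mean()),
        'd2': float((thresh < 1.25 ** 2).mean()),
        'd3': float((thresh < 1.25 ** 3).mean()),
        'abs_rel': float((np.abs(p - g) / g).mean()),
        'rmse': float(np.sqrt(((p - g) ** 2).mean())),
    }
```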
Method 2:
Test dataset: metric_depth/dataset/splits/kitti/val.txt, loaded using metric_depth/dataset/kitti.py. The rest of the process is identical to Method 1.
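For anyone reproducing this, loading the split file can be sketched as below. The two-paths-per-line format is an assumption on my part; check metric_depth/dataset/kitti.py for the parsing the repo actually uses.

```python
def load_split(split_path):
    """Read a split file into (rgb_path, depth_path) pairs.

    Assumes each line holds two whitespace-separated paths (hypothetical
    format; verify against metric_depth/dataset/kitti.py).
    """
    pairs = []
    with open(split_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                pairs.append((parts[0], parts[1]))
    return pairs
```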
Method 3:
Test dataset: the manually selected validation and test sets (2 GB) downloaded from the KITTI website. The rest of the process is identical to Method 1. I also tried applying garg_crop at each step; performance improved slightly but still fell far short of the values in the paper.
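For context, the garg_crop mentioned above is the standard KITTI evaluation crop from Garg et al. A minimal sketch of the crop mask, using the ratios common across monodepth codebases (verify against this repo's own evaluation code before relying on it):

```python
import numpy as np

def garg_crop_mask(height, width):
    """Boolean mask for the standard Garg evaluation crop on KITTI.

    Restricts evaluation to the central region of the image; the ratio
    constants follow the widely used monodepth convention.
    """
    mask = np.zeros((height, width), dtype=bool)
    top, bottom = int(0.40810811 * height), int(0.99189189 * height)
    left, right = int(0.03594771 * width), int(0.96405229 * width)
    mask[top:bottom, left:right] = True
    return mask
```

In evaluation, this mask would typically be AND-ed with the GT validity mask before computing the metrics.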
Could you please let me know how you measured the metrics?