Estimating model on KITTI (or other depth dataset)

isl-org / MiDaS

Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"

MIT License

Estimating model on KITTI (or other depth dataset) #101

Open RuslanOm opened 3 years ago

RuslanOm commented 3 years ago

Hi! Thanks for the great research and cool models. Could you please explain how to calculate RMSE on the KITTI dataset? For example, say we have the model's output. It's an inverse relative depth and, of course, I can take 1 / output, but I still have no idea how to compare these values with the KITTI ground-truth depth (or with other datasets that provide absolute depth). Thanks!

JJrodny commented 2 years ago

I've been trying to understand how MiDaS estimates depth, and I'm also curious how to do this. I read #4 #5 #36 #37 #42 #66 #124 #125 and, as far as I understand it, even if you do have the camera intrinsics (fx, fy, cx, cy, image width, image height), multiple posts have mentioned that it's not possible to get absolute depth values in meters unless you have at least two pixels with ground-truth depth values.

How do you calculate a model's performance on KITTI, or any other depth dataset, if that dataset's ground-truth values are in real depth (e.g. meters) and this model only outputs relative depth?

Also, why do you report δ>1.25 error values when all other state-of-the-art papers report δ<1.25?
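
On the δ question: the two conventions are complements of each other. With δ = max(pred/gt, gt/pred) per pixel, the share of pixels with δ above 1.25 is an error (lower is better), and the share below 1.25 is an accuracy (higher is better). A minimal sketch, assuming positive depths that are already aligned to the ground truth; the function names and sample values are made up for illustration:

```python
import numpy as np

def delta_ratio(pred, gt):
    """Per-pixel ratio max(pred/gt, gt/pred); both inputs must be positive depths."""
    return np.maximum(pred / gt, gt / pred)

def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of pixels with ratio below the threshold (higher is better)."""
    return float(np.mean(delta_ratio(pred, gt) < threshold))

def delta_error(pred, gt, threshold=1.25):
    """Fraction of pixels with ratio at or above the threshold (lower is better).

    Using >= here (an assumption of this sketch) makes the two numbers sum to 1.
    """
    return float(np.mean(delta_ratio(pred, gt) >= threshold))

# Illustrative values in meters, not real KITTI data.
gt = np.array([10.0, 20.0, 40.0, 80.0])
pred = np.array([11.0, 30.0, 39.0, 50.0])

acc = delta_accuracy(pred, gt)
err = delta_error(pred, gt)
print(acc, err)
```

So a reported δ>1.25 value can be compared with a δ<1.25 value from another paper by subtracting it from 100%, provided both were computed under the same alignment protocol.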

JJrodny commented 2 years ago

Rubber-ducking myself: this link looks like it explains how to calculate the error: https://github.com/isl-org/DPT/blob/main/EVALUATION.md

And then I just saw this in the supplementary material of your paper:

Alignment. We align the scale and shift of all predictions (our models as well as baselines) to the ground truth before conducting evaluations. We perform the alignment in inverse-depth space based on the least-squares criterion.

So, as I understand it, you measure MiDaS's relative accuracy instead of absolute accuracy by converting the GT values to the 0-1 range and converting MiDaS's output to the 0-1 range as well (subtract the minimum, then divide the maximum by the value to convert from inverse depth).
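
Note that the passage quoted from the supplementary material describes a least-squares fit of a scale and a shift in inverse-depth space, which is not quite the same thing as min-max normalizing both maps to 0-1. A minimal sketch of that kind of alignment followed by RMSE in meters, assuming a single image, ground-truth depth where 0 marks pixels without a measurement, and NumPy only; the function names are hypothetical and this is not the authors' evaluation code:

```python
import numpy as np

def align_scale_shift(pred_inv, gt_inv, mask):
    """Least-squares fit of s, t so that s * pred_inv + t ~ gt_inv on masked pixels."""
    x = pred_inv[mask].astype(np.float64)
    y = gt_inv[mask].astype(np.float64)
    A = np.stack([x, np.ones_like(x)], axis=1)  # columns: prediction, constant 1
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s, t

def rmse_after_alignment(pred_inv, gt_depth):
    """Align the prediction in inverse-depth space, then compute RMSE in depth units."""
    mask = gt_depth > 0                      # only pixels with a ground-truth value
    gt_inv = np.zeros_like(gt_depth, dtype=np.float64)
    gt_inv[mask] = 1.0 / gt_depth[mask]      # ground-truth inverse depth
    s, t = align_scale_shift(pred_inv, gt_inv, mask)
    aligned_inv = s * pred_inv + t
    # No clamping here: this sketch assumes aligned_inv is positive on masked pixels.
    pred_depth = 1.0 / aligned_inv[mask]
    return float(np.sqrt(np.mean((pred_depth - gt_depth[mask]) ** 2)))

# Toy example: the prediction is the true inverse depth in arbitrary units.
gt_depth = np.array([[5.0, 10.0], [20.0, 0.0]])      # 0 = no measurement
pred_inv = np.array([[95.0, 45.0], [20.0, 3.0]])     # (1/gt - 0.01) / 0.002 on valid pixels
rmse = rmse_after_alignment(pred_inv, gt_depth)
print(rmse)
```

In the toy example the prediction differs from the true inverse depth only by a scale and a shift, so the fit recovers both and the RMSE is numerically zero. The fit needs at least two valid pixels with different values, which matches the "2+ pixels with ground truth" remark earlier in the thread.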

crnl123 commented 4 months ago

How do you deal with the infinity that arises from division by zero, say for the sky? KITTI, as far as I can tell, does not record those distances as infinite, while most general models output 0 or some other low value for the sky.
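
One common way to sidestep this, sketched here under stated assumptions: KITTI ground truth comes from sparse LiDAR, so pixels without a return (sky included) are stored as 0 and can simply be masked out of both the fit and the metric. For the prediction, the aligned inverse depth can be clamped from below before inverting, which caps the resulting depth. The 80 m cap and the function name are assumptions of this sketch, not something stated in this thread:

```python
import numpy as np

MAX_DEPTH_M = 80.0  # assumed depth cap; pick whatever your evaluation protocol specifies

def aligned_depth_with_cap(pred_inv, gt_depth, max_depth=MAX_DEPTH_M):
    """Fit scale/shift on valid pixels, clamp inverse depth, return depth and valid mask."""
    valid = gt_depth > 0                               # 0 means "no LiDAR measurement"
    x = pred_inv[valid].astype(np.float64)
    y = 1.0 / gt_depth[valid].astype(np.float64)
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)     # least-squares scale and shift
    aligned_inv = s * pred_inv.astype(np.float64) + t
    # Clamp so that values near zero (or negative) cannot produce inf or negative depth.
    aligned_inv = np.maximum(aligned_inv, 1.0 / max_depth)
    return 1.0 / aligned_inv, valid

# Toy example: the bottom-right pixel is "sky" (prediction 0, no ground truth).
gt_depth = np.array([[5.0, 10.0], [20.0, 0.0]])
pred_inv = np.array([[95.0, 45.0], [20.0, 0.0]])
depth, valid = aligned_depth_with_cap(pred_inv, gt_depth)
print(depth, valid)
```

Metrics are then computed on `depth[valid]` against `gt_depth[valid]` only, so the sky pixel never enters the error, and the clamp keeps the full depth map finite if it is needed for visualization.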