noahzn / Lite-Mono

[CVPR2023] Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
MIT License

About the estimated depth distance #86

Closed lmj250 closed 11 months ago

lmj250 commented 12 months ago

Hello, author! I want to know how to get the actual predicted depth value when testing a single image. Is the 'pred_depth' in 'evaluate_depth.py' the predicted depth value? If so, can I apply the pred_depth calculation from 'evaluate_depth.py' in 'test_simple.py' to get the predicted depth value for a single test image?

noahzn commented 12 months ago

Hi, you cannot get absolute depth values using this method. Please see our discussion here.

lmj250 commented 12 months ago

Thank you for replying. During training, the scaling factor is obtained by combining the ground truth and the predicted depth value. In the subsequent test or evaluation, the predicted depth value is converted into an absolute depth value using that scaling factor. However, no such scaling factor can be obtained when testing an image from a non-KITTI dataset. Have I understood you correctly?

noahzn commented 12 months ago

Hi, yes, you are correct. The evaluation is based on median scaling. If you don't have the ground-truth, you cannot evaluate the accuracy.
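As a rough illustration of median scaling, here is a minimal NumPy sketch. It is not the repository's exact code: the function name `median_scale` and the `min_depth`/`max_depth` defaults are assumptions made for this example.

```python
import numpy as np


def median_scale(pred_depth, gt_depth, min_depth=1e-3, max_depth=80.0):
    """Rescale a relative depth map so its median matches the ground truth.

    Hypothetical helper for illustration; the depth limits are assumed values.
    """
    pred_depth = np.asarray(pred_depth, dtype=np.float64)
    gt_depth = np.asarray(gt_depth, dtype=np.float64)

    # Only pixels that carry a valid ground-truth measurement are compared.
    valid = (gt_depth > min_depth) & (gt_depth < max_depth)

    # One scalar ratio per image aligns the two medians.
    ratio = np.median(gt_depth[valid]) / np.median(pred_depth[valid])
    return pred_depth * ratio, ratio


# Toy example: the ground-truth value 0.0 marks a pixel without measurement.
gt = np.array([[2.0, 4.0], [6.0, 0.0]])
pred = np.array([[1.0, 2.0], [3.0, 5.0]])
scaled, ratio = median_scale(pred, gt)
```

The ratio is computed separately for every image and depends on that image's ground truth, which is why it is not available for images without ground truth.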

lmj250 commented 11 months ago

I'm still a little confused. If the ground-truth is required to predict the absolute depth distance, how can the depth distance be predicted in real time and applied to autonomous driving? I've seen elsewhere that this scaling factor is determined by the camera parameters, but I'm not sure if that's correct.

noahzn commented 11 months ago

“If the ground-truth is required to predict the absolute depth distance”

No, the ground-truth is only needed to evaluate the performance of the model. It's not used to predict the absolute depth distance, because we cannot get a perfect scale factor by using median scaling.

Self-supervised methods predict relative depth, not absolute depth. Therefore, you cannot directly use them for absolute depth estimation in autonomous driving. If you want rough absolute depth but don't have any ground-truth, you can try to get a baseline from stereo methods and use it to derive a factor for scaling the predicted depth. Please select the appropriate method according to your application and tolerance for error.
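One possible reading of the stereo suggestion, sketched with NumPy: convert a stereo disparity map to metric depth with the standard relation depth = focal length × baseline / disparity, then take the ratio of medians as the scale factor. The helper names, the focal length, the camera baseline, and the toy arrays below are all assumptions for illustration, not part of Lite-Mono.

```python
import numpy as np


def stereo_depth(disparity_px, focal_px, baseline_m):
    """Metric depth from a rectified stereo pair: depth = f * B / d."""
    disparity_px = np.asarray(disparity_px, dtype=np.float64)
    depth = np.zeros_like(disparity_px)
    valid = disparity_px > 0  # zero disparity means no stereo match
    depth[valid] = focal_px * baseline_m / disparity_px[valid]
    return depth


def scale_from_reference(pred_depth, ref_depth):
    """Scalar factor aligning the relative prediction to a metric reference."""
    pred_depth = np.asarray(pred_depth, dtype=np.float64)
    valid = ref_depth > 0  # ignore pixels without a reference depth
    return np.median(ref_depth[valid]) / np.median(pred_depth[valid])


# Toy values (assumed): 700 px focal length, 0.5 m stereo baseline.
disparity = np.array([[35.0, 70.0], [0.0, 17.5]])
ref = stereo_depth(disparity, focal_px=700.0, baseline_m=0.5)

pred = np.array([[1.0, 0.5], [3.0, 2.0]])  # relative depth from the network
factor = scale_from_reference(pred, ref)
metric_pred = pred * factor
```

The result is only as accurate as the stereo reference, so treat the scaled depth as a rough estimate.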

noahzn commented 11 months ago

I am closing this issue due to no response.