Closed kyoyachuan closed 1 year ago
Hi, I think something is wrong there. In scale-aware mode, the 'median' should be around 1. You should use nusc_scale.txt for evaluation, and I wonder if this is what makes the evaluation wrong.
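For context, the 'median' printed in the log is the scale ratio used in monodepth2-style evaluation: in scale-ambiguous mode the prediction is multiplied by median(gt) / median(pred), while scale-aware mode uses the raw metric prediction. A minimal sketch with made-up depth arrays (the values below are purely illustrative, not from this repo's evaluation code):

```python
import numpy as np

# Toy depth arrays (hypothetical values, just to illustrate the ratio).
gt = np.array([5.0, 10.0, 20.0, 40.0])          # ground-truth depths (m)
pred_metric = np.array([5.1, 9.8, 20.5, 39.0])  # roughly metric prediction
pred_scaled = pred_metric / 1.10                # same prediction, off by a global scale

# Median scaling as applied in scale-ambiguous evaluation; a scale-aware
# model should already give a ratio close to 1.
ratio_metric = np.median(gt) / np.median(pred_metric)
ratio_scaled = np.median(gt) / np.median(pred_scaled)

print(ratio_metric)  # close to 1
print(ratio_scaled)  # ~1.09: hints at a global scale mismatch
```

A median ratio around 1.10, as in the log below, is exactly the signature of predictions that are correct up to a single global scale factor.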
I figured out the main issue. The root cause is min_depth, which I had set to 0.5 instead of the original 0.1. I thought min_depth and max_depth were only used for GT filtering and clamping the predictions, but they are also used to recover the scale of the disparity, since the model's output is a sigmoid.
After I changed min_depth back to 0.1, the results were correct.
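To make this concrete: in the common monodepth2-style disp_to_depth, min_depth and max_depth define the disparity range that the sigmoid output is mapped into, so changing min_depth rescales every predicted depth rather than just clamping. A sketch of that mapping (max_depth=100 is an assumed value; this repo's exact implementation may differ):

```python
def disp_to_depth(disp, min_depth, max_depth):
    """Map a sigmoid output in [0, 1] to depth via a bounded disparity range.

    Because max_disp = 1 / min_depth, the min_depth setting directly sets
    the scale of the recovered depth.
    """
    min_disp = 1.0 / max_depth
    max_disp = 1.0 / min_depth
    scaled_disp = min_disp + (max_disp - min_disp) * disp
    depth = 1.0 / scaled_disp
    return scaled_disp, depth

# The same raw network output under the two min_depth settings:
_, depth_01 = disp_to_depth(0.5, min_depth=0.1, max_depth=100.0)
_, depth_05 = disp_to_depth(0.5, min_depth=0.5, max_depth=100.0)
print(depth_01, depth_05)  # ~0.20 vs ~1.00: roughly a 5x scale difference
```

This is why an evaluation-time min_depth that differs from the training value shifts the whole depth map by a global factor, which then shows up as a median ratio far from 1 in the scale-aware evaluation.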
Loading depth weights...
Loading encoder weights...
Training model named: nusc_scale
There are 20096 training items and 6019 validation items
median: 1.1010782718658447
-> Evaluating 1
scale-ambiguous evaluation:
front
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.177 & 2.176 & 7.423 & 0.264 & 0.773 & 0.916 & 0.963 \\
front_left
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.282 & 3.206 & 7.126 & 0.348 & 0.656 & 0.837 & 0.914 \\
back_left
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.275 & 2.963 & 6.447 & 0.345 & 0.675 & 0.847 & 0.916 \\
back
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.219 & 2.330 & 7.382 & 0.310 & 0.706 & 0.884 & 0.947 \\
back_right
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.351 & 6.161 & 7.489 & 0.393 & 0.632 & 0.828 & 0.905 \\
front_right
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.324 & 5.660 & 7.809 & 0.383 & 0.646 & 0.835 & 0.911 \\
all
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.271 & 3.749 & 7.279 & 0.341 & 0.681 & 0.858 & 0.926 \\
scale-aware evaluation:
front
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.180 & 2.192 & 7.637 & 0.282 & 0.744 & 0.906 & 0.959 \\
front_left
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.261 & 2.751 & 7.099 & 0.371 & 0.646 & 0.832 & 0.908 \\
back_left
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.282 & 2.820 & 6.567 & 0.374 & 0.629 & 0.824 & 0.905 \\
back
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.213 & 2.343 & 7.667 & 0.326 & 0.692 & 0.872 & 0.941 \\
back_right
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.403 & 9.329 & 7.768 & 0.426 & 0.611 & 0.807 & 0.889 \\
front_right
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.339 & 6.971 & 8.062 & 0.407 & 0.643 & 0.823 & 0.898 \\
all
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3 |
& 0.280 & 4.401 & 7.467 & 0.364 & 0.661 & 0.844 & 0.917 \\
I also tried hardcoding min_depth to 0.1 in disp_to_depth while using different settings in the configs, and the results were also reasonable. It seems that min_depth and max_depth are highly dependent on your training setup; I suggest this be mentioned in the documentation. Thanks :)
Hi, I tried to evaluate nuScenes validation set performance with your released nusc_scale model. Since this model should be scale-aware, I expected the scale-aware evaluation results to be similar to those mentioned in README.md. However, the results turned out worse than expected: the scale-ambiguous evaluation scored better than the scale-aware evaluation for a scale-aware model. The relevant output is shown below.
I exported the GT using tools/export_gt_depth_nusc.py with val, and used configs/nusc_scale_pretrain.txt for evaluation (most of the config stays the same, except I changed min_depth to 0.5). Is this reasonable, or am I using it wrong? Thank you.