autonomousvision / carla_garage

[ICCV'23] Hidden Biases of End-to-End Driving Models
MIT License

Why do we need different settings for different benchmarks? #9

Closed: Naive-Bayes closed this issue 1 year ago

Naive-Bayes commented 1 year ago

Thanks for such wonderful work and for releasing the corresponding code, datasets, and models.

I do not understand why we need different settings for different benchmarks. The README says: "Models have inference options that can be set via environment variables. For the longest6 model you need to set export UNCERTAINTY_THRESHOLD=0.33, for the LAV model export STOP_CONTROL=1 and for the leaderboard model export DIRECT=0"

Does it mean that we use a different method for each benchmark? Furthermore, does it mean that the method which works well on Longest6 does not work so well on LAV? What do these settings mean, and how do they differ?

Thank you!

Kait0 commented 1 year ago

Maybe you should check out the appendix.

As for STOP_CONTROL=1: LAV is the only benchmark that considers stop sign infractions, so this feature makes no sense for the other two benchmarks. Hence it is turned off by default and you need to turn it on for LAV. The impact of the stop controller is discussed in Table 11.

As for UNCERTAINTY_THRESHOLD=0.33: Table 14 investigates the impact of this threshold on the LAV benchmark. Since it doesn't matter much on that benchmark, I left it at the default of 0.5. On longest6, this threshold does make a difference of a few (~3-4) DS due to the dense traffic, so I tuned it to 0.33. Even if you leave it at the default value you will still get a state-of-the-art result, but the instructions above are meant to reproduce the results in the paper.

As for DIRECT=0: In principle, the code supports models with either path + target speed (DIRECT=1) or waypoints (DIRECT=0) as the output representation (though I did not end up using that feature), so you need to pick which one to use during inference. As discussed in C.6, CARLA Leaderboard, the model there is a TF++ Waypoint model, which is indeed different from the other TF++ models (which use path + target speed as output). You need to set DIRECT to 0, otherwise the inference code will crash, because the path + target speed outputs are not trained for this model.
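To make the three options concrete, here is a minimal sketch of how inference code could read them from the environment. It is not the repository's actual implementation: the function name `read_inference_options` and the returned dictionary keys are hypothetical, and the default of `DIRECT=1` is an assumption inferred from the explanation above. The 0.5 threshold default and the stop controller being off by default are taken from this thread.

```python
import os


def read_inference_options(env=None):
    """Parse the three inference options from an environment mapping.

    Hypothetical helper for illustration only; not part of carla_garage.
    """
    env = os.environ if env is None else env
    return {
        # Default of 0.5 as stated in this thread; 0.33 was tuned for longest6.
        "uncertainty_threshold": float(env.get("UNCERTAINTY_THRESHOLD", "0.5")),
        # The stop controller is off unless explicitly enabled (needed for LAV).
        "stop_control": int(env.get("STOP_CONTROL", "0")) == 1,
        # Assumed default 1 (path + target speed); 0 selects waypoints.
        "direct": int(env.get("DIRECT", "1")),
    }


if __name__ == "__main__":
    print(read_inference_options())
```

Reading every option in one place with an explicit default makes it easy to see which values a given run actually used.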

"Does it mean that the method which works well on Longest6 does not work so well on LAV?"

Not really, the difference is only in the controller. You can also use STOP_CONTROL=1 on longest6; it just would not do anything.
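The per-benchmark settings from the README can be summarized as a small lookup table. This is only a sketch: the names `BENCHMARK_OVERRIDES` and `env_for_benchmark` and the lowercase benchmark keys are hypothetical and not part of the repository. Only the variable names and values come from the README.

```python
import os

# Overrides quoted from the README; every other option stays at its default.
BENCHMARK_OVERRIDES = {
    "longest6": {"UNCERTAINTY_THRESHOLD": "0.33"},
    "lav": {"STOP_CONTROL": "1"},
    "leaderboard": {"DIRECT": "0"},
}


def env_for_benchmark(benchmark, base_env=None):
    """Return a copy of the environment with the benchmark's overrides applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(BENCHMARK_OVERRIDES[benchmark])
    return env


if __name__ == "__main__":
    print(env_for_benchmark("lav", base_env={}))
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching an evaluation script, which is equivalent to running the `export` commands from the README in the shell first.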

Naive-Bayes commented 1 year ago

Thanks for your answer. Now I understand the influence of STOP_CONTROL and UNCERTAINTY_THRESHOLD.

What's more, I found 3 different results on the CARLA Leaderboard v1.0. As I understand it, "TF++ WP" (DS 61.57) refers to DIRECT=0 and "TF++" (DS 52.82) refers to DIRECT=1. Is that right?

Kait0 commented 1 year ago

Yes, this is correct. They are also different models: one trained with path + target speed as the output representation and one with waypoints. The leaderboard shows 2 more results because they were not finished when I put the paper on arXiv. I will include them in the paper once I have finished the camera-ready version for ICCV.