Thinklab-SJTU / Bench2Drive

[NeurIPS 2024 Datasets and Benchmarks Track] Closed-Loop E2E-AD Benchmark Enhanced by World Model RL Expert
Apache License 2.0
1.24k stars 79 forks

Closed-loop evaluation resume #20

StevenJ308 closed 2 months ago

StevenJ308 commented 3 months ago

Hi, thanks for making the code public. I'm having some problems reproducing the closed-loop evaluation. Since I only have a single GPU, I run the evaluation directly in debug mode, but it never gets through all 220 routes before crashing. When I rerun it, the evaluation starts from the beginning again. I see that resume=True is set in the shell script, so why does it still start from the beginning? What should I do to avoid re-evaluating routes that have already been evaluated? Thank you very much!

jayyoung0802 commented 2 months ago

resume=True works well. It uses the JSON file to determine the route ID at which the evaluation resumes, so please keep the file path unchanged. If one route always crashes (this may be caused by agent behavior), you can skip it and manually decrement 'progress' by 1.
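For anyone who wants to check what the result JSON currently records before rerunning, here is a minimal sketch. The key names `_checkpoint`, `progress`, `records`, and `status` are assumptions about the file layout, and `summarize_checkpoint` is a hypothetical helper, not part of the repository; adjust the names to match your own file.

```python
import json
from collections import Counter


def summarize_checkpoint(text):
    """Return (progress, status counts) from a result-JSON string.

    Assumed layout (not confirmed by the repository):
    {"_checkpoint": {"progress": [...], "records": [{"status": "..."}]}}
    """
    data = json.loads(text)
    checkpoint = data.get("_checkpoint", {})
    progress = checkpoint.get("progress", [])
    records = checkpoint.get("records", [])
    # Count how many recorded routes ended in each status.
    statuses = Counter(record.get("status", "unknown") for record in records)
    return progress, dict(statuses)


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    sample = json.dumps({
        "_checkpoint": {
            "progress": [2, 220],
            "records": [{"status": "Completed"}, {"status": "Failed"}],
        }
    })
    print(summarize_checkpoint(sample))
```

In practice you would read the text from the file that the shell script passes as the checkpoint path, for example with `open(path).read()`.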

StevenJ308 commented 2 months ago

Thank you very much for your answer. I looked at the code again and realized that the while loop in the _validate_andresume function checks the status of each tested route. My JSON file shows the status of the initial route as Failed, which is why the test restarted from the beginning every time.
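To illustrate the behavior described above, here is a rough sketch of one way such a status check could pick the resume point. This is not the repository's actual code: `first_route_to_run`, the `status` key, and the set of accepted status strings are all assumptions made for illustration.

```python
def first_route_to_run(records, ok_statuses=("Completed",)):
    """Return the index of the first record whose status is not accepted.

    Hypothetical helper: walks the records in order and stops at the
    first one that did not finish with an accepted status.
    """
    index = 0
    while index < len(records) and records[index].get("status") in ok_statuses:
        index += 1
    return index


if __name__ == "__main__":
    # Made-up records: the very first route is marked Failed, so the
    # computed resume index is 0 and evaluation would start over.
    records = [{"status": "Failed"}, {"status": "Completed"}]
    print(first_route_to_run(records))
```

Under these assumptions, a Failed status on the first record is enough to send the evaluation back to route 0, which matches the symptom reported in this issue.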