OpenDriveLab / UniAD

[CVPR 2023 Best Paper Award] Planning-oriented Autonomous Driving
Apache License 2.0

Could you provide the configs / code for each task, e.g. Motion Forecasting only? #14

Closed. foolhard closed this issue 1 year ago

foolhard commented 1 year ago

Hello,

Congratulations on your great work, and thanks for sharing the code. In the paper you ran ablations on the effectiveness of each task, as shown below. How do you run the forecasting tasks without the Detection/Tracking task? Could you provide the configs or code for each specific task, e.g. Motion Forecasting only (ID-4), Occ Prediction only (ID-7), and Planning only (ID-10)?

[Figure: Ablations on the effectiveness of each task]

YTEP-ZHI commented 1 year ago

Hi @foolhard, thanks for your interest and kind words.

After submitting our paper, we did several rounds of major code refactoring to make the code clean and readable. As a result, the checkpoints from the ablation experiments can no longer be used in the current repo. Also, due to our limited GPU resources, I'm afraid we cannot afford to re-run all the experiments in a short period of time (it is on our TODO list for sure).

But I can describe how these tasks were implemented, if you are interested. In the ablation studies, we use detection as a substitute when the tracking module is removed, simply because detection queries are formulated similarly to track queries but cannot associate agents across time. So all the motion-, occupancy- and planning-only experiments are conducted with a pretrained BEVFormer model, which is further co-trained with the single task head. Thus the BEV feature and detection queries are available in these ablations.
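To make the substitution concrete, here is a minimal data-flow sketch of how detection queries could stand in for track queries as the agent input of a single-task head. It is not taken from the UniAD repo: `AgentQuery`, `detections_as_agent_queries`, and `build_head_inputs` are hypothetical names used only for illustration, and plain lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AgentQuery:
    """Hypothetical container for one agent query embedding."""
    embedding: List[float]
    # A track query would carry an identity that persists across frames.
    # A detection query has none, so the field stays None.
    track_id: Optional[int] = None


def detections_as_agent_queries(det_embeddings: List[List[float]]) -> List[AgentQuery]:
    """Wrap per-frame detection embeddings so a head can consume them
    in place of track queries (assumption: both share one embedding size)."""
    return [AgentQuery(embedding=e) for e in det_embeddings]


def build_head_inputs(bev_feature, agent_queries, map_queries=None) -> Dict[str, object]:
    """Collect what a single-task head receives in the ablation setting:
    the BEV feature plus agent queries. Map queries are optional, so a
    variant without agent-map interaction simply omits them."""
    inputs = {
        "bev": bev_feature,
        "agent_query": [q.embedding for q in agent_queries],
    }
    if map_queries is not None:
        inputs["map_query"] = map_queries
    return inputs
```

For example, `build_head_inputs(bev, detections_as_agent_queries(dets))` yields inputs with no `map_query` entry and with every agent lacking a `track_id`, which mirrors the "detection in place of tracking" setup described above.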

foolhard commented 1 year ago

@YTEP-ZHI thanks a lot for your kind explanation. It's very clear and comprehensive. I fully understand your situation and how you implemented your ablation tests. Congratulations again on your great work!

ZhoubinXM commented 1 year ago

Hi @foolhard, thanks for your interest and kind words.

After submitting our paper, we did several rounds of major code refactoring to make the code clean and readable. As a result, the checkpoints from the ablation experiments can no longer be used in the current repo. Also, due to our limited GPU resources, I'm afraid we cannot afford to re-run all the experiments in a short period of time (it is on our TODO list for sure).

But I can describe how these tasks were implemented, if you are interested. In the ablation studies, we use detection as a substitute when the tracking module is removed, simply because detection queries are formulated similarly to track queries but cannot associate agents across time. So all the motion-, occupancy- and planning-only experiments are conducted with a pretrained BEVFormer model, which is further co-trained with the single task head. Thus the BEV feature and detection queries are available in these ablations.

  • Motion-only (in MotionFormer): Replace the track query with the detection query in the agent-agent interaction module. Remove the agent-map interaction module, which takes the map query as key and value to update the motion query.
  • Occupancy-only (in OccFormer): The agent feature, which is designed as a fusion of the motion and track queries, is replaced with the detection query.
  • Planning-only (in Planner): Remove the ego query, and use only a learnable embedding to attend to the BEV feature and predict future waypoints.
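The planning-only variant in the last bullet (one learnable embedding attending to the BEV feature, then predicting waypoints) can be sketched roughly as follows. This is a dependency-free illustration under stated assumptions, not the UniAD Planner: `PlanningOnlyHead`, its weight names, and the sizes are hypothetical, the weights are random rather than trained, and in the actual repo this role would be played by PyTorch attention layers.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class PlanningOnlyHead:
    """Hypothetical stand-in: a single plan embedding cross-attends to BEV tokens."""

    def __init__(self, dim=8, num_waypoints=6, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.num_waypoints = num_waypoints
        # The only query. In a real model this would be a trained parameter;
        # no ego query is used anywhere in this head.
        self.plan_embed = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        self.w_key = random_matrix(dim, dim, rng)
        self.w_value = random_matrix(dim, dim, rng)
        self.w_out = random_matrix(num_waypoints * 2, dim, rng)

    def attention_weights(self, bev_tokens):
        """Scaled dot-product scores of the plan embedding against each BEV token."""
        keys = [matvec(self.w_key, token) for token in bev_tokens]
        scores = [
            sum(q * k for q, k in zip(self.plan_embed, key)) / math.sqrt(self.dim)
            for key in keys
        ]
        return softmax(scores)

    def forward(self, bev_tokens):
        """bev_tokens: flattened BEV grid, a list of dim-sized vectors.
        Returns num_waypoints (x, y) pairs."""
        weights = self.attention_weights(bev_tokens)
        values = [matvec(self.w_value, token) for token in bev_tokens]
        # Attention-weighted sum of the value vectors.
        context = [
            sum(w * v[i] for w, v in zip(weights, values)) for i in range(self.dim)
        ]
        flat = matvec(self.w_out, context)
        return [(flat[2 * i], flat[2 * i + 1]) for i in range(self.num_waypoints)]
```

Calling `PlanningOnlyHead().forward(bev_tokens)` on a list of 8-dimensional BEV tokens returns six `(x, y)` waypoints. Whether those outputs are absolute positions or per-step offsets is a design choice this sketch leaves open.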

Hello, if I want to reproduce the MotionFormer-only ablation experiment, how should I change the code in the Motion Head and UniAD Track so that the det query replaces the track query? I also find that some other parameters are still used in MotionFormer, e.g. match idx, sdc embedding...

OrangeSodahub commented 1 month ago

@YTEP-ZHI Hi, thanks for your guidance. I tried the Planning-only method as described: I removed all sdc queries (sdc_track, sdc_traj) and used only navi_embed to attend to the BEV features to get the outputs. However, training does not converge at all: the collision losses all stay at zero, and loss_ade stays around 0-10.

I wonder whether the planning head needs the sdc queries as input, or whether something else is wrong. Do you have any advice?