HKUST-Aerial-Robotics / SIMPL

SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving
MIT License

target-centric and scene-centric #8

Closed by lsr12345 2 months ago

lsr12345 commented 2 months ago

Congratulations on the excellent work! I noticed that the Argoverse2 data processing uses a target-centric coordinate rotation rather than a scene-centric one. I am curious how this choice affects the multi-agent prediction performance reported in the paper. Specifically, if you have run any comparisons between target-centric and scene-centric coordinate rotation, could you please share the findings or insights? Thank you for your attention, and I look forward to any clarification or additional information you can provide.

MasterIzumi commented 2 months ago

@lsr12345 Thanks!

Actually, the normalization step you mentioned is not necessary. The actual inputs of the network are instance-centric features and their RPE; no global coordinates are involved in these inputs. If we override the origin and rotation calculation in data_av2/av2_preprocess.py (L62), the evaluation results stay the same. (Image attached in the original comment.) We keep this code only for visualization and debugging.
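As an illustration of why the choice of origin and rotation drops out, here is a minimal NumPy sketch. It is not the repository's code: the functions `rot`, `relative_poses`, and `to_frame` and the sample poses are hypothetical, and it assumes the relative encoding is derived from pairwise relative poses (the offset expressed in each instance's local frame, plus the heading difference). Under that assumption, re-expressing the whole scene in a target-centric or an AV-centric frame leaves the pairwise quantities unchanged.

```python
import numpy as np

def rot(theta):
    """2D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def relative_poses(xy, heading):
    """Pose of every instance j expressed in the local frame of instance i.

    Returns an (N, N, 3) array holding (dx, dy, dtheta) for each pair.
    """
    n = len(xy)
    rel = np.zeros((n, n, 3))
    for i in range(n):
        r_i = rot(heading[i])
        for j in range(n):
            rel[i, j, :2] = r_i.T @ (xy[j] - xy[i])
            d = heading[j] - heading[i]
            rel[i, j, 2] = np.arctan2(np.sin(d), np.cos(d))  # wrap to (-pi, pi]
    return rel

def to_frame(xy, heading, origin, theta):
    """Re-express global poses in a frame placed at `origin` with orientation `theta`."""
    return (xy - origin) @ rot(theta), heading - theta

# Made-up scene: index 0 is the target agent, index 1 is the AV.
xy = np.array([[10.0, 5.0], [30.0, -2.0], [-4.0, 12.0]])
heading = np.array([0.3, -1.2, 2.0])

xy_t, h_t = to_frame(xy, heading, xy[0], heading[0])  # target-centric
xy_a, h_a = to_frame(xy, heading, xy[1], heading[1])  # AV-centric

rel_target = relative_poses(xy_t, h_t)
rel_av = relative_poses(xy_a, h_a)
print(np.allclose(rel_target, rel_av))
```

Both frames give the same pairwise relative poses, which is consistent with the statement that overriding the origin and rotation does not change the evaluation results.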

For the second question: if you are asking why we center on the target agent in preprocessing rather than on the ego agent (AV), it is mainly because we have to make sure the target agent is included during training and evaluation. In some samples the AV and the target agent are far apart, so if we centered on the AV, the target agent would be out of range. Besides, the official evaluation focuses on the target agent, so we center on it to get better performance on the benchmark. In practice, though, we can simply set the AV as the origin. For example, in the attached video we show qualitative results on the Argoverse tracking dataset, where we crop the surrounding scene centered on the AV, which is closer to the situation in a real autonomous driving system.
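The out-of-range point can be pictured with a small sketch. Everything in it is assumed for illustration: the helper `within_range`, the positions, and the 100 m crop radius are made up and are not taken from the repository or the dataset.

```python
import numpy as np

def within_range(xy, origin, radius):
    """Boolean mask of points whose distance to `origin` is at most `radius`."""
    return np.linalg.norm(xy - origin, axis=1) <= radius

# Made-up positions in meters: AV, target agent, one other agent.
xy = np.array([[0.0, 0.0], [120.0, 40.0], [15.0, -5.0]])
AV, TARGET = 0, 1
RADIUS = 100.0  # hypothetical crop radius

keep_av_centered = within_range(xy, xy[AV], RADIUS)
keep_target_centered = within_range(xy, xy[TARGET], RADIUS)

print("target kept when centered on AV:    ", keep_av_centered[TARGET])
print("target kept when centered on target:", keep_target_centered[TARGET])
```

With these numbers the target agent is more than 100 m from the AV, so an AV-centered crop drops it, while a target-centered crop keeps it by construction.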

lsr12345 commented 2 months ago

THX!

penglo commented 1 month ago

Hello, I also have some questions about coordinate transformation as a beginner. May I ask you some questions about it? If it's convenient for you, could you please leave your email so I can contact you? I would greatly appreciate it. My email is lipl23@mails.jlu.edu.cn.