Hi @alantes,
Thanks for your interest. The per-agent coordinate system transformation is conducted inside the forward pass, not in the data preprocessing part.
During data processing, you can pick an arbitrary global coordinate system, and the prediction result will be the same. I set the global coordinate system to be centered at the autonomous vehicle just for my convenience of doing ablation studies.
In the model forward pass, pay attention to the argument data['rotate_mat'] inside local_encoder.py and global_interactor.py. This argument is used to rotate the inputs in an agent-centric manner.
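To make the agent-centric rotation concrete, here is a minimal sketch of how per-agent rotation matrices built from heading angles can be applied to input vectors. The function name and array shapes are illustrative assumptions, not HiVT's actual implementation:

```python
import numpy as np

def rotate_to_agent_frame(vectors, headings):
    """Rotate each agent's vector into that agent's own frame.

    vectors:  [N, 2] displacement vectors in the global frame
    headings: [N] heading angle of each agent (radians)

    Rotating by -heading aligns each agent's heading with the +x axis,
    so every agent sees its own input in a canonical orientation.
    """
    cos, sin = np.cos(-headings), np.sin(-headings)
    # One 2x2 rotation matrix per agent, shape [N, 2, 2]
    rot = np.stack([np.stack([cos, -sin], axis=-1),
                    np.stack([sin, cos], axis=-1)], axis=-2)
    # Batched matrix-vector product: out[n] = rot[n] @ vectors[n]
    return np.einsum('nij,nj->ni', rot, vectors)

# An agent heading north (pi/2) moving along its own heading:
out = rotate_to_agent_frame(np.array([[0.0, 1.0]]), np.array([np.pi / 2]))
# approximately (1, 0): the motion now points along the agent's +x axis
```

In the repository itself this batched rotation is done with a tensor of stacked 2x2 matrices (the `rotate_mat` argument) rather than a standalone helper like this.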
Thank you for your quick response. So I see that rotate_mat comes from rotate_angles, which is calculated in process_argoverse(). Every agent is rotated according to its own heading. Cool!
May I ask where the translation part is? Is there a variable that decides how the coordinates of other agents are translated?
Translation invariance is achieved "automatically".
Looking at this line in data processing, each time step's input token is represented as the difference between consecutive coordinates. If you add a constant to two coordinates, their difference will not change.
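The argument above can be checked in a few lines: shifting the whole trajectory by a constant offset leaves the consecutive-coordinate differences unchanged. The trajectory values here are made up for illustration:

```python
import numpy as np

# A short trajectory expressed in some arbitrary global frame
traj = np.array([[0.0, 0.0], [1.0, 0.5], [2.5, 1.0]])
# The same trajectory in a frame shifted by a constant offset
shifted = traj + np.array([10.0, -3.0])

# Input tokens: differences between consecutive coordinates
tokens = np.diff(traj, axis=0)
tokens_shifted = np.diff(shifted, axis=0)

# The constant offset cancels in the subtraction
assert np.allclose(tokens, tokens_shifted)
```

This is why the choice of global origin during preprocessing does not affect the model's inputs.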
On the other hand, in the attention layer, the relative position between agents is concatenated to the key and value embeddings. The relative position is also a vector that is translation-invariant.
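The same cancellation holds for the pairwise relative positions fed into attention: shifting every agent by the same offset leaves all vectors p_j - p_i unchanged. A small sketch with made-up positions:

```python
import numpy as np

pos = np.random.default_rng(0).normal(size=(4, 2))  # 4 agent positions
shift = np.array([5.0, -2.0])                       # a global translation

# Pairwise relative positions p_j - p_i, shape [N, N, 2]
rel = pos[None, :, :] - pos[:, None, :]
rel_shifted = (pos + shift)[None, :, :] - (pos + shift)[:, None, :]

# The shared shift cancels pairwise, so the attention inputs are identical
assert np.allclose(rel, rel_shifted)
```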
Thank you for your patient reply. That is inspiring!
Hello, Dr. Zhou! Thank you for sharing your elegant code. But I have a question:
It seems that you make the scene centered at the autonomous vehicle in the code, and make the coordinates of other vehicles relative to that origin.
But according to your paper, shouldn't the scene be centered at each separate agent? Where is the code that corresponds to this part?
Thank you in advance!