A question about AAEncoder

ZikangZhou / HiVT

[CVPR 2022] HiVT: Hierarchical Vector Transformer for Multi-Agent Motion Prediction

Apache License 2.0

Thanks for contributing such amazing work! I have one question. When computing the cross-attention between the center agent and its neighbor agents, why is rotate_mat indexed with edge_index[1] for x_j (the neighbor agents) rather than with edge_index[0]? As far as I understand, edge_index[0] holds the source nodes, i.e., the center agents, and edge_index[1] holds the target nodes, i.e., the neighbor agents. Since the goal is to rotate each neighbor agent by the center agent's heading angle \theta, I would expect rotate_mat[edge_index[0]] to be the rotation matrix parametrized by the center agent's \theta, and therefore the one that should be applied to the neighbors.

https://github.com/ZikangZhou/HiVT/blob/6876656ce7671982ebdc29113aaaa028c2931518/models/local_encoder.py#L184
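To make the indexing in question concrete, here is a minimal pure-Python sketch. It assumes the encoder follows PyTorch Geometric's default flow="source_to_target", under which x_j is gathered with edge_index[0] and the receiving node i is given by edge_index[1]; whether the linked line relies on exactly this convention is an assumption here, not something confirmed in this thread. The toy graph, the angles, and the helper names (rotation, row_times_matrix, gather) are hypothetical and only illustrate the gather pattern.

```python
import math

# Hypothetical toy graph: node 0 is the center agent, nodes 1 and 2 are its neighbors.
# Under PyG's default flow="source_to_target", row 0 lists the sources j
# (message senders) and row 1 lists the targets i (message receivers).
edge_index = [
    [1, 2],  # edge_index[0]: sources j, here the neighbor agents
    [0, 0],  # edge_index[1]: targets i, here the center agent
]

# Illustrative per-node heading angles and 2D features (made-up values).
theta = [math.pi / 2, 0.0, math.pi]
x = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]


def rotation(angle):
    """Matrix R such that the row vector v @ R is v expressed in a frame rotated by angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def row_times_matrix(v, m):
    """Multiply a 2D row vector by a 2x2 matrix."""
    return [v[0] * m[0][0] + v[1] * m[1][0], v[0] * m[0][1] + v[1] * m[1][1]]


def gather(values, index):
    """Pick one entry of values per edge, like values[index] on a tensor."""
    return [values[k] for k in index]


rotate_mat = [rotation(t) for t in theta]

# Neighbor features are gathered with the source row ...
x_j = gather(x, edge_index[0])
# ... while the rotation comes from the node that receives the message.
center_rotate_mat = gather(rotate_mat, edge_index[1])

# Each neighbor feature expressed in the frame of the center agent on that edge.
local = [row_times_matrix(v, m) for v, m in zip(x_j, center_rotate_mat)]
```

Under that assumption, edge_index[1] is the center agent on every edge, so rotate_mat[edge_index[1]] already selects the center agent's rotation, while rotate_mat[edge_index[0]] would select each neighbor's own rotation.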

A question about AAEncoder #29