cunjunyu / STAR

[ECCV 2020] Code for "Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction"

Positional encoding. #6

Closed segermans closed 3 years ago

segermans commented 3 years ago

Dear authors,

I thoroughly enjoyed reading through your paper 'Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction'. Thanks for the effort.

I do have a question, though. In the Transformer model, a positional encoding is added to the embedded input, as you also mention in Section 2.1. However, I cannot find where this positional encoding is implemented in your spatial and temporal Transformers.

Any help would be highly appreciated. Thanks!

BTW, I also sent an email but got an error back, so I am also posting my question here. Sorry if you have received this question twice.

cunjunyu commented 3 years ago

Thank you for raising the question. TL;DR: we did not use positional encoding.

For the Spatial Transformer: in the spatial dimension, the feature of each node in the graph is its coordinate (position). Thus, we do not need positional encoding, since the feature itself already contains the positional information.

For the Temporal Transformer: in the temporal dimension, we did try positional encoding early on. However, we observed little improvement at the cost of extra computation, so we decided not to use it in the final version. I believe there is still something to explore there.
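
For reference, here is a minimal sketch of what adding the standard sinusoidal positional encoding (Vaswani et al.) to the temporal embeddings could look like. This is illustrative only, not the exact code we experimented with; names such as `TemporalPositionalEncoding`, `d_model`, and the tensor shapes are placeholders.

```python
# Minimal sketch (not from the STAR codebase): standard sinusoidal positional
# encoding that could be added to the embeddings fed into the temporal Transformer.
import math
import torch
import torch.nn as nn


class TemporalPositionalEncoding(nn.Module):
    def __init__(self, d_model: int, max_len: int = 500):
        super().__init__()
        # pe[t, i] encodes time step t with alternating sine/cosine waves.
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        # Shape (max_len, 1, d_model) so it broadcasts over the pedestrian dimension.
        self.register_buffer("pe", pe.unsqueeze(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, num_pedestrians, d_model) temporal embeddings.
        return x + self.pe[: x.size(0)]
```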

Since I have left SenseTime, the email address in the paper is now deactivated. Please email me at cunjun.yu@gmail.com if you prefer to use email. Sorry for the inconvenience caused.

Thank you & Have a nice day.