abduallahmohamed / Social-STGCNN

Code for "Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction" CVPR 2020
MIT License

[Question about the paper] Permutation effect? #43

Closed: moonsh closed this issue 3 years ago

moonsh commented 3 years ago

First of all, thank you for sharing your impressive work.

While reading your paper, I came up with some questions.

  1. How did you choose an input sequence length of 8 frames and an output length of 12 frames? Did these lengths give the best performance?

  2. I wonder how the permutation (the order of the data) affected training. Did this data order give the best performance? How did you decide on it?

  3. Relative distance: how did you weight a node's influence when agents are far from each other? Even if two agents are far apart, if they move toward the same goal from opposite positions, I think they will get a very high weight because of the relative location.
    I would expect them to get a low weight in this case because they are far away.

Thank you,

abduallahmohamed commented 3 years ago

Hi @moonsh , thanks for asking!

  1. The 8/12 split is the benchmark setting for the ETH/UCY datasets. All previous and later work uses the same setting. I'm not aware of why it is exactly 8/12, but my intuition is this: given observations of 8 steps, are you able to predict the next 8 (which is fair), and how good are you beyond those 8 (the remaining 4)?
  2. The order of the pedestrians is the same as parsed from the ETH/UCY datasets. Nonetheless, in the paper we mentioned that the permutation at the input stage doesn't affect the results.
  3. I used the same 1/L2 for all weights. I also agree that the weighting should change based on direction, which we didn't account for in our work at the time of publication. I recall a later work on arXiv, https://arxiv.org/pdf/2011.09214.pdf, that built upon ours and discussed this problem in particular.
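
To make points 2 and 3 concrete, here is a minimal sketch of a 1/L2 weighting, not the repository's actual code. It assumes each pedestrian is represented by one 2D vector per time step and that the weight between two pedestrians is the inverse Euclidean distance between their vectors, with 0 on the diagonal and for identical vectors. The function names `inverse_l2_adjacency` and `permute` and the sample vectors are hypothetical.

```python
import math


def inverse_l2_adjacency(vectors):
    """Build an N x N matrix with a[i][j] = 1 / ||v_i - v_j||_2.

    Assumption for this sketch: the diagonal and any pair of identical
    vectors get weight 0, which also avoids dividing by zero.
    """
    n = len(vectors)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = math.dist(vectors[i], vectors[j])
            adj[i][j] = 0.0 if dist == 0 else 1.0 / dist
    return adj


def permute(matrix, order):
    """Reorder rows and columns of a square matrix with the same index order."""
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical vectors: two pedestrians 1 unit apart, a third one 10 units away.
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 10.0)]
adj = inverse_l2_adjacency(vectors)

# The weight depends only on the distance between the two vectors,
# not on the direction in which either pedestrian is heading.
near_weight = adj[0][1]  # distance 1
far_weight = adj[0][2]   # distance 10

# Feeding the pedestrians in a different order only reorders the rows and
# columns of the matrix; the pairwise weights themselves are unchanged.
order = [2, 0, 1]
adj_reordered = inverse_l2_adjacency([vectors[k] for k in order])
same_weights = adj_reordered == permute(adj, order)
```

In this sketch the weight between two pedestrians is a function of the distance between their vectors alone, so heading direction plays no role, which matches the limitation acknowledged in point 3. Reordering the pedestrians leaves every pairwise weight intact, which illustrates the kind of input permutation mentioned in point 2.
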
moonsh commented 3 years ago

Thank you! It's very clear. There is one thing I am still confused about. In Figure 2, just before the ST-GCNN, "V" is PxTxN, and this "V" is changed to TxP^xN before being passed to the TXP-CNN. What's the reason for this? Did P^xTxN not train well?

moonsh commented 3 years ago

I found the same question here: #29

Thank you very much. @abduallahmohamed