Closed by rashidch 4 months ago
Hi @rashidch
Thanks for your interest!
As you mentioned, we use one encoder for both the trajectory information and the person ID. Because trajectory data and person IDs are compact compared with other visual cues (e.g., pose keypoints), encoding them together is more efficient than encoding them independently with two separate modules.
Hope this brief explanation helps you understand!
Hi @yanggao2000
Can you also release the code and prepared data for the ETH/UCY datasets and the Pedestrians and Cyclists in Road Traffic dataset?
Hi @rashidch
Unfortunately, we currently don't have plans to release the processed data for those datasets. However, you can prepare it yourself by adapting our code for preparing the JTA dataset (https://drive.google.com/drive/folders/1iIp2B5y85OKZ7DkW7gBkmulUxMWW5SkY), since we used the same method for those datasets.
Hi,
thank you for your awesome research project.
Could you please explain why the trajectory and person ID encodings are used together?
Why do you add the learned encoding to the even channels (0 to 128), x[:, :, :, 0:half*2:2], and the person encoding to the odd channels (1 to 128), x[:, :, :, 1:half*2:2]?
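For anyone reading along, here is a minimal sketch of what the two slices in the question select. It is not taken from the repository: the function names, the `d_model` parameter, and the idea that the last axis is the channel axis are assumptions made for illustration. It uses plain Python lists in place of a 4-D tensor, because the slice pattern on the last axis works the same way.

```python
def interleaved_channels(d_model):
    """Return the channel indices picked by the two slices in the question.

    d_model is a hypothetical name for the size of the last axis.
    """
    half = d_model // 2
    channels = list(range(d_model))
    even = channels[0:half * 2:2]  # same pattern as x[:, :, :, 0:half*2:2]
    odd = channels[1:half * 2:2]   # same pattern as x[:, :, :, 1:half*2:2]
    return even, odd


def add_interleaved(x, first_enc, second_enc):
    """Add first_enc to the even channels and second_enc to the odd channels.

    x is a single feature vector; each encoding has len(x) // 2 entries.
    """
    half = len(x) // 2
    out = list(x)
    for k in range(half):
        out[2 * k] += first_enc[k]
        out[2 * k + 1] += second_enc[k]
    return out


even, odd = interleaved_channels(8)
print(even)  # [0, 2, 4, 6]
print(odd)   # [1, 3, 5, 7]
print(add_interleaved([0.0] * 8, [1.0] * 4, [10.0] * 4))
# [1.0, 10.0, 1.0, 10.0, 1.0, 10.0, 1.0, 10.0]
```

The two slices never overlap, and together they cover every channel below `half*2`, so each encoding gets its own half of the feature vector while both are written into the same tensor.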