cocoon2wong / Vertical

[ECCV2022] Official implementation of the paper "View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums"
https://cocoon2wong.github.io/Vertical/
GNU General Public License v3.0

Train-test split in UNIV scene #9

Open · InhwanBae opened this issue 1 month ago

InhwanBae commented 1 month ago

Hi @cocoon2wong ,

Thank you very much for making your excellent work and code available!

While testing your code, I discovered that the train-test split for the UNIV scene is different from that used in other studies. Typically, the set ['students001', 'students003'] is used for testing in the UNIV scene, with the remaining scenes as the training set.

https://github.com/zhangpur/SR-LSTM/blob/0d3a0136e302f0b6f607251a2f40277d1cd70b40/utils.py#L37-L38

self.data_dirs = ['./data/eth/univ', './data/eth/hotel',
                  './data/ucy/zara/zara01', './data/ucy/zara/zara02',
                  './data/ucy/univ/students001','data/ucy/univ/students003',
                  './data/ucy/univ/uni_examples','./data/ucy/zara/zara03']
...
if args.test_set==4 or args.test_set==5:
    self.test_set=[4,5]

In your implementation, however, it seems that only ['students001'] is assigned to the test set, and ['students003'] is used in training.

https://github.com/cocoon2wong/Vertical/blob/178866cf547150dc98d18817713e257aff7429f9/datasets/univ.plist#L5-L22

Given the complexity and predictive challenges of the UNIV scene, excluding it from the training data might adversely affect performance. Could you share the results when the train-test split and dataset are aligned with those of other papers, for an apples-to-apples comparison? Does this issue also affect SocialCircle, which uses the same dataloader?
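For clarity, the conventional leave-one-scene-out protocol for the UNIV split (as in the SR-LSTM snippet quoted above) can be sketched as follows. This is only a minimal illustration; `make_univ_split` is a hypothetical helper and not part of either codebase:

```python
# Hypothetical sketch of the conventional ETH-UCY "univ" split,
# following the SR-LSTM convention quoted above.
SCENES = [
    'eth/univ', 'eth/hotel',
    'ucy/zara/zara01', 'ucy/zara/zara02',
    'ucy/univ/students001', 'ucy/univ/students003',
    'ucy/univ/uni_examples', 'ucy/zara/zara03',
]

def make_univ_split(scenes):
    """Both students001 and students003 go to the test set;
    every remaining scene is used for training."""
    test = [s for s in scenes if 'students' in s]
    train = [s for s in scenes if s not in test]
    return train, test

train, test = make_univ_split(SCENES)
print(test)  # ['ucy/univ/students001', 'ucy/univ/students003']
```

The reported discrepancy is that the repo's univ.plist instead keeps students003 on the training side, so only students001 is held out.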

I appreciate your attention to this matter and look forward to your insights. Thank you again for your contributions to the field!

cocoon2wong commented 1 month ago

Thank you for your questions, @InhwanBae! We apologize for not noticing this until now. We have started training (using the attached file in /datasets; due to GitHub's file-format limitations it has been renamed to a .txt file). This may take some time, and we will get back to you as soon as we have results!

If there are indeed significant differences between these results and those reported in the article, we will correct them in the corresponding journal papers!

univ13.txt

InhwanBae commented 1 month ago

Thank you for your prompt response. I appreciate the authors' efforts to maintain fairness and transparency in the field of human trajectory prediction, and thank you for your assistance.

cocoon2wong commented 1 month ago

Hello again @InhwanBae !

We ran several rough experiments yesterday to determine the approximate performance of the `va` model on the corrected univ13 split. The best test results so far are:

>>> [Train Manager]: Test Results
    - ADE(Metrics): 0.24398671090602875 (meter).
    - FDE(Metrics): 0.43455140495300293 (meter).
    - Average Inference Time: 57 ms.
    - Fastest Inference Time: 46 ms.

These results will appear as 0.24/0.43 in the data table, which changes the ETH-UCY average to 0.18/0.30.
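For reference, the ADE/FDE numbers in the log above are the standard trajectory-prediction metrics: average displacement error over all predicted timesteps, and final displacement error at the last timestep. A minimal NumPy sketch (assuming predictions and ground truth are arrays of shape `(num_agents, num_timesteps, 2)` in meters; `ade_fde` is a hypothetical helper, not code from this repo):

```python
import numpy as np

def ade_fde(pred, gt):
    """ADE: mean L2 displacement over all agents and timesteps.
    FDE: mean L2 displacement at the final timestep only.
    pred, gt: arrays of shape (num_agents, num_timesteps, 2)."""
    dist = np.linalg.norm(pred - gt, axis=-1)  # (num_agents, num_timesteps)
    ade = dist.mean()
    fde = dist[:, -1].mean()
    return ade, fde

# Toy example: one agent with a constant 0.3 m error at every step,
# so ADE and FDE are both 0.3.
pred = np.zeros((1, 12, 2))
gt = np.full((1, 12, 2), [0.3, 0.0])
ade, fde = ade_fde(pred, gt)
print(round(ade, 2), round(fde, 2))  # 0.3 0.3
```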

We have attached the corresponding weights and the split file here ➡️ sample weights.zip. The model was trained with the torch version of the SocialCircle codes (https://github.com/cocoon2wong/SocialCircle/tree/TorchVersion(beta)) and the SocialCirclePlus codes (https://github.com/cocoon2wong/SocialCirclePlus), since our TensorFlow environment on the server was broken due to NVIDIA driver issues (apologies again). You can unzip the attached file, put univ13.plist into the dataset_configs/ETH-UCY/ folder, and then run

python main.py -sc ${PATH_TO_YOUR_UNZIPPED_FILES}/20240724-152139_6e-4_101_-1_vauniv13

to check these results.

Please note that these results are for reference only at this point. We will continue running experiments to determine the most suitable hyperparameters for this split (e.g., the learning rate), as this split is quite "anomalous": the ratio of training to test data is almost 40%:60%, and the distributions of the training and test sets differ considerably, as you mentioned.

As soon as we have the final results, we will update them on the README pages of the repositories involved. During this period, we will leave this issue open.

Once again, we sincerely thank you for your questions and apologize for the mistakes in our work. We will make further efforts to be transparent in all our works. We also appreciate your contributions to this task!

Conghao Wong On behalf of All Authors