hehefan / P4Transformer

Implementation of the "Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos" paper.

MIT License

165 stars, 24 forks

Problems about results #17

Open ghost opened 2 years ago

ghost commented 2 years ago

Hi, I tried to run the code to get a baseline result. I didn't change the parameters, and set clip len=24 and batch size=14, but I only got 89.55 accuracy after training for 100 epochs, while the paper reports 90.94. I don't know why the results differ. I have uploaded my training log and hope you can suggest some possible reasons. out_P4.txt

hehefan commented 2 years ago

Hi,

I assume you are using MSRAction-3D.

Due to the small scale of the dataset and the randomness of data augmentation, different experimental environments may cause a fluctuation of about 1%, as discussed in issue. Please try to match your environment to the recommended one.

Alternatively, tuning the hyper-parameters based on your environment may increase the accuracy.
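On the randomness point above, one common way to narrow run-to-run variation is to fix the random seeds before training. The following is a minimal sketch, not code from this repository: `seed_everything` is a hypothetical helper, and it assumes the training script draws its randomness from Python's `random`, NumPy, and PyTorch. Even with fixed seeds, results can still differ across GPUs, drivers, and library versions.

```python
import random


def seed_everything(seed):
    """Seed the random number generators that are available (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


seed_everything(0)
first = random.random()
seed_everything(0)
second = random.random()
print(first == second)  # the same seed reproduces the same draw
```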

Best.

sheshap commented 2 years ago

I did get 90.94% with clip len = 24. However, with clip len = 4, I only got 70.37%, while the paper claims 80.13%. log.txt

hehefan commented 2 years ago

The code is for the 16-frame or 24-frame setting. For short video clips, you need to:

1) disable temporal striding by setting "temporal-stride" to 1.
2) remove temporal padding, i.e., temporal_padding=[0, 0] in msr.py.
3) reduce the temporal kernel size, e.g., setting "temporal-kernel-size" to 1.
4) increase the frame interval, e.g., setting "frame-interval" to 2 or 3.
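To see why the first three settings matter for short clips, it can help to count how many temporal positions survive a strided temporal convolution. The sketch below uses the standard convolution output-length formula. It assumes the temporal dimension in this model behaves like an ordinary strided convolution, and the "long clip" values (kernel size 3, stride 2, padding [1, 0]) are illustrative assumptions rather than values confirmed in this thread. `temporal_output_length` is a hypothetical helper.

```python
def temporal_output_length(num_frames, kernel_size, stride, padding=(0, 0)):
    """Number of temporal positions after one strided temporal convolution."""
    padded = num_frames + padding[0] + padding[1]
    if padded < kernel_size:
        return 0  # the kernel does not fit even once
    return (padded - kernel_size) // stride + 1


# Illustrative long-clip settings (assumed, not taken from the repository).
print(temporal_output_length(24, kernel_size=3, stride=2, padding=(1, 0)))  # 12
print(temporal_output_length(4, kernel_size=3, stride=2, padding=(1, 0)))   # 2

# Short-clip settings suggested above: stride 1, no padding, kernel size 1.
print(temporal_output_length(4, kernel_size=1, stride=1, padding=(0, 0)))   # 4
```

Under these assumptions, a 4-frame clip keeps only 2 temporal positions with the long-clip settings, whereas the short-clip settings keep all 4.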

sheshap commented 2 years ago

Hi, Thanks for your update.

I have made changes 1, 2, and 3. Could you let us know where to set 4, the frame-interval?

Thanks.

hehefan commented 2 years ago

Sorry... I have updated the code. Please check train-msr.py and msr.py.
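For readers unsure what a frame interval does, here is a minimal sketch of clip sampling with a stride between selected frames. It is an illustration under assumptions, not the code in msr.py: `clip_frame_indices` and `valid_clip_starts` are hypothetical helpers, and the actual dataset class may index or pad clips differently.

```python
def clip_frame_indices(start, frames_per_clip, frame_interval=1):
    """Indices of the frames picked for one clip, spaced frame_interval apart."""
    return [start + i * frame_interval for i in range(frames_per_clip)]


def valid_clip_starts(num_video_frames, frames_per_clip, frame_interval=1):
    """Start indices for which the whole clip stays inside the video."""
    span = (frames_per_clip - 1) * frame_interval + 1  # frames covered by one clip
    return list(range(0, num_video_frames - span + 1))


print(clip_frame_indices(0, 4, frame_interval=1))  # [0, 1, 2, 3]
print(clip_frame_indices(0, 4, frame_interval=3))  # [0, 3, 6, 9]
print(len(valid_clip_starts(10, 4, frame_interval=1)))  # 7
print(len(valid_clip_starts(10, 4, frame_interval=3)))  # 1
```

With an interval of 3, a 4-frame clip covers 10 frames of the original video, so a short clip sees a longer stretch of motion at the cost of fewer valid starting positions.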

sheshap commented 2 years ago

Thank you so much. It is working now for frames = 4 and 8. Do these rules also apply to other models like PSTNet, PST-Transformer, and PSTNet2? I mean for experiments that involve 4 and 8 frames.

hehefan commented 2 years ago

Correct.