happyharrycn / actionformer_release

Code release for ActionFormer (ECCV 2022)
MIT License
419 stars 77 forks

About thumos14 features using TSP #76

Closed weike382 closed 1 year ago

weike382 commented 1 year ago

Hi, thank you for your excellent work. I have a question: how did you extract the THUMOS14 features using TSP? If possible, could you share the details or the features themselves? Looking forward to your reply. Thank you.

tzzcl commented 1 year ago

We directly use the features provided by the official TSP repo. Please refer to the TSP repo for the features. They also have a tutorial on extracting features.

weike382 commented 1 year ago

Yes, I use those too. Details from the TSP repo: the features are extracted from the R(2+1)D-34 encoder pretrained with TSP on THUMOS14 (released model) using clips of 16 frames at a frame rate of 15 fps and a stride of 1 frame (i.e., dense overlapping clips). This gives one feature vector per 1/15 ~= 0.067 seconds. My thoumos_tsp.yaml config is shown below; I modified some parameters, such as file_ext: .pkl, input_dim: 512, feat_stride: 1, max_seq_len: 4608 (the run fails with 2304). [image]

But my results are not good. Could you give me some advice? Thanks. [screenshot, 2022-12-28 09-38-59]
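For reference, the feature timing quoted above can be checked with a small sketch. The helper name `feature_grid` is hypothetical, and the clip count assumes no padding at the ends of the video, which may not match what the TSP extraction script actually does.

```python
def feature_grid(num_frames, clip_len=16, stride=1, fps=15.0):
    """Return (number of clips, seconds between consecutive features).

    Assumption: clips are taken only where a full clip_len window fits,
    with no padding at the start or end of the video.
    """
    num_clips = max(0, (num_frames - clip_len) // stride + 1)
    seconds_per_feature = stride / fps
    return num_clips, seconds_per_feature


# A 10-second video resampled to 15 fps has 150 frames.
n, dt = feature_grid(num_frames=150)
print(n, round(dt, 3))  # 135 0.067
```

With a stride of 1 frame at 15 fps, consecutive features are 1/15 of a second apart, which is why the frame rate seen by the detector has to be 15 rather than the native rate of the video.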

happyharrycn commented 1 year ago

From a quick look at your config file, the video frame rate is not set up properly. When default_fps is not specified for the dataset, the frame rate of each video is loaded from the json file (~30 Hz for THUMOS). For the TSP features you are using, this has to be set to 15.
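To illustrate why the frame rate matters, here is a minimal sketch of how a feature index might be mapped to a timestamp. The formula (feature index times stride, plus half the clip length, divided by fps) and the function name `feature_index_to_seconds` are assumptions made for illustration, not a quote of the repository's code.

```python
def feature_index_to_seconds(idx, feat_stride, num_frames, fps):
    """Map a feature index to the centre time of its clip, in seconds.

    Assumed convention: the clip for feature `idx` starts at frame
    idx * feat_stride and spans num_frames frames.
    """
    return (idx * feat_stride + 0.5 * num_frames) / fps


idx = 300
correct = feature_index_to_seconds(idx, feat_stride=1, num_frames=16, fps=15)
wrong = feature_index_to_seconds(idx, feat_stride=1, num_frames=16, fps=30)
print(round(correct, 2), round(wrong, 2))  # 20.53 10.27
```

Under this assumption, using ~30 fps with features extracted at 15 fps halves every predicted timestamp, so predictions no longer line up with the ground-truth segments.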

weike382 commented 1 year ago

The results are as follows:

|tIoU = 0.30: mAP = 65.02 (%) Recall@1x = 75.18 (%) Recall@5x = 89.61 (%)
|tIoU = 0.40: mAP = 60.46 (%) Recall@1x = 70.56 (%) Recall@5x = 85.72 (%)
|tIoU = 0.50: mAP = 52.23 (%) Recall@1x = 63.50 (%) Recall@5x = 78.22 (%)
|tIoU = 0.60: mAP = 40.66 (%) Recall@1x = 53.29 (%) Recall@5x = 66.92 (%)
|tIoU = 0.70: mAP = 28.60 (%) Recall@1x = 41.99 (%) Recall@5x = 53.77 (%)
Average mAP: 49.39 (%)

Things have improved, but not enough. During training, max_seq_len is 2304. For testing, I changed max_seq_len to 4608; otherwise the run fails with an error. Since I'm not sure whether the rest of my configuration is correct, could I trouble you to provide the thoumos_tsp.yaml config? My email is 1731549205@qq.com and my WeChat is 18344064135. Finally, thank you very much for your answer, and I wish you all the best.

happyharrycn commented 1 year ago

max_seq_len should have no effect at test time. The parameter only controls the sequence length during training, while at test time the full video sequence is fed to the model.

We can look into the config for using TSP features for THUMOS, yet that might happen after the winter break.

tzzcl commented 1 year ago

I did a quick check with the official THUMOS14 features provided by TSP. I can reach 55-56 mAP on the THUMOS14 dataset (just copy the thumos14_i3d config file and modify the default_fps and feature_stride). I think we can definitely tune the hyperparameters to get better results. We may update this config shortly.

Another thing: max_seq_len does affect the evaluation process, because PointGenerator buffers the points, and the number of buffered points is limited by max_seq_len * max_buffer_len_factor (we use 6 for max_buffer_len_factor in the config). If we encounter an extremely long feature sequence (25344 in the TSP THUMOS14 features), which is larger than 2304 * 6 = 13824, it will cause an error. A simple fix is to enlarge max_buffer_len_factor from 6 to 12, which solves the error.
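The arithmetic behind this fix can be sketched as follows. The helper names `buffer_capacity` and `min_buffer_factor` are hypothetical, and the sketch only reproduces the max_seq_len * max_buffer_len_factor limit described above; it does not call into the actual PointGenerator.

```python
def buffer_capacity(max_seq_len, max_buffer_len_factor):
    """Upper bound on the sequence length the point buffer can hold."""
    return max_seq_len * max_buffer_len_factor


def min_buffer_factor(longest_seq_len, max_seq_len):
    """Smallest integer factor whose capacity covers longest_seq_len."""
    return -(-longest_seq_len // max_seq_len)  # integer ceiling division


longest = 25344  # longest TSP THUMOS14 feature sequence quoted above
for factor in (6, 12):
    cap = buffer_capacity(2304, factor)
    print(factor, cap, longest <= cap)
# 6 13824 False
# 12 27648 True

print(min_buffer_factor(longest, 2304))  # 11
```

Note that 25344 is exactly 2304 * 11, so a factor of 11 sits right at the limit; whether that boundary case passes depends on how the length check is written, and 12 leaves some headroom either way.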