xuan301 closed this issue 2 years ago
Hi, for the THUMOS14 dataset, we use the features provided by CMCS. The video has around 5094 frames (169.79 s × 30 fps). Each feature point covers 16 frames and the temporal stride is 4, so we get (5094 - 16) / 4 ≈ 1269 feature points for the video.
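A minimal sketch of that arithmetic, in case it helps. The function name `num_feature_points` and the parameter names are hypothetical, and the 16-frame clip length is read off the formula above; this is not the CMCS extraction code itself.

```python
def num_feature_points(num_frames, clip_len=16, stride=4):
    """Count feature points using the formula quoted in this thread:
    floor((num_frames - clip_len) / stride), clamped at zero."""
    return max((num_frames - clip_len) // stride, 0)


# Frame count estimated from the duration and fps given in the question.
frames = round(169.79 * 30)        # 5094
print(num_feature_points(frames))  # 1269
```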
If the video has 20 frames and the temporal stride is 4, shouldn't we get 2 feature points? They would be derived from frames 0-16 and frames 4-20, respectively. But according to (20 - 16) / 4 = 1, we get only 1 feature point.
CMCS uses pytorch-i3d-feature-extraction to extract the features. Based on that code, I think we only get 1 feature point.
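To make the difference between the two counts concrete, here is a rough sketch that lists the window start frames under both conventions. The helper `window_starts` and its `keep_last` flag are hypothetical and only illustrate the counting; they are not taken from pytorch-i3d-feature-extraction, whose behavior is assumed here from the reply above.

```python
def window_starts(num_frames, clip_len=16, stride=4, keep_last=False):
    """Return the start frame of each clip.

    keep_last=False follows the count used in this thread,
    floor((num_frames - clip_len) / stride).
    keep_last=True also keeps the final full window, giving one more clip.
    """
    count = (num_frames - clip_len) // stride
    if keep_last:
        count += 1
    return [i * stride for i in range(max(count, 0))]


print(window_starts(20))                       # [0]
print(window_starts(20, keep_last=True))       # [0, 4]
print(len(window_starts(5094)))                # 1269
print(len(window_starts(5094, keep_last=True)))  # 1270
```

Under these assumptions, only the count without the final window matches the (1269, 2048) feature shape mentioned in this issue.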
OK, thanks for your help~
Thanks for resolving this issue, Chenlin. Mark as closed.
For example, the duration of video_validation_0000051 is 169.79 s and the fps is 30. How do we get 1269 as the first dimension of its I3D feature, (1269, 2048)?