Finspire13 / CMCS-Temporal-Action-Localization

Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization (CVPR2019)
MIT License
150 stars 17 forks

the size of extracted features #28

Closed simpler66 closed 3 years ago

simpler66 commented 4 years ago

Hi, thank you very much for sharing your code! Quick question about I3D features.

You mention in Section 4.2 (Implementation Details) that I3D takes non-overlapping 16-frame chunks as input for both streams. When I use the released code at https://github.com/piergiaj/pytorch-i3d to extract I3D features, I notice that the network has three layers with a temporal stride of 2 (Conv3d_1a_7x7, MaxPool3d_4a_3x3, MaxPool3d_5a_2x2), so the features I get are temporally downsampled by a factor of 8 (2 × 2 × 2). Did I miss a step, and how do I get features sampled every 16 frames?
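The arithmetic behind the question can be sketched as follows. This is only an illustration, not the repository's extraction pipeline: the helper names `temporal_downsampling` and `chunk_starts` are hypothetical, and it assumes that each 16-frame chunk is passed through the network on its own, that the remaining time steps of a chunk are averaged into a single vector, and that a trailing partial chunk is dropped.

```python
from math import prod

# Temporal strides of the three stride-2 layers named in the question.
TEMPORAL_STRIDES = {
    "Conv3d_1a_7x7": 2,
    "MaxPool3d_4a_3x3": 2,
    "MaxPool3d_5a_2x2": 2,
}


def temporal_downsampling(strides):
    """Overall temporal downsampling factor: the product of the strides."""
    return prod(strides.values())


def chunk_starts(num_frames, chunk_len=16):
    """Start indices of non-overlapping chunks (hypothetical helper).

    Assumption: a trailing partial chunk is dropped.
    """
    return list(range(0, num_frames - chunk_len + 1, chunk_len))


factor = temporal_downsampling(TEMPORAL_STRIDES)

# Time steps left after a single 16-frame chunk passes the three layers.
steps_per_chunk = 16 // factor

# Under the stated assumption, averaging those time steps gives one feature
# per chunk, so a 50-frame video would yield one feature per start index.
starts = chunk_starts(50)
print(factor, steps_per_chunk, starts)
```

Under these assumptions, a dense pass over the whole video would give one feature per 8 frames, while chunk-wise extraction with temporal averaging would give one feature per 16 frames.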

Finspire13 commented 3 years ago

Please refer to #29 and #6.