HHTseng / video-classification

Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101

Variable Length sequences #5

Closed digbose92 closed 5 years ago

digbose92 commented 5 years ago

Since the videos are composed of different numbers of frames, the inputs to the LSTM from the CNN encoder network will have different lengths within the same batch. How does the network handle the variable lengths of the videos? There is no mention of padding anywhere in the code.

lee-man commented 5 years ago

For ResNetCRNN, it uses frames from 1 to 29 as input.

HHTseng commented 5 years ago

Thank you @digbose92. @lee-man is right, I used a fixed frame size of 29 (frame number: 1~29). Sorry that I didn't make it very clear. The repo describes it this way: the minimal frame number 28 is the consensus of all videos in UCF101. However, I do have the code for variable length with ResNetCRNN; please give me 1 week or so to organize the code.

digbose92 commented 5 years ago

Hi @HHTseng, so for all the videos in UCF101, 29 is the minimum number of frames. So each batch has a size of (batch size, 29, embedding size) if batch_first is used? And when the batches are created for different videos, only the first 29 frames are used? For videos with more than 29 frames, there is truncation. The variables begin_frame, end_frame and skip_frame are not updated.

HHTseng commented 5 years ago

Yes, @digbose92. Everything you mentioned is correct:
(1) True: each batch has a size of (batch size, 29, embedding size) if batch_first is used.
(2) Yes: when the batches are created for different videos, only the first 29 frames are used.
(3) True: videos with more than 29 frames are truncated.
(4) True: the variables begin_frame, end_frame and skip_frame are not updated.
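
The fixed-length scheme confirmed above can be sketched as follows. This is an illustration, not the repository's code: the helper `select_frame_indices` is hypothetical, the names `begin_frame`, `end_frame` and `skip_frame` are borrowed from the thread, and treating the end index as inclusive is an assumption. The thread mentions both 28 and 29 frames, so the clip length here is simply whatever the arguments produce.

```python
def select_frame_indices(begin_frame, end_frame, skip_frame, num_frames):
    """Return the 1-based frame numbers used for one video.

    Hypothetical helper: every video is cut to the same window, so longer
    videos are truncated and every clip in a batch has equal length.
    Assumes end_frame is inclusive, as in "frame number: 1~29".
    """
    if num_frames < end_frame:
        raise ValueError("video is shorter than the fixed window")
    return list(range(begin_frame, end_frame + 1, skip_frame))


# Three videos of different lengths all yield the same frame numbers.
video_lengths = [29, 75, 310]
clips = [select_frame_indices(1, 29, 1, n) for n in video_lengths]
clip_lengths = [len(c) for c in clips]
```

Because every clip ends up with the same number of frames, the per-frame CNN features can be stacked into one tensor per batch without any padding.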

digbose92 commented 5 years ago

Hi @HHTseng, thanks for the clarification.

HHTseng commented 5 years ago

No problem, @digbose92! Thanks for your questions too.

abhiray92 commented 3 years ago

Hi @HHTseng, can we have the code for videos with a variable length/number of frames?

abhiray92 commented 3 years ago

@digbose92, did you manage to handle videos with varying numbers of frames?
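
For readers with the same question, one common way to feed variable-length sequences to an LSTM in PyTorch is to pad each batch and then pack it. The sketch below is not the code HHTseng refers to. It uses random tensors in place of CNN features, and the sizes and the names `features`, `lengths` and `last_hidden` are made up for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)
embed_size, hidden_size = 8, 16  # illustrative sizes, not the repo's

# Stand-ins for per-frame CNN features of three videos with 5, 3 and 7 frames.
features = [torch.randn(n, embed_size) for n in (5, 3, 7)]
lengths = torch.tensor([f.shape[0] for f in features])

# Zero-pad to the longest video: shape (batch, max_len, embed_size).
padded = pad_sequence(features, batch_first=True)

# Pack so the LSTM skips the padded steps; enforce_sorted=False
# lets the batch stay in its original order.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
packed_out, (h_n, c_n) = lstm(packed)

# Per-step outputs, padded back to (batch, max_len, hidden_size).
outputs, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

# Hidden state at each video's true last frame: (batch, hidden_size).
last_hidden = h_n[-1]
```

With packing, `last_hidden` reflects each video's real final frame rather than a padded step, so it can be passed to a classifier head in the same way as in the fixed-length case.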