Open Serendipity-LC opened 4 years ago
I3D takes a stack of consecutive frames as input. If you want dense features, you need to slide a 64-frame window over the video with a stride of one frame.
However, we still cannot guarantee n_frames*1024 features.
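To make the window arithmetic concrete, here is a minimal sketch of the dense sampling described above. It only builds frame indices; it does not run I3D. The helper name `dense_window_indices` and the `pad` option (centering a window on every frame and replicating edge frames) are my own assumptions for illustration, not something from this repo. It also assumes each 64-frame window is pooled to a single 1024-d vector.

```python
import numpy as np

def dense_window_indices(n_frames, window=64, stride=1, pad=False):
    """Return an (n_windows, window) array of frame indices for a sliding window.

    Hypothetical helper: `pad=True` is an assumed workaround, not part of I3D.
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    if pad:
        # Centre one window on every stride-th frame. Indices that fall
        # outside the video are clamped below, i.e. edge frames are repeated.
        starts = np.arange(0, n_frames, stride) - window // 2
    else:
        if n_frames < window:
            # The video is shorter than one window: no valid window exists.
            return np.empty((0, window), dtype=np.int64)
        # Only windows that fit entirely inside the video.
        starts = np.arange(0, n_frames - window + 1, stride)
    idx = starts[:, None] + np.arange(window)[None, :]
    return np.clip(idx, 0, n_frames - 1)

# Example: a 100-frame video.
valid = dense_window_indices(100)             # windows fully inside the video
padded = dense_window_indices(100, pad=True)  # one window per frame
print(valid.shape, padded.shape)
```

Without padding, a 100-frame video yields 100 - 64 + 1 = 37 windows, so you would get 37 feature vectors rather than 100. With the assumed edge padding you get one window per frame, but the features near the start and end are computed from repeated frames.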
Hi! If n_frames is the number of frames in a video, how can I extract features of size n_frames*1024 with rgb_videos as the input?