3. We follow the same way to extract clips from video as the C3D paper says: 'To extract C3D feature, a video is split into 16 frame long clips with a 8-frame overlap between two consecutive clips. These clips are passed to the C3D network to extract fc6 activations. These clip fc6 activations are averaged to form a 4096-dim video descriptor which is then followed by an L2-normalization'
I don't think this is true, because the code simply picks a random index for the first frame of a given sequence. Once the first 16 frames from that index are taken, it moves on to a new sequence. There is no mention of the 50% overlap (8 of 16 frames) anywhere in this code.
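To make the difference concrete, here is a minimal sketch of what I would expect from the quoted description, next to the random-start behaviour I think the code has. The function names (`sliding_clip_starts`, `random_clip_start`, `video_descriptor`) are hypothetical and are not taken from this repository; the clip features are assumed to be plain lists of floats standing in for fc6 activations.

```python
import math
import random


def sliding_clip_starts(num_frames, clip_len=16, overlap=8):
    """Start indices of clips of clip_len frames, with `overlap` frames
    shared between consecutive clips (the scheme in the quoted text)."""
    stride = clip_len - overlap  # 16 - 8 = 8 frames between clip starts
    if num_frames < clip_len:
        return []  # video too short for even one full clip
    return list(range(0, num_frames - clip_len + 1, stride))


def random_clip_start(num_frames, clip_len=16, rng=random):
    """Single random start index, i.e. one clip per video and no overlap
    (what the code appears to do, as far as I can tell)."""
    return rng.randint(0, num_frames - clip_len)


def video_descriptor(clip_features):
    """Average per-clip feature vectors, then L2-normalize the result."""
    n = len(clip_features)
    dim = len(clip_features[0])
    mean = [sum(f[i] for f in clip_features) / n for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    # Leave an all-zero vector untouched to avoid dividing by zero.
    return [x / norm for x in mean] if norm > 0 else mean


if __name__ == "__main__":
    # A 40-frame video yields clips 0-15, 8-23, 16-31 and 24-39.
    print(sliding_clip_starts(40))
```

With the sliding scheme, every frame except those at the very start and end of the video is covered by two clips, whereas a single random start covers only 16 frames of each video.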