tridivb / slowfast_feature_extractor

Feature Extractor module for videos using the PySlowFast framework
MIT License

video frame_list #6

Closed chen849157649 closed 4 years ago

chen849157649 commented 4 years ago

start = int(index - self.step_size * self.out_size / 2)
end = int(index + self.step_size * self.out_size / 2)

index (int): the video index provided by the PyTorch sampler. What does the equation mean? How can the video index be combined with 'self.step_size * self.out_size / 2'?

tridivb commented 4 years ago

Hello,

Thank you for your interest in the framework. The step size is the frame drop rate: the ratio of the input fps of the given frames or video to the desired output fps.

self.in_fps = cfg.DATA.IN_FPS
self.out_fps = cfg.DATA.OUT_FPS
self.step_size = int(self.in_fps / self.out_fps)
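A minimal standalone sketch of that computation, for illustration only. The helper name `compute_step_size` is hypothetical and not part of the repository; it only mirrors the `int(in_fps / out_fps)` expression above.

```python
def compute_step_size(in_fps: float, out_fps: float) -> int:
    """Frame stride when subsampling from in_fps down to out_fps."""
    if in_fps <= 0 or out_fps <= 0:
        raise ValueError("fps values must be positive")
    # int() truncates toward zero, so a non-integer ratio is rounded down.
    return int(in_fps / out_fps)


print(compute_step_size(60, 30))  # 2
print(compute_step_size(30, 30))  # 1
print(compute_step_size(30, 24))  # 1 (1.25 truncated)
```

Because of the truncation, an output fps higher than the input fps would give a step size of 0 with this expression, so the ratio presumably needs to be at least 1.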

The out_size here is the size of the sampling window, i.e. the number of frames in the pathway that is passed to the network. It is unrelated to out_fps.

For example, consider an input video at 60 fps, a desired output (sampling) rate of 30 fps, and out_size = 32 (cfg.DATA.NUM_FRAMES in the config file). Then step_size = 60 / 30 = 2, and the sampling window is scaled by this step size. Treat the provided index as the center of the window: 16 frames are sampled to the left of the center and 15 to the right, which together with the center frame gives 32. Because the step size is 2, every second frame is sampled.

If the output fps is the same as the input fps, the step size is 1 and the sampling selects only adjacent frames.
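The window arithmetic can be checked with a small sketch. The function `window_indices` is a hypothetical helper, not code from the repository. It assumes frames are taken every `step_size` from `start` (inclusive) to `end` (exclusive), which matches the 16-left / 15-right description, and it does not handle indices near the beginning or end of a video.

```python
from typing import List


def window_indices(index: int, step_size: int, out_size: int) -> List[int]:
    """Frame indices of a window of out_size frames centered on index."""
    start = int(index - step_size * out_size / 2)
    end = int(index + step_size * out_size / 2)
    # Assumption: stride through [start, end) with the given step size.
    return list(range(start, end, step_size))


# 60 fps input, 30 fps output -> step_size 2, window of 32 frames
frames = window_indices(index=100, step_size=2, out_size=32)
print(len(frames))            # 32
print(frames[0], frames[-1])  # 68 130
print(frames.index(100))      # 16 (16 frames to the left, 15 to the right)

# Same input and output fps -> step_size 1, adjacent frames only
adjacent = window_indices(index=100, step_size=1, out_size=32)
print(adjacent[0], adjacent[-1])  # 84 115
```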

tridivb commented 4 years ago

Closing the issue due to lack of follow-up.