Closed murilovarges closed 6 years ago
Yes, it uses temporal jitter, which is implemented here: https://github.com/pytorch/pytorch/blob/master/caffe2/video/video_input_op.h#L235
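The idea behind temporal jittering can be sketched in a few lines: pick a random start frame each time a video is read, so repeated reads of the same video tend to yield different clips. This is an illustrative sketch only; the function name and parameters are hypothetical, not the actual Caffe2 API.

```python
import random

def sample_clip_start(num_frames, clip_length, sampling_rate=1):
    """Temporal jittering sketch: choose a random start frame so that
    each read of the same video is likely to produce a different clip.
    (Hypothetical helper, not the Caffe2 implementation.)"""
    span = clip_length * sampling_rate  # frames covered by one clip
    if num_frames <= span:
        return 0  # video shorter than one clip: always start at frame 0
    return random.randint(0, num_frames - span)

# Reading the same 300-frame video twice will usually give different clips.
start_a = sample_clip_start(300, clip_length=16, sampling_rate=2)
start_b = sample_clip_start(300, clip_length=16, sampling_rate=2)
```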
Thank you @dutran,
In https://github.com/pytorch/pytorch/blob/master/caffe2/core/db.h#L232, if the cursor reaches its end, the reader goes back to the head of the db.
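The wrap-around behavior can be sketched like this: when the cursor runs past the last key, it seeks back to the first one. This is a minimal illustrative class, not the actual Caffe2 `Cursor` interface.

```python
class WrappingCursor:
    """Minimal sketch of a DB cursor that rewinds to the head when it
    reaches the end, mirroring the behavior described for caffe2/core/db.h.
    (Hypothetical class for illustration, not the Caffe2 API.)"""
    def __init__(self, keys):
        self.keys = keys
        self.pos = 0

    def next(self):
        item = self.keys[self.pos]
        self.pos += 1
        if self.pos >= len(self.keys):  # reached the end: seek back to head
            self.pos = 0
        return item

cur = WrappingCursor(["vid0", "vid1", "vid2"])
reads = [cur.next() for _ in range(5)]  # wraps: vid0, vid1, vid2, vid0, vid1
```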
Yes, the reader will pass over the list about 4 times, but due to temporal jittering it is likely to see a different clip even from the same video.
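The "about 4 times" figure follows directly from the numbers in the paper (epoch size 1M, roughly 240k Kinetics training videos); a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the "about 4 passes" figure.
epoch_size = 1_000_000   # clips per epoch, as set in the paper
num_videos = 240_000     # approximate number of Kinetics training videos
passes = epoch_size / num_videos
print(f"{passes:.2f}")   # ≈ 4.17 passes over the dataset per epoch
```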
Hi @dutran, may I ask how much temporal jittering improved accuracy? Did you measure it? Thanks in advance.
Hi,
In the paper, you said "Although Kinetics has only about 240k training videos, we set epoch size to be 1M for temporal jittering".
I would like to know what happens in this case: will the video reader start again from the beginning of the dataset? If so, will it process the dataset about 4 times per epoch, getting a different clip from each video on each pass via temporal jittering?
I tried to find this information by reading the source code in pytorch/caffe2/video/, but I still have doubts.
Thanks, Murilo.