Sushant-aggarwal closed this issue 5 years ago
Hi @Sushant-aggarwal, the "time_start" and "time_end" columns of the Kinetics .csv files contain time indices in seconds that refer to positions in the original YouTube video. If you use the Kinetics video crawler from the original ActivityNet repository, you'll notice that the crawler trims each video so that the final downloaded file contains only that specific time segment.
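To make the time-segment idea concrete, here is a minimal sketch of how one annotation row could be turned into a trim command. It is not the ActivityNet crawler's actual code. It assumes the second column is named `time_end`, that the CSV also has `label`, `youtube_id` and `split` columns, and that `ffmpeg` is the trimming tool; the sample row, the file names and both helper functions are made up for illustration.

```python
import csv
import io

# Hypothetical sample in the assumed Kinetics CSV layout (values are made up).
SAMPLE_CSV = """label,youtube_id,time_start,time_end,split
example_action,VIDEO_ID_PLACEHOLDER,12,22,train
"""


def segment_from_row(row):
    """Return (start, duration) in seconds for one annotation row."""
    start = float(row["time_start"])
    end = float(row["time_end"])
    if end <= start:
        raise ValueError("time_end must be greater than time_start")
    return start, end - start


def ffmpeg_trim_command(src, dst, start, duration):
    """Build (but do not run) an ffmpeg command that keeps only the segment."""
    return ["ffmpeg", "-ss", f"{start:.2f}", "-i", src,
            "-t", f"{duration:.2f}", dst]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
start, duration = segment_from_row(rows[0])
cmd = ffmpeg_trim_command("full_video.mp4", "clip.mp4", start, duration)
print(" ".join(cmd))
```

The command is only printed here, so the sketch runs without ffmpeg or any downloaded video.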
What is the procedure for the temporal crop from each video in the UCF-101 dataset? Did you randomly take 16 consecutive frames? If not, how did you take 16 frames from one video?
For all training runs, TemporalRandomCropping is applied, which takes 16 sequential frames starting from a randomly selected position in the video. Please also check out the downsampling option in temporal_transforms. You can find the implementation of the temporal augmentation in "temporal_transforms.py".
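For readers who want to see the idea without opening the repository, here is a minimal sketch of a temporal random crop with an optional downsampling factor. It is not the code from "temporal_transforms.py". The class name `TemporalRandomCropSketch`, its parameters, and the loop-padding of short videos are assumptions made for illustration.

```python
import random


class TemporalRandomCropSketch:
    """Pick `size` frame indices from a random position in a video.

    Hypothetical stand-in for the repository's transform; assumes a
    non-empty list of frame indices.
    """

    def __init__(self, size=16, downsample=1):
        self.size = size              # number of frames in the output clip
        self.downsample = downsample  # keep every `downsample`-th frame

    def __call__(self, frame_indices):
        span = self.size * self.downsample
        # Latest start position that still leaves room for a full clip.
        max_begin = max(0, len(frame_indices) - span)
        begin = random.randint(0, max_begin)
        clip = list(frame_indices[begin:begin + span:self.downsample])
        # Assumption: videos shorter than the clip are padded by looping.
        i = 0
        while len(clip) < self.size:
            clip.append(clip[i])
            i += 1
        return clip


crop = TemporalRandomCropSketch(size=16)
frames = list(range(1, 101))  # a video with 100 frames, numbered from 1
clip = crop(frames)
print(clip)  # 16 consecutive indices; the start differs from call to call
```

With `downsample=2` the same class would return every second frame over a 32-frame window, which is one way to read the downsampling option mentioned above.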
So for all the epochs, did you select the same random portion of the video, or are different random consecutive frames taken from the video in each epoch?
@Sushant-aggarwal It is the latter option. The temporal augmentation is applied every time a batch is created. So yes, in each epoch a new randomly selected portion of the video (consecutive frames) is fed to the network.
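A minimal sketch of why the crop changes between epochs: when the random start is drawn at sample-loading time, every access to the same video draws again. The class `ClipDatasetSketch` and its arguments are hypothetical, and it returns frame indices rather than decoded images to stay self-contained.

```python
import random


class ClipDatasetSketch:
    """Toy dataset: each item is a list of consecutive frame indices."""

    def __init__(self, video_lengths, clip_len=16):
        self.video_lengths = video_lengths  # number of frames per video
        self.clip_len = clip_len

    def __len__(self):
        return len(self.video_lengths)

    def __getitem__(self, idx):
        n = self.video_lengths[idx]
        # The random start is drawn here, so it is re-drawn on every access.
        begin = random.randint(0, max(0, n - self.clip_len))
        return list(range(begin, min(begin + self.clip_len, n)))


dataset = ClipDatasetSketch(video_lengths=[300])
for epoch in range(3):
    clip = dataset[0]  # same video, fetched once per epoch
    print(f"epoch {epoch}: frames {clip[0]}..{clip[-1]}")
```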
I am closing this issue as it has been resolved.
What do begin_index and end_index signify in the Kinetics dataset annotations in the CSV file? I have observed that they are neither frame numbers nor times in seconds, so what exactly are they? Also, do you temporally cut the videos? Thank you.