MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
https://arxiv.org/abs/2203.12602

Add the length of the video when preparing datasets #24

Closed · JiayuZou2020 closed 2 years ago

JiayuZou2020 commented 2 years ago

According to the data-loading code, it seems we should add the length of each video when preparing the datasets. See https://github.com/MCG-NJU/VideoMAE/blob/main/DATASET.md for details. The generated annotation for video datasets should look like: `dataset_root/video_1.mp4 100 label_1`, where 100 is the length of video_1.mp4.
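For reference, here is a minimal sketch of how such a `path length label` annotation file might be generated. It assumes "length" means the frame count and uses `decord` to read it; the helper name `build_annotation` and the file names are hypothetical, not part of the repo.

```python
# Hypothetical helper to build an annotation file in the
# "path length label" format described above.
# Assumes "length" means the number of frames (adjust if the
# repo expects a different definition) and that decord is installed.
from decord import VideoReader

def build_annotation(video_label_pairs, out_path):
    """video_label_pairs: iterable of (video_path, label) tuples."""
    with open(out_path, "w") as f:
        for video_path, label in video_label_pairs:
            vr = VideoReader(video_path)  # open the video for metadata
            num_frames = len(vr)          # total decodable frames
            f.write(f"{video_path} {num_frames} {label}\n")

# Example: produces a line like "dataset_root/video_1.mp4 100 1"
build_annotation([("dataset_root/video_1.mp4", 1)], "train.csv")
```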

yztongzhan commented 2 years ago

Hi @JiayuZou2020! There is no need to add the length of each video. Please check the code below: https://github.com/MCG-NJU/VideoMAE/blob/e93f386cffee601cbb01882024255a1dbc23128f/kinetics.py#L253
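A sketch of the general idea behind the linked loader code (not necessarily the repo's exact implementation): the frame count is read from the video file itself at load time, so the annotation file does not need to carry it. This assumes `decord` is used, as elsewhere in the repo.

```python
# Sketch: derive the video length at load time instead of
# storing it in the annotation file.
from decord import VideoReader, cpu

def get_frame_count(video_path):
    vr = VideoReader(video_path, ctx=cpu(0), num_threads=1)
    return len(vr)  # number of decodable frames in the video
```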

klinic commented 2 years ago

According to the code, for the pre-training stage I agree with your point that we should add the length of each video when preparing the datasets.