MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
https://arxiv.org/abs/2203.12602

Can I fine-tune it on a video dataset of 32 frames? #102

Open Ha0Tang opened 1 year ago

Ha0Tang commented 1 year ago

The VideoMAE pre-trained models are trained with 16 frames per video. Can I fine-tune them on a video dataset with 32 frames per video?
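
For context, here is a rough sketch of why the frame count matters. It assumes the ViT-style setup described in the VideoMAE paper (224x224 input, 16x16 spatial patches, tubelets spanning 2 frames); the helper `num_tokens` is hypothetical and not part of the repository. It only counts how many space-time tokens the encoder would see per clip.

```python
def num_tokens(frames, height=224, width=224, tubelet=2, patch=16):
    """Count the space-time tokens for one clip.

    Assumed defaults (not read from the repo): 224x224 frames,
    16x16 spatial patches, tubelets covering 2 consecutive frames.
    """
    if frames % tubelet or height % patch or width % patch:
        raise ValueError("clip dimensions must be divisible by the tubelet/patch size")
    return (frames // tubelet) * (height // patch) * (width // patch)


tokens_16 = num_tokens(16)  # 8 temporal x 14 x 14 spatial positions
tokens_32 = num_tokens(32)  # 16 temporal x 14 x 14 spatial positions
print(tokens_16, tokens_32)
```

Under these assumptions a 32-frame clip yields twice as many tokens as a 16-frame clip, so anything in the model that is sized for the 16-frame sequence length (for example the positional embedding) would have to cover twice as many positions. Whether the released checkpoints handle that without changes is exactly what I am unsure about.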