How well does the model handle clips of different durations?

MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

https://arxiv.org/abs/2203.12602

Closed: hagonata closed this issue 2 months ago

hagonata commented 9 months ago

Is it okay if one class contains 10-second clips alongside clips that are 2-5 seconds long?