ViT-S and ViT-H models on huggingface

MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

https://arxiv.org/abs/2203.12602

ViT-S and ViT-H models on huggingface #88

Open sandstorm12 opened 1 year ago

sandstorm12 commented 1 year ago

Hello.

Do you plan on releasing the Kinetics pre-trained weights for VideoMAE-small and VideoMAE-huge on your Hugging Face repo? Are they publicly available?