OpenGVLab / VideoMAEv2

[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
https://arxiv.org/abs/2303.16727
MIT License
445 stars · 45 forks

Pretrained smaller models availability #45

Closed ganzobtn closed 3 months ago

ganzobtn commented 7 months ago

Hello. Thank you for the great work.

  1. Could you provide the pretrained ViT-B and ViT-S models?
  2. How much GPU VRAM is required to fine-tune the pretrained ViT-G model on a custom video dataset? When I try to fine-tune it with a batch size of 1 on a V100 with 32 GB of memory, I get a CUDA out-of-memory error. Is there something wrong with what I am doing?

congee524 commented 7 months ago
  1. vit_b_hybrid_pt_800e.pth
  2. We fine-tune ViT-g with batch_size=6 on an 80 GB A100. Kindly check your PyTorch version (the newer, the better), or you could use gradient checkpointing (see the sketch below).

hope it helps.
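
For reference, a minimal sketch of what "gradient checkpointing" means in plain PyTorch: each transformer block is wrapped in `torch.utils.checkpoint`, so its activations are recomputed during the backward pass instead of being stored, trading extra compute for memory. The repository's fine-tuning script may expose its own flag for this; the `Block`/`Encoder` classes and the ViT-g-like dimensions below are illustrative assumptions, not the actual VideoMAE V2 code, and a reasonably recent PyTorch is assumed.

```python
# Hypothetical ViT-style stack used only to illustrate activation checkpointing.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """Stand-in transformer block (illustrative, not the VideoMAE V2 block)."""
    def __init__(self, dim=1408, heads=16):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim)
        )

    def forward(self, x):
        y = self.norm1(x)
        x = x + self.attn(y, y, y, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x

class Encoder(nn.Module):
    def __init__(self, depth=4, use_checkpoint=True):
        super().__init__()
        self.use_checkpoint = use_checkpoint
        self.blocks = nn.ModuleList(Block() for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            if self.use_checkpoint and self.training:
                # Recompute this block's activations in the backward pass
                # instead of keeping them in memory during the forward pass.
                x = checkpoint(blk, x, use_reentrant=False)
            else:
                x = blk(x)
        return x

if __name__ == "__main__":
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = Encoder().to(device)
    # Illustrative token shape only (1568 tokens of width 1408, batch of 1).
    tokens = torch.randn(1, 1568, 1408, device=device)
    loss = model(tokens).mean()
    loss.backward()  # block activations are recomputed here
```

Memory savings scale with model depth, since only block inputs (not intermediate activations) are kept; the cost is roughly one extra forward pass per training step.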