MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
https://arxiv.org/abs/2203.12602

Batch size and training epochs do not match the paper #103

Open Sumutan opened 1 year ago

Sumutan commented 1 year ago

Thank you for open-sourcing this project! The fine-tuning script that corresponds to the 1600-epoch pre-training in your provided scripts differs from the configuration given in the appendix of the paper:

1. The total batch size is 512 in the paper (batch size 8 × 8 nodes × 8 GPUs), but 256 in the script (batch size 2 × num_sample 2 × 8 nodes × 8 GPUs).
2. The number of training epochs is reduced from 75 in the paper to 35 in the script.

Is it possible to achieve similar training results despite these differences?
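For reference, the two batch-size totals quoted above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `effective_batch_size` is made up for illustration and is not part of the VideoMAE code base, and the reading of "8 GPU" as GPUs per node is an assumption.

```python
from math import prod


def effective_batch_size(*factors: int) -> int:
    """Multiply the per-GPU batch size by every replication factor.

    Hypothetical helper for illustration only; not a VideoMAE function.
    """
    return prod(factors)


# Paper appendix (as quoted in this issue):
# batch size 8, 8 nodes, 8 GPUs (assumed to mean per node).
paper_total = effective_batch_size(8, 8, 8)

# Provided script (as quoted in this issue):
# batch size 2, num_sample 2, 8 nodes, 8 GPUs (assumed to mean per node).
script_total = effective_batch_size(2, 2, 8, 8)

print(f"paper: {paper_total}, script: {script_total}")
```

Under these assumptions the script's total is half the paper's, which is the gap this issue asks about.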