zhaoyue-zephyrus / AVION

Code release for "Training a Large Video Model on a Single Machine in a Day"
http://arxiv.org/abs/2309.16669
MIT License

RandAugment usage #8

Open rccchoudhury opened 1 month ago

rccchoudhury commented 1 month ago

I noticed that RandAugment is only used in the Kinetics dataloader when fast_rrc is off. Does this mean that RandAugment was not used for pre-training or fine-tuning? I also noticed that even when RandAugment is moved to the GPU along with the other transforms, data loading is noticeably slower. Have you seen this issue before?

zhaoyue-zephyrus commented 1 month ago

Hi @rccchoudhury,

Great point. Yes, RandAugment is not supported in the fast-loading mode, and it is not used for pre-training. For more detail, please refer to Table 9 of the supplementary material of VideoMAE. My hunch is that 80-90% masking already serves as a sufficiently strong augmentation, so you don't need RandAugment in the pre-training phase.
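To give a feel for how aggressive 80-90% masking is, here is a minimal sketch of tube-style masking in plain Python. It is not AVION's or VideoMAE's actual code: the function name `tube_mask` is hypothetical, and the 8 x 14 x 14 token grid is an assumption (16 frames at 224 x 224 with 2 x 16 x 16 tubelets), not something stated in this thread.

```python
import random

def tube_mask(num_temporal, height, width, mask_ratio, seed=None):
    """Return a [num_temporal][height * width] grid of booleans (True = masked).

    Hypothetical helper: one spatial mask is sampled and then repeated
    across every temporal slice, so a masked position stays masked in time.
    """
    rng = random.Random(seed)
    num_spatial = height * width
    num_masked = int(mask_ratio * num_spatial)
    masked_positions = set(rng.sample(range(num_spatial), num_masked))
    spatial_mask = [pos in masked_positions for pos in range(num_spatial)]
    # Copy the same spatial mask for each temporal slice.
    return [list(spatial_mask) for _ in range(num_temporal)]

# Assumed token grid: 8 temporal slices of 14 x 14 spatial tokens.
mask = tube_mask(8, 14, 14, mask_ratio=0.9, seed=0)
total_tokens = sum(len(frame) for frame in mask)
visible_tokens = sum(not m for frame in mask for m in frame)
print(f"{visible_tokens} of {total_tokens} tokens remain visible")
```

Under these assumptions only about a tenth of the tokens reach the encoder, and a different subset is drawn for every sample, which is one way to read the hunch that masking already acts as a strong augmentation.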

For fine-tuning, on the other hand, RandAugment is used. It does take more time than regular augmentations, whether it runs on the CPU or the GPU. However, fine-tuning usually takes far fewer epochs (50 to 100) than pre-training (800 to 1600). Therefore, keeping RandAugment in the fine-tuning stage has little effect on the overall speedup.
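The epoch argument can be made concrete with a rough back-of-the-envelope calculation. The sketch below assumes that a pre-training epoch and a fine-tuning epoch cost the same without RandAugment, and that RandAugment makes each fine-tuning epoch 1.5x slower. Both are illustrative assumptions, not measurements from this repository, and `randaugment_overhead` is a hypothetical helper.

```python
def randaugment_overhead(pretrain_epochs, finetune_epochs, finetune_slowdown):
    """Fraction of extra total training time caused by slower fine-tuning epochs.

    Assumes (hypothetically) that every epoch costs one unit of time
    without RandAugment, in both pre-training and fine-tuning.
    """
    baseline = pretrain_epochs + finetune_epochs
    extra = finetune_epochs * (finetune_slowdown - 1.0)
    return extra / baseline

# Epoch counts taken from the ranges quoted above; the 1.5x slowdown is assumed.
for pretrain, finetune in [(800, 100), (1600, 50)]:
    overhead = randaugment_overhead(pretrain, finetune, finetune_slowdown=1.5)
    print(f"pretrain={pretrain}, finetune={finetune}: +{overhead:.1%} total time")
```

With these numbers the extra cost stays in the low single-digit percent range of the total schedule. The real figure depends on the actual per-epoch costs, which differ between the two stages.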

Hope this clarification helps.

Best, Yue