MCG-NJU / VideoMAE

[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
https://arxiv.org/abs/2203.12602

The GPU memory usage of UCF and Kinetics is different. #109

Open Backdrop9019 opened 1 year ago

Backdrop9019 commented 1 year ago

Hello, and thank you for the code. When fine-tuning with your code on the UCF and Kinetics datasets, I see a large difference in GPU memory usage. Both runs use the same number of frames (16), the same batch size, and the same dataset class, yet Kinetics uses almost three times as much GPU memory as UCF: with a batch size of 16 on an RTX 3090, Kinetics consumes about 22GB while UCF uses only around 8GB. Can you explain the reason for this discrepancy?
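A first sanity check, offered as a rough sketch rather than a diagnosis: confirm that the clips reaching the model really have the same shape for both datasets. The helper below is hypothetical (`describe_batch` is not part of the VideoMAE repo). It assumes fp32 inputs in `(B, C, T, H, W)` layout and the 2×16×16 cube embedding described in the paper, and the 224×224 resolution in the example is also an assumption.

```python
from math import prod

def describe_batch(shape, bytes_per_element=4, tubelet=2, patch=16):
    """Summarize a (B, C, T, H, W) video batch.

    Hypothetical helper: tubelet=2 and patch=16 follow the cube embedding
    described in the VideoMAE paper; bytes_per_element=4 assumes fp32.
    """
    b, c, t, h, w = shape
    # Number of tokens the ViT encoder sees for one clip.
    tokens = (t // tubelet) * (h // patch) * (w // patch)
    # Raw size of the input tensor only (not activations or gradients).
    input_mib = prod(shape) * bytes_per_element / 2**20
    return {"tokens_per_clip": tokens, "input_mib": input_mib}

# Batch size 16, 16 frames, assumed 224x224 crops for both datasets.
print(describe_batch((16, 3, 16, 224, 224)))
```

If the shape printed from each dataloader gives the same numbers here, the input itself is unlikely to explain the gap, and the remaining differences between the two fine-tuning configurations are the next place to look.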