luosiallen / Diff-Foley

Diff-Foley: Synchronized Video-to-Audio Synthesis with Latent Diffusion Models
Apache License 2.0

Large batch size question #7

Closed by juliawilkins 8 months ago

juliawilkins commented 9 months ago

In your paper, you state that you train the CAVP model with a batch size of 720 across 8 A100 GPUs, in other words 90 video samples per GPU. I am trying to reproduce your CAVP training pipeline, also on A100s, but I am struggling to scale the per-GPU batch size beyond 12 because of GPU memory limits (CUDA OOM errors). Would you be able to share your data loading code, or provide any tips for increasing the batch size within an A100's memory limits in this sort of framework? Thank you very much!
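For reference, the batch-size arithmetic in the question can be sketched as below. The helper names `per_gpu_batch` and `accumulation_steps` are hypothetical, and gradient accumulation is only one assumed workaround, not something the thread or the paper confirms the authors used. Note also that if the CAVP objective is an in-batch contrastive loss, accumulating gradients over micro-batches matches the sample count per optimizer step but does not enlarge the pool of negatives seen in each forward pass.

```python
import math


def per_gpu_batch(global_batch: int, num_gpus: int) -> int:
    """Split a global batch evenly across GPUs (hypothetical helper)."""
    if global_batch % num_gpus != 0:
        raise ValueError("global batch must divide evenly across GPUs")
    return global_batch // num_gpus


def accumulation_steps(target_per_gpu: int, micro_batch: int) -> int:
    """Micro-batches needed per optimizer step to reach the target per-GPU batch."""
    return math.ceil(target_per_gpu / micro_batch)


# Figures quoted in the question: 720 samples over 8 GPUs, 12 samples fitting per GPU.
target = per_gpu_batch(720, 8)
steps = accumulation_steps(target, 12)
effective_global = steps * 12 * 8

print(f"per-GPU target: {target}")                    # 90
print(f"accumulation steps: {steps}")                 # 8
print(f"effective global batch: {effective_global}")  # 768
```

Because 90 is not a multiple of 12, the rounded-up step count overshoots the paper's figure slightly (768 rather than 720).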

luosiallen commented 8 months ago

Code provided.