Closed xiefan233 closed 5 months ago
What batch size and frame length are you using? The model was trained on 80GB GPUs.
40GB can work, but you really have to pull a lot of memory tricks: use gradient checkpointing, xformers, and call torch.cuda.empty_cache() after every step.
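The tricks above can be sketched in a minimal PyTorch training step. This is a hedged illustration, not the repo's actual fine-tuning code: the tiny `nn.Sequential` model and the placeholder loss are stand-ins, while `checkpoint_sequential` and `empty_cache` are the real PyTorch APIs for the memory-saving techniques mentioned.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

# Hypothetical small model standing in for the actual network being fine-tuned.
model = nn.Sequential(*[nn.Linear(64, 64) for _ in range(8)])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(batch):
    optimizer.zero_grad(set_to_none=True)  # set_to_none frees gradient memory between steps
    # Gradient checkpointing: recompute activations during backward
    # instead of storing them all, trading compute for memory.
    out = checkpoint_sequential(model, segments=4, input=batch, use_reentrant=False)
    loss = out.pow(2).mean()  # placeholder loss for illustration
    loss.backward()
    optimizer.step()
    # Release cached allocator blocks after every step, as suggested above.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return loss.item()

loss = train_step(torch.randn(2, 64, requires_grad=True))
```

On top of this, reducing batch size and frame length is usually the biggest lever; xformers' memory-efficient attention helps further if the model uses attention layers.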
I am using your code for fine-tuning, but I always get a "CUDA out of memory" error. My setup is two 40GB A100 GPUs. What hardware is needed to run your code?