Open DataAIPlayer opened 1 month ago
Why don't you lower your image resolution? With a batch size of 1, 512 x 512 x 25 frames runs on an A6000, which has 48 GB of VRAM.
I am experimenting with fine-tuning motion LoRA and need to generate videos at a resolution of 1024x576. Do you mean that training motion LoRA at a lower resolution can achieve camera control effects when inferring at a higher resolution?
Oh, if you must have that resolution, this might not help :( Using xformers and gradient checkpointing might help, though.
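For a rough sense of how much heavier 1024x576 is than 512x512, here is a back-of-the-envelope sketch. It assumes the VAE downsamples each side by a factor of 8 (the usual Stable Diffusion-style factor) and only counts latent positions at the U-Net's top level. The function name `latent_tokens` is made up for illustration, and the ratios are an estimate, not a measurement.

```python
# Rough comparison of latent "token" counts at two resolutions.
# Assumption: the VAE downsamples width and height by 8.
VAE_DOWNSAMPLE = 8


def latent_tokens(width, height, frames, factor=VAE_DOWNSAMPLE):
    """Return (tokens per frame, tokens over all frames) in latent space."""
    per_frame = (width // factor) * (height // factor)
    return per_frame, per_frame * frames


low_pf, low_total = latent_tokens(512, 512, 25)
high_pf, high_total = latent_tokens(1024, 576, 25)

# Activations that grow with the number of latent positions
# (convolutions, memory-efficient attention) scale roughly linearly.
linear_ratio = high_pf / low_pf

# Naive spatial attention materializes a tokens x tokens score matrix
# per frame, so that part scales roughly quadratically.
quadratic_ratio = linear_ratio ** 2

print(f"tokens per frame: {low_pf} vs {high_pf}")
print(f"linear scaling: {linear_ratio}x, naive attention: {quadratic_ratio}x")
```

Under these assumptions the higher resolution has 2.25x as many latent positions per frame, and naive attention score matrices would be about 5x larger. That is one reason memory-efficient attention (xformers) and gradient checkpointing are worth trying before anything else.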
Is there an open source SVD fine-tuning method?
I tried fine-tuning the SVD U-Net with LoRA, and even with a batch size of 1 it runs out of memory on an A100 80GB GPU when the dataset consists of 25-frame videos. I also tried DeepSpeed, but it did not help. Does this mean that model-parallel training is required, with the model parameters distributed across multiple GPUs?
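One way to check whether splitting parameters across GPUs would help is to estimate how much memory the weights and optimizer state take on their own. The sketch below is only an illustration: the parameter counts are placeholder assumptions (roughly 1.5 billion for the frozen U-Net and 20 million for the LoRA weights, neither taken from this thread), and the byte counts assume fp16 frozen weights plus fp32 weights, gradients, and two Adam moments for the trainable part.

```python
GIB = 1024 ** 3


def static_memory_gib(base_params, lora_params,
                      base_bytes=2, trainable_bytes=16):
    """Estimate memory for weights and optimizer state, in GiB.

    base_bytes=2: frozen weights stored in fp16.
    trainable_bytes=16: fp32 weight (4) + gradient (4) + two Adam moments (8).
    """
    base = base_params * base_bytes / GIB
    lora = lora_params * trainable_bytes / GIB
    return base, lora


# Placeholder assumptions, not measured values.
ASSUMED_BASE_PARAMS = 1_500_000_000
ASSUMED_LORA_PARAMS = 20_000_000

base_gib, lora_gib = static_memory_gib(ASSUMED_BASE_PARAMS, ASSUMED_LORA_PARAMS)
print(f"frozen weights: {base_gib:.2f} GiB")
print(f"LoRA weights + grads + Adam state: {lora_gib:.2f} GiB")
print(f"total static: {base_gib + lora_gib:.2f} GiB of 80 GiB")
```

If the real numbers are anywhere near these, weights and optimizer state take only a few GiB, so the out-of-memory error would be coming mostly from activations. In that case, sharding parameters or optimizer state across GPUs would free little memory, and reducing activation memory (gradient checkpointing, memory-efficient attention, fewer frames, or lower resolution) is more likely to make a difference.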