Stability-AI / generative-models

Generative Models by Stability AI

MIT License

Is anyone trying to train their own SVD-based Camera Motion LoRA model? #357

Open DataAIPlayer opened 1 month ago

DataAIPlayer commented 1 month ago

I tried fine-tuning the SVD U-Net with LoRA, and even with a batch size of 1, training runs out of memory on an A100-80G GPU when the dataset consists of 25-frame videos. I also tried DeepSpeed, but it did not help. Does this mean that model-parallel training must be used, distributing the model parameters across multiple GPUs?
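
One possible reading of why LoRA alone does not avoid the out-of-memory error: LoRA shrinks the number of trainable parameters (and with them the gradients and optimizer state), but the activations saved for the backward pass still grow with resolution and frame count. The sketch below only does the parameter arithmetic for a single linear layer. The layer width `1280` and rank `16` are hypothetical values chosen for illustration, not numbers taken from this thread or from the SVD configuration.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Weights in a dense d_out x d_in linear layer (bias ignored)."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical layer width and LoRA rank, for illustration only.
d = 1280
rank = 16

full = full_params(d, d)
lora = lora_trainable_params(d, d, rank)
fraction = lora / full

print(f"full layer:    {full:,} weights")
print(f"LoRA factors:  {lora:,} trainable weights")
print(f"trainable fraction: {fraction:.3%}")
```

Under these assumptions the trainable weights drop to a few percent of the layer, yet nothing in this calculation touches activation memory, which is what the replies below about resolution and gradient checkpointing are aimed at.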

tykim0507 commented 1 month ago

Why don't you lower your image resolution? With a batch size of 1, 512 x 512 at 25 frames runs on an A6000, which has 48 GB of VRAM.

DataAIPlayer commented 1 month ago

I am experimenting with fine-tuning a motion LoRA and need to generate videos at a resolution of 1024x576. Do you mean that a motion LoRA trained at a lower resolution can still produce camera control effects when running inference at a higher resolution?
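
To put the two resolutions in this exchange side by side, here is a rough scaling sketch. It assumes the latent autoencoder downsamples each spatial dimension by 8 (labeled as an assumption in the code) and that activation memory grows at least linearly with the number of latent positions. It is a back-of-the-envelope comparison, not a measurement of either training run.

```python
# Assumption: 8x spatial downsampling between pixels and latents.
VAE_DOWNSAMPLE = 8
FRAMES = 25


def latent_positions(width: int, height: int, frames: int,
                     downsample: int = VAE_DOWNSAMPLE) -> int:
    """Number of latent positions (spatial positions times frames) per video."""
    return (width // downsample) * (height // downsample) * frames


low = latent_positions(512, 512, FRAMES)
high = latent_positions(1024, 576, FRAMES)

# Memory that scales linearly with the number of positions.
linear_ratio = high / low

# Naive (non memory-efficient) spatial attention stores a matrix that is
# quadratic in the number of positions per frame.
naive_attention_ratio = ((high // FRAMES) / (low // FRAMES)) ** 2

print(f"512x512x25:  {low:,} latent positions")
print(f"1024x576x25: {high:,} latent positions")
print(f"linear scaling factor:          {linear_ratio}")
print(f"naive attention scaling factor: {naive_attention_ratio}")
```

If the 512 x 512 run were close to filling 48 GB, a 2.25x increase in activation memory would already overshoot 80 GB, which is consistent with the out-of-memory error reported above. How close that run actually is to the 48 GB limit is not stated in the thread.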

tykim0507 commented 1 month ago

Oh, if you must have that resolution, lowering it won't help :( Using xformers and gradient checkpointing might help, though.
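
As a rough illustration of why gradient checkpointing is suggested here: instead of keeping every layer's activations for the backward pass, only the inputs of a few segments are kept and each segment is recomputed when its gradients are needed. The sketch below is a simplified counting model of that trade-off, not a training implementation. The layer count `64` is hypothetical, and every layer is assumed to produce activations of equal size.

```python
import math


def stored_without_checkpointing(num_layers: int) -> int:
    """Every layer's activation is kept until the backward pass."""
    return num_layers


def stored_with_checkpointing(num_layers: int, segment_len: int) -> int:
    """Keep one saved input per segment, plus the activations of the
    single segment that is being recomputed during the backward pass."""
    num_segments = math.ceil(num_layers / segment_len)
    return num_segments + segment_len


# Hypothetical depth, for illustration only.
layers = 64

baseline = stored_without_checkpointing(layers)

# A segment length near sqrt(num_layers) balances the two terms.
segment = round(math.sqrt(layers))
checkpointed = stored_with_checkpointing(layers, segment)

print(f"no checkpointing:   {baseline} activations held")
print(f"segment length {segment}:   {checkpointed} activations held")
```

The price is roughly one extra forward pass per training step. In Hugging Face diffusers, models expose `enable_gradient_checkpointing()` and `enable_xformers_memory_efficient_attention()`; whether a particular SVD training script calls them is not something this thread confirms.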

openchao commented 1 month ago

Is there an open source SVD fine-tuning method?

DataAIPlayer commented 1 month ago

https://github.com/alibaba/animate-anything/blob/main/train_svd.py