tumurzakov / AnimateDiff

AnimationDiff with train
Apache License 2.0

About training video length #20

Open Vincent-luo opened 2 months ago

Vincent-luo commented 2 months ago

Hello, I noticed that you're able to train on more than 300 frames using an A100 GPU. I'm curious about your training process - are you only training the to_q or the entire motion module?

I've been using the official AnimateDiff training script, and training on just 32 frames consumes about 30GB of VRAM. I'm wondering if you've implemented any optimizations to improve efficiency. It would be helpful if you could share some details about your training setup and any techniques you're using. Thanks!
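The to_q-only variant mentioned above can be sketched in plain PyTorch: freeze every parameter, then re-enable gradients only for the query projections. `TemporalAttention` here is a hypothetical stand-in; the real AnimateDiff motion module uses the same `to_q`/`to_k`/`to_v`/`to_out` naming.

```python
import torch
from torch import nn

# Hypothetical stand-in for one motion-module attention block.
class TemporalAttention(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)
        self.to_out = nn.Linear(dim, dim)

motion_module = TemporalAttention()

# Freeze everything, then unfreeze only the query projections.
for name, param in motion_module.named_parameters():
    param.requires_grad = "to_q" in name

trainable = [n for n, p in motion_module.named_parameters() if p.requires_grad]
print(trainable)  # ['to_q.weight', 'to_q.bias']
```

Only the `to_q` weights receive gradients, so optimizer state and gradient memory shrink accordingly, at the cost of less expressive fine-tuning than training the whole module.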

tumurzakov commented 2 months ago

I'm now training a LoRA at 1024x576x3 and it takes 23.8 GB on my 3090.

  1. memory-offload everything that isn't needed for training (VAE, text encoder)
  2. precache samples (encode latents and embeddings into .pth files)
  3. keep an eye on gradients
  4. I'm using my own framework, latentflow: https://github.com/tumurzakov/latentflow. It can be hard to understand and use, but you could look at the train code. Maybe it will be useful for you.
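Points 1-2 above can be sketched as follows. `DummyVAE` and `DummyTextEncoder` are hypothetical stand-ins so the snippet is self-contained; the caching pattern is the same with real frozen encoders (e.g. diffusers' `AutoencoderKL` and a CLIP text model): encode each sample once under `torch.no_grad()`, save the tensors to disk, then move the encoders off the GPU for the whole training run.

```python
import os
import tempfile
import torch
from torch import nn

class DummyVAE(nn.Module):
    """Stand-in encoder: 8x spatial downsample to a 4-channel latent."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=8, stride=8)

    def encode(self, frames):  # frames: (F, 3, H, W)
        return self.conv(frames)

class DummyTextEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(1000, 32)

    def forward(self, token_ids):
        return self.embed(token_ids)

@torch.no_grad()
def precache_sample(vae, text_encoder, frames, token_ids, path):
    # Encode once and store to disk so the encoders need not stay resident.
    latents = vae.encode(frames)
    embeddings = text_encoder(token_ids)
    torch.save({"latents": latents.cpu(), "embeddings": embeddings.cpu()}, path)

vae, text_encoder = DummyVAE(), DummyTextEncoder()
path = os.path.join(tempfile.mkdtemp(), "sample.pth")
precache_sample(vae, text_encoder,
                torch.randn(16, 3, 64, 64),           # 16 frames of 64x64 video
                torch.randint(0, 1000, (1, 8)), path)

# Point 1: with samples cached, offload the frozen encoders before training
# (on a real setup: vae.to("cpu"); torch.cuda.empty_cache()).
vae.to("cpu"); text_encoder.to("cpu")

cached = torch.load(path)
print(cached["latents"].shape)  # torch.Size([16, 4, 8, 8])
```

During training, each step then loads only the cached latents and embeddings, so VRAM holds just the motion module, its gradients, and the optimizer state.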
Vincent-luo commented 2 months ago

Thanks for the suggestions! I'll give them a try. I've also noticed that the official AnimateDiff code doesn't use gradient checkpointing by default, and enabling it can save a lot of GPU memory.
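A minimal sketch of gradient checkpointing in plain PyTorch (diffusers models expose the same idea via `enable_gradient_checkpointing()`): activations inside checkpointed segments are recomputed during the backward pass instead of being stored, trading extra compute for a large drop in activation memory.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint_sequential

# A toy 8-block model; only activations at the 4 segment boundaries are
# kept in memory, the rest are recomputed during backward.
model = nn.Sequential(*[nn.Sequential(nn.Linear(256, 256), nn.GELU())
                        for _ in range(8)])

x = torch.randn(4, 256, requires_grad=True)
out = checkpoint_sequential(model, 4, x, use_reentrant=False)
out.sum().backward()
print(x.grad.shape)  # torch.Size([4, 256])
```

For long video clips the activation tensors dominate VRAM, which is why checkpointing helps so much more here than for image models.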

tumurzakov commented 2 months ago

Yes, I'm using gradient checkpointing too.