HumanAIGC / AnimateAnyone

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
Apache License 2.0

What is the minimum required GPU memory? #26

Open ygtxr1997 opened 8 months ago

ygtxr1997 commented 8 months ago

I reproduced the code and trained it on a V100 32GB, but OOM still occurred even with batch_size=1, image_resolution=128x128, and fp16 AMP training.
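For anyone hitting the same wall: beyond fp16 AMP, gradient checkpointing is the usual next knob to turn. A minimal PyTorch sketch of the two settings together (the `TinyBlock` module is a stand-in for illustration, not the actual denoising U-Net from the paper):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

device = "cuda" if torch.cuda.is_available() else "cpu"
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

class TinyBlock(nn.Module):
    # Stand-in for a U-Net block; the point is the two memory knobs below.
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        # Knob 1: gradient checkpointing drops intermediate activations and
        # recomputes them during backward, trading compute for memory.
        return checkpoint(self.net, x, use_reentrant=False)

model = TinyBlock().to(device)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(1, 64, device=device)  # batch_size=1, as in the report
# Knob 2: autocast runs the forward pass in reduced precision.
with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = model(x).pow(2).mean()
scaler.scale(loss).backward()
scaler.step(opt)
scaler.update()
```

In `diffusers`-style training loops this usually corresponds to calling `unet.enable_gradient_checkpointing()` before the loop; it will not save you if the peak is in the attention layers themselves, though, which is the reworking discussed below.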

ksasso1028 commented 8 months ago

Looking at the paper, it seems they used 4 A100s (not sure whether 40GB or 80GB); my guess is 80GB. @ygtxr1997

ksasso1028 commented 8 months ago

https://arxiv.org/pdf/2311.17117.pdf#:~:text=In%20this%20paper%2C%20we%20present,%2D%20modate%20multi%2Dframe%20inputs.

ksasso1028 commented 8 months ago

@ygtxr1997 can you share? I have some ideas on how to get training working by reworking the architecture a bit, basically replacing the attention layers.
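One concrete version of "replacing the attention layers" is swapping a naive attention implementation (which materializes the full token-by-token score matrix) for PyTorch's fused `scaled_dot_product_attention`. A sketch of the swap, with illustrative function names that are not from this repo:

```python
import torch
import torch.nn.functional as F

def naive_attention(q, k, v):
    # Materializes the full (N x N) score matrix -- the memory hot spot
    # at high resolutions / long token sequences.
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
    return scores.softmax(dim=-1) @ v

def efficient_attention(q, k, v):
    # Fused kernel (PyTorch >= 2.0); avoids keeping the full score
    # matrix in memory while computing the same result.
    return F.scaled_dot_product_attention(q, k, v)

q = k = v = torch.randn(1, 8, 256, 64)  # (batch, heads, tokens, head_dim)
out_a = naive_attention(q, k, v)
out_b = efficient_attention(q, k, v)
print(torch.allclose(out_a, out_b, atol=1e-4))
```

The same effect can be had in `diffusers` without touching the model code, e.g. via its xFormers/SDPA attention-processor options.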

ygtxr1997 commented 8 months ago

> @ygtxr1997 can you share ? I have some ideas on how to get training working by reworking the architecture a bit. Basically replacing the attention layers.

Sorry, I cannot share the code due to confidentiality. My implementation is mainly based on diffusers and very similar to MagicAnimate's code, for your reference.

mahicool commented 8 months ago

The demo of MagicAnimate won't work; it keeps saying "Error: This application is too busy. Keep trying!"


ygtxr1997 commented 8 months ago

@mahicool The Hugging Face Space seems to be in heavy use right now. You can duplicate it to your own private Space with a GPU, or clone the code to your own machine or Colab (which is what I did).

MingtaoGuo commented 8 months ago

I reimplemented the code by revising the official ControlNet repository, using the SD 1.5 U-Net backbone; it costs about 33GB of GPU memory with a batch size of 2.
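That 33GB figure is plausible from back-of-envelope arithmetic alone. A sketch of my own (assuming the commonly cited ~860M parameter count for the SD 1.5 U-Net, and fp32 weights/gradients plus AdamW's two fp32 moment buffers):

```python
# Where fixed training state goes, before counting any activations:
# weights (4 B/param) + gradients (4 B/param) + AdamW exp_avg and
# exp_avg_sq (8 B/param) = 16 bytes per trainable parameter.
def train_state_gib(n_params, param_bytes=4, grad_bytes=4, adam_state_bytes=8):
    total_bytes = n_params * (param_bytes + grad_bytes + adam_state_bytes)
    return total_bytes / 1024**3

sd15_unet_params = 860_000_000  # approximate, widely cited figure
print(round(train_state_gib(sd15_unet_params), 1))  # ~12.8 GiB
```

So roughly 13 GiB is consumed before a single activation is stored; activations at batch size 2, plus the ControlNet branch, can easily account for the remaining ~20GB.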

ggenny commented 8 months ago

> I reimplemented the code by revising the official ControlNet repository, using the U-Net backbone sd1.5, which costs about 33GB of GPU memory with a batch size of 2.

I also used this approach, though with some differences from the original paper: 2x 48GB A40 cards, fp16, no OOM problems. I took very little inspiration from MagicAnimate. The most interesting project release is AnimateAnyone-unofficial.