HumanAIGC / AnimateAnyone

Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation
Apache License 2.0

Some problems with my unofficial implementation #43

Open MingtaoGuo opened 8 months ago

MingtaoGuo commented 8 months ago

Hi,

I have unofficially reproduced 'Animate Anyone' based on the description in your paper, but I encountered two issues during training:

Training on a single GPU with a batch size of 2, I have run 8k iterations so far. The backgrounds of the generated images differ noticeably from those of the target images, which are pure white (see the third row in the figure below).

The faces reconstructed by the VAE decoder are distorted. I'm wondering whether the latent diffusion model could be used to recover the information lost by the VAE and correct the distorted faces (a minimal roundtrip check is sketched below the figure). In your video demo the faces look sharp, and I'm not sure how to address this issue.

[Figure: training samples; the third row shows the generated images]
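For reference, here is a minimal sketch of a VAE roundtrip check that isolates how much face detail is lost by encode/decode alone, independent of the denoising UNet. It assumes the diffusers `AutoencoderKL`; the `sd-vae-ft-mse` checkpoint name and the `face_crop.png` path are only illustrative choices, not necessarily what is used in this repo:

```python
# Minimal VAE roundtrip check: how much detail is lost by encode/decode alone,
# before the denoising UNet is involved. Assumes the diffusers AutoencoderKL;
# the checkpoint name below is only an example.
import numpy as np
import torch
from diffusers import AutoencoderKL
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device).eval()

def vae_roundtrip(path: str) -> Image.Image:
    # Load the image and normalize to [-1, 1], the range the VAE expects.
    img = Image.open(path).convert("RGB").resize((512, 512))
    x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0
    x = x.permute(2, 0, 1).unsqueeze(0).to(device)

    with torch.no_grad():
        latents = vae.encode(x).latent_dist.sample()
        recon = vae.decode(latents).sample

    # Map back to [0, 255] uint8 for visual comparison with the input.
    recon = ((recon.clamp(-1, 1) + 1) / 2 * 255).round()
    recon = recon.squeeze(0).permute(1, 2, 0).byte().cpu().numpy()
    return Image.fromarray(recon)

# Example: compare a face crop before and after the roundtrip.
vae_roundtrip("face_crop.png").save("face_crop_roundtrip.png")
```

If the distortion already shows up after this encode/decode alone, the face detail is being lost before the diffusion model ever sees it, so no amount of UNet training can bring it back.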