ali-vilab / UniAnimate

Code for Paper "UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation".
https://unianimate.github.io/
524 stars 29 forks

Strange results during testing? #8

Open zhanghongyong123456 opened 2 weeks ago

zhanghongyong123456 commented 2 weeks ago

0023 0018 How should results like these be handled? Why are the arms so long? Is the pose not aligned during preprocessing?

wangxiang1230 commented 2 weeks ago

Hi, there are two possible reasons:

  1. The pose extraction for the reference image may have failed, that is, the DWPose extraction is inaccurate. You can check whether ref_pose.jpg is correct.
  2. The arms in the target pose may be wrong, for example too long.

wangxiang1230 commented 2 weeks ago

Hi, I used the given reference image to generate videos:

https://github.com/ali-vilab/UniAnimate/assets/49615398/f190377e-0ad7-4cc0-83a1-8c204c0266c9

With the reference image as the first frame:

https://github.com/ali-vilab/UniAnimate/assets/49615398/f9c600b9-323d-41df-b916-0e3e6edd9796

So the DWPose result for the reference image is correct. Please check the target pose for arm problems.
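
One way to check a target pose sequence for over-long arms is to compare each frame's arm-to-torso ratio with that of the reference pose. The snippet below is only a minimal sketch, not code from UniAnimate: it assumes each pose is an `(18, 2)` array of x, y body keypoints in an OpenPose-style order, and the helper names and the `tolerance` value are hypothetical. 2D foreshortening changes these ratios, so flagged frames are candidates for a visual check rather than confirmed errors.

```python
import numpy as np

# Assumed OpenPose-style 18-keypoint body layout; verify against your own pose files.
NECK, R_SHOULDER, R_ELBOW, R_WRIST = 1, 2, 3, 4
L_SHOULDER, L_ELBOW, L_WRIST = 5, 6, 7
R_HIP, L_HIP = 8, 11


def _dist(kp, a, b):
    """Euclidean distance between keypoints a and b."""
    return float(np.linalg.norm(kp[a] - kp[b]))


def arm_to_torso_ratio(kp):
    """Mean arm length (shoulder-elbow-wrist) divided by neck-to-mid-hip length."""
    kp = np.asarray(kp, dtype=float)
    right_arm = _dist(kp, R_SHOULDER, R_ELBOW) + _dist(kp, R_ELBOW, R_WRIST)
    left_arm = _dist(kp, L_SHOULDER, L_ELBOW) + _dist(kp, L_ELBOW, L_WRIST)
    mid_hip = (kp[R_HIP] + kp[L_HIP]) / 2.0
    torso = float(np.linalg.norm(kp[NECK] - mid_hip))
    return 0.5 * (right_arm + left_arm) / torso


def flag_long_arm_frames(ref_kp, target_seq, tolerance=1.3):
    """Indices of target frames whose arm ratio exceeds tolerance * reference ratio."""
    ref_ratio = arm_to_torso_ratio(ref_kp)
    return [
        i for i, kp in enumerate(target_seq)
        if arm_to_torso_ratio(kp) > tolerance * ref_ratio
    ]
```

Frames returned by `flag_long_arm_frames` can then be inspected by eye or dropped from the target sequence.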

zhanghongyong123456 commented 2 weeks ago

Yes, I checked the OpenPose output, and it does indeed have a problem. 0036

wangxiang1230 commented 2 weeks ago

Hi, the quality of the target pose is important, so you can try another, better target pose sequence. In addition, our alignment strategy uses the first frame of the target pose to calculate the scale coefficient, so you need to make sure that the first frame of the target pose is of high quality. Here is another generated example:

https://github.com/ali-vilab/UniAnimate/assets/49615398/59a71a82-c34f-49d7-99c0-753c8248b8c8

The mismatch between the source's tilted face and the target's forward-facing face may also lead to some unsatisfactory results. Thanks for your attention.
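
To illustrate why the first frame of the target pose matters so much, here is a minimal sketch of a first-frame-based rescaling. It is not UniAnimate's actual alignment code: it assumes poses are `(18, 2)` keypoint arrays in an OpenPose-style order, uses torso length as a stand-in for whatever measurements the real strategy uses, and the function names are hypothetical. The point is only that a single coefficient computed from frame 0 is applied to every frame, so an error in frame 0 propagates to the whole sequence.

```python
import numpy as np

# Assumed OpenPose-style keypoint indices; verify against your own pose files.
NECK, R_HIP, L_HIP = 1, 8, 11


def torso_length(kp):
    """Distance from the neck to the midpoint of the hips."""
    kp = np.asarray(kp, dtype=float)
    mid_hip = (kp[R_HIP] + kp[L_HIP]) / 2.0
    return float(np.linalg.norm(kp[NECK] - mid_hip))


def align_sequence(ref_kp, target_seq):
    """Rescale all target frames with one coefficient taken from the first frame.

    ref_kp: (18, 2) reference pose; target_seq: (T, 18, 2) target poses.
    Returns the aligned sequence and the scale coefficient.
    """
    ref_kp = np.asarray(ref_kp, dtype=float)
    target_seq = np.asarray(target_seq, dtype=float)
    first = target_seq[0]
    # The scale depends only on frame 0, so a bad first frame distorts every frame.
    scale = torso_length(ref_kp) / torso_length(first)
    # Scale about the first frame's neck, then move that neck onto the reference neck.
    aligned = (target_seq - first[NECK]) * scale + ref_kp[NECK]
    return aligned, scale
```

Under these assumptions, a first frame whose torso is detected too short yields a scale that is too large, which enlarges the limbs in all later frames.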