KwaiVGI / LivePortrait

Bring portraits to life!
https://liveportrait.github.io

Does the method proposed in this paper support driving the source image with explicit keypoints? #83

Open Kanbo0409 opened 1 month ago

Kanbo0409 commented 1 month ago

Since implicit keypoints are used in this paper, I'd like to know whether explicit keypoints, for example 106 2D landmarks, can be used to drive the source image.

cleardusk commented 1 month ago

Yep, I think it can support that. But 2D landmarks are ambiguous, especially under large poses.
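
To illustrate the ambiguity, here is a minimal sketch under simplifying assumptions: an orthographic camera, a pure yaw rotation, and a made-up set of four 3D points standing in for a face (not real landmark data, and not code from LivePortrait). It shows two different 3D configurations that yield identical 2D landmarks.

```python
import math

def yaw_rotate(points, theta):
    """Rotate 3D points about the vertical (y) axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def project_2d(points):
    """Orthographic projection: keep (x, y) and discard depth z."""
    return [(x, y) for x, y, _ in points]

# Hypothetical toy "face": a handful of 3D points, chosen only for illustration.
face = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, -0.5, 0.8), (0.0, -1.5, 0.3)]
# Same shape with its depth flipped.
mirrored = [(x, y, -z) for x, y, z in face]

theta = math.radians(40)
pose_a = yaw_rotate(face, theta)        # original shape, yaw +40 degrees
pose_b = yaw_rotate(mirrored, -theta)   # depth-flipped shape, yaw -40 degrees

lm_a = project_2d(pose_a)
lm_b = project_2d(pose_b)

# The 2D landmarks coincide even though the 3D poses differ.
same_2d = all(
    math.isclose(a, b, abs_tol=1e-9)
    for pa, pb in zip(lm_a, lm_b)
    for a, b in zip(pa, pb)
)
print("identical 2D landmarks:", same_2d)
print("identical 3D points:", pose_a == pose_b)
```

The larger the yaw angle, the more the two 3D interpretations diverge in depth while remaining indistinguishable in 2D, which is one way to read "ambiguous, especially under large poses".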

cleardusk commented 1 month ago

Many diffusion-based methods use 2D landmarks or 3D-to-2D projected landmarks as the condition to animate.

johndpope commented 1 month ago

FYI: I recreated Samsung AI's MegaPortraits at https://github.com/johndpope/MegaPortrait-hack. It has fast warping and doesn't use keypoints: https://github.com/johndpope/MegaPortrait-hack/issues/36

Microsoft mention in their VASA paper that they use this original ResNet-50 implementation (as opposed to the more recent MetaPortrait): https://github.com/johndpope/MegaPortrait-hack/issues/16. I'm hoping the EmoPortraits code drops this month, which will clear up some things.

Kanbo0409 commented 1 month ago

Thanks for your answer. I also have some questions about formula (3) mentioned in the paper. 1) What are the indices of the 2D landmarks and implicit keypoints in formula (2)? 2) Are the 2D landmarks extracted from the source image, the driving image, or both?

zzzweakman commented 1 month ago

In formula (2), the implicit keypoints are all 3D; there are no 2D landmarks included. Here, x_s and x_d represent the source and driving 3D implicit keypoints, respectively, while x_c,s denotes the canonical keypoints of the source image.
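
As a rough sketch of how these quantities relate, the code below assumes formula (2) has the form x = s * (x_c,s R + delta) + t, applied once with the source motion parameters and once with the driving ones. That form is an assumption stated here for illustration and should be checked against the paper; the function name, the parameter values, and the single keypoint are all hypothetical and are not taken from the LivePortrait code.

```python
def transform_keypoints(x_c, R, delta, t, s):
    """Apply x = s * (x_c @ R + delta) + t to a list of 3D keypoints.

    x_c:   K canonical keypoints, each [x, y, z]
    R:     3x3 rotation matrix (nested lists)
    delta: K per-keypoint expression offsets, each [dx, dy, dz]
    t:     translation [tx, ty, tz]
    s:     scalar scale
    """
    out = []
    for kp, d in zip(x_c, delta):
        # Row vector times matrix: rotated[j] = sum_i kp[i] * R[i][j]
        rotated = [sum(kp[i] * R[i][j] for i in range(3)) for j in range(3)]
        out.append([s * (rotated[j] + d[j]) + t[j] for j in range(3)])
    return out

# Hypothetical values: one canonical keypoint of the source image.
x_c_s = [[1, 2, 3]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
no_expr = [[0, 0, 0]]

# Source keypoints: source pose, expression, translation, and scale.
x_s = transform_keypoints(x_c_s, identity, no_expr, [0, 0, 0], 1)
# Driving keypoints: the SAME canonical keypoints, but driving motion parameters.
x_d = transform_keypoints(x_c_s, identity, no_expr, [1, 0, 0], 2)

print(x_s)
print(x_d)
```

Under this reading, both x_s and x_d are built from the source's canonical keypoints, and only the motion parameters differ, so no 2D landmarks enter the computation.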

Kanbo0409 commented 1 month ago

Thanks for your reply. I'm sorry, I made a typo. My questions are about formula (3), not formula (2). @zzzweakman