Open Kanbo0409 opened 1 month ago
Yep, I think that is supported. But 2D landmarks are ambiguous, especially under large poses.
Many diffusion-based methods use 2D landmarks or 3D-to-2D projected landmarks as the condition to animate.
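To illustrate the ambiguity mentioned above, here is a minimal sketch that assumes an orthographic 3D-to-2D projection and a made-up three-point "face" (none of these names or values come from the paper). Two faces with different depth profiles give identical 2D landmarks when frontal, yet they diverge once the head turns, so frontal 2D landmarks alone do not pin down the 3D shape.

```python
import math

def rotate_yaw(points, degrees):
    """Rotate 3D points (x, y, z) about the vertical (y) axis."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def project_2d(points):
    """Orthographic projection (an assumption): keep x and y, discard depth z."""
    return [(x, y) for x, y, _ in points]

def close(a, b, tol=1e-9):
    """True if two lists of 2D points agree coordinate by coordinate."""
    return all(abs(u - v) <= tol for p, q in zip(a, b) for u, v in zip(p, q))

# Hypothetical landmarks: nose tip and two eye corners, with two depth profiles.
flat_face = [(0.0, 0.0, 0.5), (0.5, 0.2, 0.2), (-0.5, 0.2, 0.2)]
deep_face = [(x, y, 2.0 * z) for x, y, z in flat_face]

# Frontal view: the 2D landmarks cannot tell the two faces apart.
frontal_match = close(project_2d(flat_face), project_2d(deep_face))

# After a 60-degree yaw the same two faces project to different 2D landmarks.
turned_match = close(project_2d(rotate_yaw(flat_face, 60)),
                     project_2d(rotate_yaw(deep_face, 60)))

print(frontal_match, turned_match)
```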
FYI, I recreated Samsung AI's MegaPortraits: https://github.com/johndpope/MegaPortrait-hack. It has fast warping and doesn't use keypoints: https://github.com/johndpope/MegaPortrait-hack/issues/36
Microsoft mention in their VASA paper that they use this original ResNet-50 implementation (as opposed to the more recent MetaPortrait): https://github.com/johndpope/MegaPortrait-hack/issues/16. I'm hoping the EmoPortraits code drops this month, which will clear up some things.
Thanks for your answer. I also have another question about formula (3) in the paper. 1) What are the indices of the 2D landmarks and implicit keypoints in formula (2)? 2) Are the 2D landmarks extracted from the source image, the driving image, or both?
In formula (2), the implicit keypoints are all 3D; there are no 2D landmarks included. Here, x_s and x_d represent the source and driving 3D implicit keypoints, respectively, while x_c,s denotes the canonical keypoints of the source image.
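For readers without the paper at hand, here is a rough sketch of how these three quantities could relate. It assumes formula (2) has the face-vid2vid-style form x = s · (x_c R + δ) + t, with both x_s and x_d derived from the same canonical keypoints x_c,s. That form, the function names, and all numeric values below are assumptions for illustration and are not quoted from this thread.

```python
import math

def yaw_matrix(degrees):
    """3x3 rotation about the y axis, laid out for row-vector products (p @ R)."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def transform_keypoints(x_c, R, delta, scale, t):
    """Assumed form: x = scale * (x_c @ R + delta) + t, applied per 3D keypoint."""
    out = []
    for p, d in zip(x_c, delta):
        rotated = [sum(p[k] * R[k][j] for k in range(3)) for j in range(3)]
        out.append(tuple(scale * (rotated[j] + d[j]) + t[j] for j in range(3)))
    return out

# Hypothetical canonical keypoints of the source image (x_c,s): all 3D, no 2D landmarks.
x_c_s = [(0.0, 0.0, 0.1), (0.3, 0.2, 0.0), (-0.3, 0.2, 0.0)]
zero_delta = [(0.0, 0.0, 0.0)] * len(x_c_s)

# Source keypoints x_s: canonical keypoints under the source pose, expression, scale.
x_s = transform_keypoints(x_c_s, yaw_matrix(0), zero_delta, 1.0, (0.0, 0.0, 0.0))

# Driving keypoints x_d: the same canonical keypoints under the driving motion.
delta_d = [(0.0, -0.02, 0.0)] * len(x_c_s)  # made-up expression offset
x_d = transform_keypoints(x_c_s, yaw_matrix(30), delta_d, 1.1, (0.05, 0.0, 0.0))

print(x_s)
print(x_d)
```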
Thanks for your reply. I'm sorry, I made a typo. My question is about formula (3), not formula (2). @zzzweakman
Since implicit keypoints are used in this paper, what I'm wondering is whether explicit keypoints, for example 106 2D landmarks, can be used to drive the source image?