Open HoiM opened 2 years ago
Thanks for your great question! In the usual setting, each blend shape corresponds to some semantic meaning (smile, sad, etc.). Here we extend the idea so that the residual deformation is expressed as a linear combination of blend shapes, where each individual blend shape does not carry any semantic interpretation. I would say the "semantic" definition and how the shapes are learned are coupled: the blend shapes themselves are an optimization target, learned so that their linear combination best approximates the ground-truth deformation. For more details you might need to refer to the paper.
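To make the idea a bit more concrete, here is a minimal sketch of how such learned, non-semantic blend shapes could be set up and trained. All names and shapes (`ResidualDeformationBranch`, `num_blendshapes`, `pose_dim`, etc.) are illustrative assumptions, not our actual code; in particular, the basis here is a plain per-character learnable parameter, whereas it could just as well be predicted by a network from the character mesh.

```python
# Minimal sketch (PyTorch) of learned, non-semantic blend shapes.
# Illustrative only -- not the implementation from the paper.
import torch
import torch.nn as nn

class ResidualDeformationBranch(nn.Module):
    def __init__(self, num_vertices, num_blendshapes, pose_dim):
        super().__init__()
        # Blend shape basis: K shapes, each a per-vertex 3D offset.
        # These are free parameters (an optimization target), so no single
        # shape is tied to a semantic meaning like "smile" or "sad".
        self.blendshapes = nn.Parameter(
            torch.zeros(num_blendshapes, num_vertices, 3))
        # A small network maps the pose to blending coefficients.
        self.coeff_net = nn.Sequential(
            nn.Linear(pose_dim, 128), nn.ReLU(),
            nn.Linear(128, num_blendshapes))

    def forward(self, pose):
        # pose: (B, pose_dim) -> coefficients: (B, K)
        coeffs = self.coeff_net(pose)
        # Residual deformation = linear combination of the blend shapes:
        # (B, K) x (K, V, 3) -> (B, V, 3)
        return torch.einsum('bk,kvc->bvc', coeffs, self.blendshapes)

# Training regresses the combined result toward the ground-truth residual
# deformation, so the basis and the coefficients are learned jointly.
branch = ResidualDeformationBranch(num_vertices=6890, num_blendshapes=9, pose_dim=72)
optimizer = torch.optim.Adam(branch.parameters(), lr=1e-4)
pose = torch.randn(4, 72)              # dummy pose batch
gt_residual = torch.randn(4, 6890, 3)  # dummy ground-truth per-vertex offsets
loss = ((branch(pose) - gt_residual) ** 2).mean()
loss.backward()
optimizer.step()
```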
Hope it helps!
Thank you for your great work. After reading the paper, I still have some questions, and I hope you or anyone else can answer them if possible.
In the paper, the Residual Deformation Branch learns to predict blendshapes for each individual character. I'm wondering how these blendshapes are defined.
I'm not familiar with body blendshapes, but as far as I know, blendshapes for facial expressions (jawOpen, eyeBlinkLeft, smileRight, etc.) are semantically defined. In the paper [1], personalized facial blendshapes are learned via a blendshape gradient loss function, which forces each generated blendshape to have a specific semantic meaning.
Another way to use blendshapes for facial expressions is what was done in MetaHuman [2] (Unreal Engine), where expressions are driven by bones, and blendshapes (called morph targets in Unreal Engine) are used to refine the face and add more detail. I think this is more similar to your work.
So I would like to know some details on your blendshapes: how they are defined, how they are learned, etc.
I would really appreciate it if you could answer my questions.
References:
[1] Personalized Face Modeling for Improved Face Reconstruction and Motion Retargeting
[2] MetaHuman (Unreal Engine)