ai4r / Gesture-Generation-from-Trimodal-Context

Speech Gesture Generation from the Trimodal Context of Text, Audio, and Speaker Identity (SIGGRAPH Asia 2020)

How to convert output? #48

Closed · zhewei-mt closed this 1 year ago

zhewei-mt commented 1 year ago

Hello, thanks for your great work! I have a few questions.

  1. I notice that the physical meaning of the output is a set of direction vectors with the spine (or pelvis) as the origin, which is not directly compatible with existing rendering engines such as Blender and UE5. Is there a way to convert the output to axis-angle or rotation-matrix form that UE5 can use?
  2. I know the names of the ten joints, but what is the exact index-to-joint mapping? For example, is index 0 the spine and index 9 the right wrist?
  3. Hand movement (e.g., thumb, pinky, and so on) is currently not supported. Do you have any plans to support it, or do you know of other papers that support driving the hands? Thanks in advance!
youngwoo-yoon commented 1 year ago

Hello,

  1. That is true, and there is no easy way to convert directional vectors to rendering-engine-compatible values. You can try inverse kinematics or compute joint rotations manually.
  2. Please see https://github.com/ai4r/Gesture-Generation-from-Trimodal-Context/issues/46
  3. The Talking with Hands and BEAT papers handle finger motion. In addition, the GENEA Challenge 2022 used the Talking with Hands dataset; you can find the resources and system papers here: https://youngwoo-yoon.github.io/GENEAchallenge2022/
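Computing joint rotations manually, as suggested in item 1, boils down to finding, for each bone, the rotation that takes its rest-pose (bind-pose) direction to the direction vector the model generates. Below is a minimal sketch in plain Python; the rest-pose direction is an assumption about your particular skeleton, not part of the model output.

```python
import math

def rodrigues(axis, angle):
    """Rotation matrix rotating by `angle` radians around unit `axis` (Rodrigues' formula)."""
    x, y, z = axis
    c, s, t = math.cos(angle), math.sin(angle), 1.0 - math.cos(angle)
    return [
        [t*x*x + c,   t*x*y - s*z, t*x*z + s*y],
        [t*x*y + s*z, t*y*y + c,   t*y*z - s*x],
        [t*x*z - s*y, t*y*z + s*x, t*z*z + c],
    ]

def rotation_between(rest_dir, target_dir):
    """Rotation matrix aligning a rest-pose bone direction with a generated direction vector.

    Both inputs are 3-vectors and are normalized here. `rest_dir` comes from your
    skeleton's bind pose (an assumption), `target_dir` from the model output.
    """
    def norm(v):
        length = math.sqrt(sum(a * a for a in v))
        return [a / length for a in v]

    a, b = norm(rest_dir), norm(target_dir)
    cross = [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]
    dot = sum(x * y for x, y in zip(a, b))
    s = math.sqrt(sum(c * c for c in cross))
    if s < 1e-8:
        if dot > 0:                      # already aligned: identity rotation
            return rodrigues([1.0, 0.0, 0.0], 0.0)
        # anti-parallel: rotate pi around any axis perpendicular to `a`
        helper = [1.0, 0.0, 0.0] if abs(a[0]) < 0.9 else [0.0, 1.0, 0.0]
        perp = [a[1]*helper[2] - a[2]*helper[1],
                a[2]*helper[0] - a[0]*helper[2],
                a[0]*helper[1] - a[1]*helper[0]]
        return rodrigues(norm(perp), math.pi)
    return rodrigues([c / s for c in cross], math.atan2(s, dot))
```

For a chained skeleton you would apply this per bone down the hierarchy, expressing each child's direction in its parent's local frame before solving, since engines like UE5 expect parent-relative joint rotations.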
zhewei-mt commented 1 year ago

Thanks for your reply. For question 1, do you have any hints on how to achieve that?

youngwoo-yoon commented 1 year ago

Not exactly the same, but I did a similar thing in Babylon.js: I calculated joint rotations from directional vectors. Please see: https://github.com/ai4r/SGToolkit/blob/0c11a54ee64a7b29f4cba79a9d9ff3ae48e3af4e/static/js/avatar.js#L175
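The linked avatar.js code relies on Babylon.js utilities, but the underlying idea, a shortest-arc rotation between two direction vectors, can be expressed as a quaternion in a few lines. A sketch in plain Python follows; the (x, y, z, w) component order and the helper names are my assumptions, not taken from the repository.

```python
import math

def quat_from_directions(a, b):
    """Shortest-arc unit quaternion (x, y, z, w) rotating direction `a` onto `b`.

    Assumed convention: scalar part last, as in many engines; adapt to your target.
    """
    def norm(v):
        length = math.sqrt(sum(c * c for c in v))
        return [c / length for c in v]

    a, b = norm(a), norm(b)
    x = a[1]*b[2] - a[2]*b[1]
    y = a[2]*b[0] - a[0]*b[2]
    z = a[0]*b[1] - a[1]*b[0]
    w = 1.0 + sum(p * q for p, q in zip(a, b))
    if w < 1e-8:  # anti-parallel: 180-degree turn about any axis perpendicular to `a`
        x, y, z, w = (0.0, -a[2], a[1], 0.0) if abs(a[0]) < 0.9 else (-a[1], a[0], 0.0, 0.0)
    return norm([x, y, z, w])

def rotate(q, v):
    """Rotate 3-vector `v` by unit quaternion `q` = (x, y, z, w)."""
    x, y, z, w = q
    # q * v * q^-1 expanded: t = 2 * cross(q.xyz, v); v' = v + w*t + cross(q.xyz, t)
    t = [2*(y*v[2] - z*v[1]), 2*(z*v[0] - x*v[2]), 2*(x*v[1] - y*v[0])]
    return [v[0] + w*t[0] + (y*t[2] - z*t[1]),
            v[1] + w*t[1] + (z*t[0] - x*t[2]),
            v[2] + w*t[2] + (x*t[1] - y*t[0])]
```

A quaternion produced this way can then be converted to whatever form the engine expects (axis-angle, rotation matrix, or Euler angles), which is essentially what the Babylon.js version does with its built-in conversion helpers.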