youngwoo-yoon / Co-Speech_Gesture_Generation

This is an implementation of Robots learn social skills: End-to-end learning of co-speech gesture generation for humanoid robots.
https://sites.google.com/view/youngwoo-yoon/projects/co-speech-gesture-generation

dim of output #18

Open zhewei-mt opened 1 year ago

zhewei-mt commented 1 year ago

Hello, I notice that the output of the model is 216, which is 18x12 and can be converted to 18x6 by going from rotation matrices to Euler angles, where 18 is the number of joints. As far as I know, the Euler angle for each joint is a 1x3 vector, so I am confused about the meaning of your output. Could you provide more information about it, e.g. the origin of each joint, and whether the rotation is relative to the parent joint or absolute? Thanks in advance!

youngwoo-yoon commented 1 year ago

Hello, I suspect you're using 24 joints (including both upper and lower body parts). A rotation matrix has 9 values, so 24 x 9 is 216.

zhewei-mt commented 1 year ago

I double-checked the code in "inference.py". Below is line 152:

```python
out_poses = out_poses.reshape((out_poses.shape[0], -1, 12))  # (n_frames, n_joints, 12)
```

It reshapes the output to (n_frames, n_joints, 12), which gives n_joints = 18. I don't understand why it's of shape 18x12.

youngwoo-yoon commented 1 year ago

Thanks for pointing that out. I'll take a look.

youngwoo-yoon commented 1 year ago

Sorry for the confusion. There are 3 values for the x, y, z position and 9 for a rotation matrix, so 12 in total. Please refer to this: https://github.com/youngwoo-yoon/Co-Speech_Gesture_Generation/blob/46671fc6af8a3dbcb93cdafca4f07bc064a2c3b1/scripts/twh_dataset_to_lmdb.py#L46
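The 12-value layout above can be unpacked like this. This is a sketch, not the repo's own code: it assumes the `(n_frames, n_joints, 12)` output described in the thread, uses SciPy for the matrix-to-Euler conversion, and the `'XYZ'` rotation order is an illustrative choice, not necessarily what the BVH skeleton uses.

```python
import numpy as np
from scipy.spatial.transform import Rotation

n_frames, n_joints = 30, 18
out_poses = np.random.randn(n_frames, n_joints, 12)  # stand-in for model output

# first 3 values: joint position; remaining 9: row-major 3x3 rotation matrix
positions = out_poses[..., :3]                                    # (n_frames, n_joints, 3)
rot_mats = out_poses[..., 3:].reshape(n_frames, n_joints, 3, 3)   # (n_frames, n_joints, 3, 3)

# a raw network output is not guaranteed to be a valid rotation matrix,
# so project each matrix onto SO(3) via SVD (an assumption, not from the repo)
u, _, vt = np.linalg.svd(rot_mats)
det = np.linalg.det(u @ vt)
u[..., :, -1] *= np.sign(det)[..., None]  # fix reflections so det == +1
rot_mats = u @ vt

# convert to per-joint Euler angles (order is an arbitrary choice here)
eulers = Rotation.from_matrix(rot_mats.reshape(-1, 3, 3)).as_euler('XYZ', degrees=True)
eulers = eulers.reshape(n_frames, n_joints, 3)                    # (n_frames, n_joints, 3)
```

This recovers the 18x3 Euler representation the question above expected, with the extra 3 values per joint being the position.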

zhewei-mt commented 1 year ago

Thanks for clarifying. Do all joints share the same origin? Is there any way to convert the rotation information to be compatible with SMPL?

youngwoo-yoon commented 1 year ago

> Do all joints share the same origin?

The answer is yes if you're asking about the positions, because all joints are expressed in the same coordinate system. The rotations are local rotations, i.e. each joint's rotation is relative to its parent joint.
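Because the rotations are parent-relative, a global joint rotation is obtained by walking down the kinematic chain and composing the local rotations. A minimal sketch, assuming a hypothetical parent-index table for a toy 4-joint chain (the real 18-joint hierarchy would come from the BVH skeleton):

```python
import numpy as np

# hypothetical parent table: joint 0 is the root (parent -1), each next joint
# hangs off the previous one; NOT the actual skeleton of this repo
parents = [-1, 0, 1, 2]

def to_global(local_rots, parents):
    """Compose local (parent-relative) rotation matrices into global ones."""
    global_rots = np.empty_like(local_rots)
    for j, p in enumerate(parents):
        if p < 0:
            global_rots[j] = local_rots[j]          # root: local == global
        else:
            global_rots[j] = global_rots[p] @ local_rots[j]
    return global_rots

local = np.tile(np.eye(3), (len(parents), 1, 1))
print(to_global(local, parents)[3])  # identity chain stays identity
```

The same accumulation (plus offsets) is what a forward-kinematics pass in a BVH player does to turn these local rotations into world-space joint positions.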

> Is there any way to convert the rotation information to be compatible with SMPL?

No, as far as I know. It is not a simple procedure because the dataset only has poses, not body shape.

zhewei-mt commented 1 year ago

Thanks for your reply. My bad, I am not asking for the shape parameters of SMPL. My goal is to drive an avatar in UE5, so I expect I need some preprocessing so that UE is able to "recognize" your output. Do you have any expertise in this field?

youngwoo-yoon commented 1 year ago

I usually used Blender for the animation after putting the output values into a BVH file. This might help you: https://github.com/TeoNikolov/genea_visualizer Unfortunately I do not have experience with UE5.

zhewei-mt commented 1 year ago

Thanks! Two more questions.

  1. For the 18-joint setting, is there any picture visualizing all the joint positions? I don't understand the difference between 'b_l_wrist_twist' and 'b_l_wrist', and a visualization would help a lot, if one exists.
  2. What is the initial body pose, e.g. a-pose or t-pose?

youngwoo-yoon commented 1 year ago

  1. When you import a BVH file in Blender, you can visually check joints and their names.
  2. T-pose. Some technical details on the retargeting to T-pose are in https://arxiv.org/pdf/2303.08737.pdf

zhewei-mt commented 1 year ago

For the coordinate system of each joint, if I understand correctly, when facing out of the screen the directions would be: -y up and z forward (image attached). Please correct me if I am wrong. Thanks!