ChenFengYe / motion-latent-diffusion

[CVPR 2023] Executing your Commands via Motion Diffusion in Latent Space, a fast and high-quality motion diffusion model
https://chenxin.tech/mld/
MIT License
540 stars 48 forks

Exporting motions to bvh #15

Open aplatyps opened 1 year ago

aplatyps commented 1 year ago

Hi Chen Xin, love your work. I'm currently learning about 3D motions (I'm very new to this, so I might ask silly questions) and came across your paper and GitHub repo after some research. I generated some motions with your model pre-trained on the HumanML3D dataset. There are 22 joints. If I understood your paper correctly, the generated motion is in MMM format, and you have a list of joint names ordered to match the joint indices of the generated motion. I'm currently trying to write a script to export the motions to a .bvh file so they're easier to use. Do you know of any existing methods or tools I can use, and do I need to apply any transformation to the motion? Many thanks :)

ChenFengYe commented 1 year ago

Welcome to motion generation! The generated motion is not in MMM format (21 joints); it is in AMASS format (22 joints). Please refer to the joint map here: https://github.com/ChenFengYe/motion-latent-diffusion/blob/7db8623e61e2b6818e724a0a9a1fc999f5054327/mld/transforms/joints2rots/config.py#L71
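For readers writing their own exporter, the 22-joint layout can be sketched as a name list plus a parent table. This is a minimal sketch that assumes the generated joints follow the standard SMPL body-joint ordering (the first 22 SMPL joints, without hands); the exact names in the linked config.py may differ, and the helper chain_to_root is hypothetical.

```python
# Assumption: the 22 AMASS-style joints follow the standard SMPL body ordering.
# The names below are the common SMPL names, not copied from the repo's config.py.
SMPL_22_JOINTS = [
    "pelvis", "left_hip", "right_hip", "spine1", "left_knee", "right_knee",
    "spine2", "left_ankle", "right_ankle", "spine3", "left_foot", "right_foot",
    "neck", "left_collar", "right_collar", "head", "left_shoulder",
    "right_shoulder", "left_elbow", "right_elbow", "left_wrist", "right_wrist",
]

# Parent index of each joint in the SMPL kinematic tree (-1 marks the root).
SMPL_22_PARENTS = [
    -1, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 9, 12, 13, 14, 16, 17, 18, 19,
]


def chain_to_root(joint_index, parents=SMPL_22_PARENTS):
    """Return the joint indices from `joint_index` up to the root (hypothetical helper)."""
    chain = [joint_index]
    while parents[chain[-1]] != -1:
        chain.append(parents[chain[-1]])
    return chain


if __name__ == "__main__":
    wrist = SMPL_22_JOINTS.index("left_wrist")
    print([SMPL_22_JOINTS[i] for i in chain_to_root(wrist)])
```

A BVH HIERARCHY section needs exactly this parent/child structure, so a table like this is a reasonable starting point before computing per-joint offsets.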

If you want to export a .bvh file, we suggest first fitting SMPL motion parameters (root position + joint rotations) to the generated joints; see the corresponding section of the README on the GitHub page. Then export the SMPL motion parameters to a .bvh file. There are many scripts for this, e.g. https://github.com/KosukeFukazawa/smpl2bvh, and you can find more related tools as well.
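One step any SMPL-to-BVH exporter has to perform is converting SMPL's axis-angle joint rotations into the Euler angles (in degrees) that BVH channels store. The sketch below illustrates that step with the standard library only. It assumes a "Zrotation Yrotation Xrotation" channel order, i.e. R = Rz * Ry * Rx, which is an assumption and not necessarily what smpl2bvh uses; the function names are hypothetical, and gimbal lock is not handled.

```python
import math


def rotvec_to_matrix(rotvec):
    """Rodrigues' formula: axis-angle vector (radians) -> 3x3 rotation matrix."""
    x, y, z = rotvec
    theta = math.sqrt(x * x + y * y + z * z)
    if theta < 1e-12:
        # Near-zero rotation: return the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = x / theta, y / theta, z / theta
    s, c = math.sin(theta), math.cos(theta)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]


def matrix_to_euler_zyx_deg(rot):
    """Decompose rot = Rz(a) * Ry(b) * Rx(c) and return (a, b, c) in degrees."""
    b = math.asin(max(-1.0, min(1.0, -rot[2][0])))
    a = math.atan2(rot[1][0], rot[0][0])
    c = math.atan2(rot[2][1], rot[2][2])
    return tuple(math.degrees(v) for v in (a, b, c))


def rotvec_to_bvh_channels(rotvec):
    """Axis-angle -> (Zrotation, Yrotation, Xrotation) in degrees (assumed order)."""
    return matrix_to_euler_zyx_deg(rotvec_to_matrix(rotvec))


if __name__ == "__main__":
    # A quarter turn about the Z axis.
    print(rotvec_to_bvh_channels((0.0, 0.0, math.pi / 2)))
```

The channel order must match the CHANNELS line written in the BVH hierarchy; if a tool expects a different order, the decomposition has to change accordingly.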

aplatyps commented 1 year ago

Thanks for the pointers, they were very helpful! I see that you have 3 params in the exported SMPL .pkl: https://github.com/ChenFengYe/motion-latent-diffusion/blob/d6c5ca74af2cf6be4523dc7130d1675b684f595c/fit.py#L272-L274

poses are the rotation vectors, and I assume cam is the global translation. Where can I find the SMPL scaling?

ChenFengYe commented 1 year ago

I am not sure what is in "cam". Maybe zeros? "pose" should have 72 values (3 for the root's global orientation + 3*23 for the local rotations of the 23 joints). There is no scaling here; maybe "cam" is a kind of scale?
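To make the 72-value layout concrete, here is a minimal sketch of splitting one frame's pose vector into per-joint axis-angle triplets. It assumes the standard SMPL convention (the first triplet is the root's global orientation, followed by 23 body-joint rotations, with translation stored separately); the function name split_smpl_pose is hypothetical, and what "cam" actually contains would still need to be checked against fit.py.

```python
def split_smpl_pose(pose):
    """Split a flat 72-value SMPL pose into (root_orient, body_pose).

    Assumes the standard SMPL layout: 24 axis-angle triplets, root first.
    """
    if len(pose) != 72:
        raise ValueError(f"expected 72 pose values, got {len(pose)}")
    triplets = [tuple(pose[i:i + 3]) for i in range(0, 72, 3)]
    root_orient = triplets[0]   # global orientation of the root joint
    body_pose = triplets[1:]    # 23 local joint rotations
    return root_orient, body_pose


if __name__ == "__main__":
    dummy_pose = [0.0] * 72     # placeholder values, not real fitted data
    root, body = split_smpl_pose(dummy_pose)
    print(len(root), len(body))
```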