Walter0807 / MotionBERT

[ICCV 2023] PyTorch Implementation of "MotionBERT: A Unified Perspective on Learning Human Motion Representations"
Apache License 2.0

about the root trans in 3d pose and mesh #25

Closed lucasjinreal closed 1 year ago

lucasjinreal commented 1 year ago

Hello, this is really impressive work, clean and fantastic. However, I have a few questions I'd like to ask for help with:

  1. From the demo videos, the translations in the 3D keypoint task look relatively stable, but the mesh looks bumpy in the vertical direction (height). Is there a way to combine the strengths of both, e.g. a multi-modality-style model, to make the mesh translation more accurate?
  2. The 3D pose results look really good, but in most situations we need the rotations of the body (like SMPL). Is there a way to get SMPL-like rotations directly from the 3D pose?
Walter0807 commented 1 year ago

Hi, thanks for your interest in our work.

  1. We provide a tool to use the root trajectory from the 3D pose estimation model for the global translation of the mesh results (https://github.com/Walter0807/MotionBERT/blob/main/docs/inference.md#mesh), which might be of interest to you (see the sketch after this list). Better techniques can also be explored in the future.
  2. There are some tools (e.g. Minimal-IK) that support fitting 3D poses to SMPL models. More advanced techniques like HybrIK could also be explored.
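For reference, here is a minimal sketch of the root-trajectory idea from item 1, assuming the 3D pose output is an array of H36M-ordered keypoints with the pelvis at index 0 and the mesh vertices are root-centered; the file and variable names are placeholders, not the repo's actual tool:

```python
import numpy as np

# Minimal sketch (not the repo's actual script): attach the root trajectory
# recovered by the 3D pose branch to root-relative mesh vertices.
pose3d = np.load("X3D.npy")        # (T, 17, 3) 3D keypoints, pelvis assumed at index 0
verts = np.load("verts.npy")       # (T, 6890, 3) SMPL vertices, assumed root-centered

root_traj = pose3d[:, 0:1, :]      # (T, 1, 3) per-frame pelvis trajectory
verts_global = verts + root_traj   # broadcast the translation onto every vertex

np.save("verts_global.npy", verts_global)
```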

Hope this answers your questions!

lucasjinreal commented 1 year ago

@Walter0807 Thank you so much for the info. I found an interesting result: even with a window size of only 24 frames per clip, MB can still produce a proper result, which is really impressive (I haven't visualized it in real 3D to check the bias in novel views yet). In that case, MB can actually be used as a side translation predictor alongside another mesh prediction method, just like in your demo but in REALTIME.
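To make the "side translation predictor" idea concrete, here is a minimal streaming-inference sketch with a 24-frame clip, assuming `model` is a callable that maps a batch of 2D keypoint clips (1, T, 17, 3) to 3D keypoints (1, T, 17, 3); the function and variable names are illustrative, not MotionBERT's API:

```python
import numpy as np

CLIP_LEN = 24                     # clip length discussed above
buffer = []                       # rolling buffer of per-frame 2D keypoints (17, 3)

def on_new_frame(kpts_2d, model):
    """Push one frame of 2D keypoints; return the latest root position, or None while warming up."""
    buffer.append(kpts_2d)
    if len(buffer) < CLIP_LEN:
        return None
    clip = np.stack(buffer[-CLIP_LEN:])[None]   # (1, 24, 17, 3) most recent clip
    pred3d = model(clip)                        # (1, 24, 17, 3) predicted 3D keypoints
    return pred3d[0, -1, 0]                     # pelvis (root) of the newest frame
```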

But this makes me think further: since the inputs and the model are the same, why can't they be learned jointly to directly predict SMPL beta && theta && 3D translation with a coupled head? Have you thought about this before?

BTW, going from 3D joints to SMPL rotations in Minimal-IK is optimization-based; do you know if there is a learned way to do so?

Walter0807 commented 1 year ago

Thanks for your comments.

But this makes me think further: since the inputs and the model are the same, why can't they be learned jointly to directly predict SMPL beta && theta && 3D translation with a coupled head? Have you thought about this before?

Sounds like a good idea to regress the global translation in pixel space when regressing SMPL parameters as well. We haven't tried this as previous methods usually don't do so.
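As a rough illustration of what such a coupled head could look like (a sketch only; the feature dimension, 6D rotation output, and layer names are assumptions, not part of MotionBERT):

```python
import torch.nn as nn

class CoupledHead(nn.Module):
    """Shared motion features feed two heads: SMPL parameters and global root translation."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.smpl_head = nn.Linear(feat_dim, 24 * 6 + 10)   # 6D rotations for 24 joints + 10 betas
        self.trans_head = nn.Linear(feat_dim, 3)             # per-frame root translation

    def forward(self, feat):                  # feat: (B, T, feat_dim) motion features
        smpl_params = self.smpl_head(feat)    # (B, T, 154) theta + beta
        root_trans = self.trans_head(feat)    # (B, T, 3) global translation
        return smpl_params, root_trans
```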

BTW, going from 3D joints to SMPL rotations in Minimal-IK is optimization-based; do you know if there is a learned way to do so?

Maybe you can check Pose2Mesh.
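If you want something fully learned rather than optimization-based, one common pattern (a sketch under assumptions; not Pose2Mesh's or MotionBERT's actual architecture) is a small regressor from flattened 3D joints to per-joint rotations, trained with supervision from fitted SMPL parameters:

```python
import torch.nn as nn

class JointsToRotations(nn.Module):
    """Regress 6D rotations for the 24 SMPL joints from 17 input 3D joints."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(17 * 3, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 24 * 6),            # 6D rotation per SMPL joint
        )

    def forward(self, joints3d):               # joints3d: (B, 17, 3)
        return self.net(joints3d.flatten(1)).view(-1, 24, 6)
```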