OpenMotionLab / MotionGPT

[NeurIPS 2023] MotionGPT: Human Motion as a Foreign Language, a unified motion-language generation model using LLMs
https://motion-gpt.github.io
MIT License

Joints of rendered mesh are different from the input joints #57

Open Steve-Tod opened 10 months ago

Steve-Tod commented 10 months ago

Hi there,

Thanks for making this great work public. I'm trying to generate human-object interaction (GRAB) motion based on your code. When I visualize the human and object meshes together with your renderer, the object is misaligned: it ends up high in the air. After some further debugging, I found that your rendering pipeline is: joint positions -> [IK] -> poses -> [SMPL] -> vertices. I logged the joint positions output by the SMPL model and found that they differ substantially from the input joints. Could you take a look and suggest how to debug this? Thank you in advance!
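One way to see why a joint positions -> [IK] -> poses -> [SMPL] round trip need not reproduce the input joints is a toy two-joint chain. The sketch below is only an illustration under stated assumptions: `toy_ik` and `toy_fk` are hypothetical helpers, not HybrIK or SMPL. The point it makes is that rotations alone carry neither the global translation nor the bone lengths, so both have to be supplied consistently when the skeleton is rebuilt.

```python
import numpy as np

def toy_ik(joints):
    """Hypothetical IK: unit bone directions from [T, 2, 3] joints (parent, child)."""
    bone = joints[:, 1] - joints[:, 0]
    return bone / np.linalg.norm(bone, axis=-1, keepdims=True)

def toy_fk(directions, bone_length, root):
    """Hypothetical FK: rebuild [T, 2, 3] joints from directions, a bone length, and a root."""
    root = np.broadcast_to(root, directions.shape)  # a [1, 3] root is reused for every frame
    child = root + bone_length * directions
    return np.stack([root, child], axis=1)

# Synthetic input: the parent drifts along x while a 0.5-unit bone swings in the x-y plane.
T = 5
t = np.linspace(0.0, 1.0, T)
parent = np.stack([t, np.ones(T), np.zeros(T)], axis=-1)
child = parent + 0.5 * np.stack([np.cos(t), np.sin(t), np.zeros(T)], axis=-1)
joints_in = np.stack([parent, child], axis=1)

dirs = toy_ik(joints_in)

# Case 1: a different template bone length and only the first-frame root.
joints_fixed_root = toy_fk(dirs, bone_length=0.4, root=joints_in[:1, 0])

# Case 2: the per-frame root and the true bone length.
joints_full = toy_fk(dirs, bone_length=0.5, root=joints_in[:, 0])

print("max error, fixed root:", np.abs(joints_fixed_root - joints_in).max())
print("max error, full info: ", np.abs(joints_full - joints_in).max())
```

In the first case the root stays put for the whole clip and the child joint is offset; in the second the input is recovered. Whether the same effect explains a given renderer's output depends on how that renderer handles root translation and body shape.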

Here's the code I used for debugging:

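import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.transform import Rotation as RRR

# HybrIKJointsToRotmat and SMPLRender are the classes from the MotionGPT code.
# joints, SMPL_MODEL_DIR, obj_mesh, and obj_transform are defined elsewhere.

# joints is of shape [165, 22, 3]
data = joints
if len(data.shape) == 4:
    data = data[0]
# data = data - data[0, 0]

# IK: joint positions -> per-joint rotation matrices
pose_generator = HybrIKJointsToRotmat()
pose = pose_generator(data)
# Pad each frame with two identity rotations
pose = np.concatenate(
    [pose, np.stack([np.stack([np.eye(3)] * pose.shape[0], 0)] * 2, 1)], 1
)
shape = [768, 768]
render = SMPLRender(SMPL_MODEL_DIR)

# Pre-multiply the root rotation by a rotation of pi about the x-axis
r = RRR.from_rotvec(np.array([np.pi, 0.0, 0.0]))
pose[:, 0] = np.matmul(r.as_matrix().reshape(1, 3, 3), pose[:, 0])
vid = []
# Root translation is taken from the first frame only (shape [1, 3]), with y negated
aroot = data[[0], 0]
aroot[:, 1] = -aroot[:, 1]
params = dict(pred_shape=np.zeros([1, 10]), pred_root=aroot, pred_pose=pose)
render.init_renderer([shape[0], shape[1], 3], params, obj_mesh, obj_transform)
# I set the smpl_output as a member of render
render_joints = render.smpl_output.joints.detach().cpu().numpy()[:, :22]

# Left column: input joints; right column: joints from the SMPL output
fig, ax = plt.subplots(2, 2, figsize=(10, 5))
# pelvis
ax[0, 0].plot(joints[:, 0])
ax[0, 1].plot(render_joints[:, 0])
# right wrist
ax[1, 0].plot(joints[:, 21])
ax[1, 1].plot(render_joints[:, 21])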

The result is shown below. The pelvis stays fixed throughout the motion, and the wrist joint appears to be translated and flipped.

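[Image: output. A 2x2 plot of input joints (left column) versus rendered SMPL joints (right column) for the pelvis (top row) and right wrist (bottom row)]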
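To put numbers on observations like "the pelvis is fixed" and "the wrist looks reversed", the two trajectories can be compared directly instead of by eye. The following is a minimal sketch, assuming both arrays have shape [T, J, 3] with joint 0 as the pelvis; `compare_joints` is a hypothetical helper and the data here is synthetic, not output from the renderer. It reports how far each root moves per axis, then removes the root and searches over axis sign flips for the best match.

```python
from itertools import product

import numpy as np

def compare_joints(joints_in, joints_out, root_index=0):
    """Compare two [T, J, 3] joint trajectories (hypothetical diagnostic helper)."""
    root_in = joints_in[:, root_index]
    root_out = joints_out[:, root_index]

    # Per-axis range of motion of each root over the clip.
    root_range_in = root_in.max(axis=0) - root_in.min(axis=0)
    root_range_out = root_out.max(axis=0) - root_out.min(axis=0)

    # Root-relative poses remove any global translation.
    rel_in = joints_in - root_in[:, None]
    rel_out = joints_out - root_out[:, None]

    # Try all 8 combinations of axis sign flips and keep the best fit.
    best_signs, best_err = None, np.inf
    for signs in product((1.0, -1.0), repeat=3):
        flipped = rel_in * np.array(signs)
        err = np.linalg.norm(flipped - rel_out, axis=-1).mean()
        if err < best_err:
            best_signs, best_err = np.array(signs), err

    return {
        "root_range_in": root_range_in,
        "root_range_out": root_range_out,
        "best_signs": best_signs,
        "best_err": best_err,
    }

# Synthetic stand-in: the "rendered" joints have a fixed root and flipped y and z axes.
rng = np.random.default_rng(0)
joints_in = rng.normal(size=(165, 22, 3))
rel = joints_in - joints_in[:, :1]
joints_out = rel * np.array([1.0, -1.0, -1.0]) + joints_in[0, 0]

report = compare_joints(joints_in, joints_out)
for key, value in report.items():
    print(key, value)
```

A near-zero root range on the rendered side points at the root translation, while a non-trivial best sign pattern with a small residual points at an axis convention mismatch. A large residual for every sign pattern would suggest something else, such as differing bone lengths or a rotation that is not a pure axis flip.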