facebookresearch / InterHand2.6M

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020

How to transfer MANO parameters from world coordinate system to camera coordinate system #43

Closed YuyanHuang closed 3 years ago

YuyanHuang commented 3 years ago

I need the MANO parameters in the camera coordinate system, so I tried to convert them using the camera extrinsics. The conversion fails: the fitting error reaches tens of millimeters. Part of the code is as follows:

        mano_pose = np.array(mano_param['pose']).reshape(-1,3)
        # mano_pose = np.dot(R, mano_pose.transpose(1,0)).transpose(1,0) + t.reshape(1,3)/1000  # (16,3)
        mano_pose = torch.FloatTensor(mano_pose)
        root_pose = mano_pose[0].view(1, 3)
        root_pose = np.dot(R, root_pose.transpose(1, 0)).transpose(1, 0) + t.reshape(1, 3) / 1000
        root_pose = torch.FloatTensor(root_pose)
        hand_pose = mano_pose[1:, :].contiguous().view(1, -1)
        shape = torch.FloatTensor(mano_param['shape']).view(1, -1)
        trans = np.array(mano_param['trans']).reshape(-1,3)
        trans = np.dot(R, trans.transpose(1,0)).transpose(1,0) + t.reshape(1,3)/1000
        trans = torch.FloatTensor(trans).view(1, -1)
        output = mano_layer[hand_type](global_orient=root_pose, hand_pose=hand_pose, betas=shape, transl=trans)
        mesh = output.vertices[0].numpy() * 1000
        fit_err = get_fitting_error(mesh, ih26m_joint_regressor, cam_params, joints, hand_type, capture_idx,frame_idx, cam_idx)
        print('Fitting error: ' + str(fit_err) + ' mm')

mks0601 commented 3 years ago

How are R and t defined? Be aware that the root joint position is not (0,0,0), so you should rotate the root joint position as well.

YuyanHuang commented 3 years ago

@mks0601 R and t are defined as follows:

        t, R = np.array(cam_param['campos'][str(cam_idx)], dtype=np.float32).reshape(3), np.array(cam_param['camrot'][str(cam_idx)], dtype=np.float32).reshape(3,3)
        t = -np.dot(R,t.reshape(3,1)).reshape(3)

I haven't seen how the root position is set in render.py.

mks0601 commented 3 years ago

You are applying the extrinsics to root_pose (rotation) and trans (translation). However, there is another translation vector: the root joint position. If you set trans to a zero vector, you will notice that the root joint position (output.joints[0]) is not zero. You should rotate the root joint position as well, like here.
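
The advice above can be illustrated with a minimal NumPy sketch. It is not code from the repository. It assumes the hand model rotates the mesh about the root joint before adding the translation, so that v = Rg (v_rest - J) + J + trans, where J is the root joint position obtained with zero translation (output.joints[0] in the thread). Under that assumption the camera-space parameters would be Rg_cam = R Rg and trans_cam = R (J + trans) + t - J. The names toy_model and world_to_camera_params are hypothetical, toy_model is only a stand-in for the MANO layer, and t is taken in meters.

```python
import numpy as np

def axis_angle_to_matrix(rotvec):
    """Rodrigues formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
    rotvec = np.asarray(rotvec, dtype=np.float64)
    angle = np.linalg.norm(rotvec)
    if angle < 1e-8:
        return np.eye(3)
    k = rotvec / angle
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def matrix_to_axis_angle(R):
    """Rotation matrix (3, 3) -> axis-angle vector (3,). Not robust near angle = pi."""
    cos_angle = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    angle = np.arccos(cos_angle)
    if angle < 1e-8:
        return np.zeros(3)
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return angle * axis / (2.0 * np.sin(angle))

def toy_model(verts_rest, root_joint, root_pose, trans):
    """Hypothetical stand-in for the MANO layer (assumption): rotate the rest
    vertices about the root joint, then translate."""
    Rg = axis_angle_to_matrix(root_pose)
    return (verts_rest - root_joint) @ Rg.T + root_joint + trans

def world_to_camera_params(root_pose, trans, root_joint, R, t):
    """Convert global orientation and translation from world to camera space.

    root_joint is the root joint position with zero translation (meters);
    R (3, 3) and t (3,) are world-to-camera extrinsics, with t in meters.
    """
    # Rotations compose as matrices; an axis-angle vector is not a point,
    # so no translation is added to it.
    root_pose_cam = matrix_to_axis_angle(R @ axis_angle_to_matrix(root_pose))
    # The root joint offset must be rotated together with the translation.
    trans_cam = R @ (root_joint + trans) + t - root_joint
    return root_pose_cam, trans_cam

# Consistency check on arbitrary made-up values (not dataset values).
rng = np.random.default_rng(0)
verts_rest = rng.normal(scale=0.05, size=(20, 3))
root_joint = np.array([0.09, 0.006, 0.006])
root_pose = np.array([0.3, -0.5, 0.8])
trans = np.array([0.1, -0.2, 0.9])
R = axis_angle_to_matrix(np.array([0.1, 1.2, -0.4]))
t = np.array([0.02, -0.05, 1.1])

# Reference: build the mesh in world space, then apply the extrinsics to it.
expected = toy_model(verts_rest, root_joint, root_pose, trans) @ R.T + t
# Alternative: convert the parameters first, then build the mesh.
root_pose_cam, trans_cam = world_to_camera_params(root_pose, trans, root_joint, R, t)
actual = toy_model(verts_rest, root_joint, root_pose_cam, trans_cam)
max_err = np.abs(expected - actual).max()
print("max abs difference (m):", max_err)
```

With a real MANO layer, root_joint would presumably come from a forward pass with transl set to zero, as described above. A simpler alternative, if only the camera-space mesh is needed, is to leave the parameters in world space and apply R and t to the output vertices.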