facebookresearch / InterHand2.6M

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020

how to convert 2.5D coordinate to real-world 3D coordinate? #115

Closed Joyako closed 2 years ago

Joyako commented 2 years ago

Hi, thanks for your excellent project! Q1: In which coordinate system does the model output its predictions? Q2: Section 4.4 of your paper describes how the 3D coordinates of the hand are calculated, as follows:

(screenshot of the relevant equations from Section 4.4 of the paper)

but I cannot understand what "camera back-projection" and "inverse affine transformation" mean, or how they are computed in your code. Looking forward to your reply, thanks.

mks0601 commented 2 years ago

Q1. x, y: pixel coordinates in the heatmap space (0~63, 0~63); z: depth, normalized to 0~64.

Q2. Camera back-projection: (x_img, y_img, z_real) -> (x_real, y_real, z_real). Inverse affine transformation: cropped and resized hand image space -> original image space before cropping and resizing.
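For reference, a minimal sketch of the inverse affine step and the depth de-normalization (the sizes, the depth range, and all names here are only illustrative assumptions, not the repo's exact code; check the config and dataset code for the real values):

```python
import numpy as np

def heatmap_to_original_img(joint_hm, inv_affine, hm_size=64, input_size=256,
                            depth_half_range_mm=200.0):
    """Illustrative sketch: map (x, y, z) predictions in heatmap space back to
    the original image and to root-relative metric depth.
    hm_size, input_size, depth_half_range_mm and inv_affine are assumptions."""
    joint = joint_hm.astype(np.float32).copy()            # (J, 3): x, y, z in heatmap space
    # heatmap space -> cropped/resized input image space
    joint[:, 0] = joint[:, 0] / hm_size * input_size
    joint[:, 1] = joint[:, 1] / hm_size * input_size
    # inverse affine transformation: cropped image space -> original image space
    xy1 = np.concatenate([joint[:, :2], np.ones((len(joint), 1), np.float32)], axis=1)
    joint[:, :2] = xy1 @ inv_affine.T                     # inv_affine: (2, 3) matrix
    # normalized depth (0~hm_size) -> root-relative depth in millimeters
    joint[:, 2] = (joint[:, 2] / hm_size * 2.0 - 1.0) * depth_half_range_mm
    return joint
```

After adding the absolute root depth to the z column, the result is (x_img, y_img, z_real), which can then be back-projected to camera coordinates with the intrinsics.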

Joyako commented 2 years ago

@mks0601 Thanks a lot. For the camera back-projection, (x_img, y_img, z_real) -> (x_real, y_real, z_real): assuming the camera intrinsic parameters are known, it can be solved by the following formulas, right? (See the sketch below.)

z_real = z_real
x_real = (x_img - cx) * z_real / fx
y_real = (y_img - cy) * z_real / fy

camera intrinsic parameters: K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
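For example, a minimal sketch of that back-projection under the intrinsics above (the function name and array layout are just illustrative, not the repo's API):

```python
import numpy as np

def back_project(pixel_coord, fx, fy, cx, cy):
    """Camera back-projection: (x_img, y_img, z_real) -> (x_real, y_real, z_real).
    pixel_coord is a (J, 3) array with x, y in original-image pixels and z as
    absolute depth in camera space (e.g. millimeters)."""
    x = (pixel_coord[:, 0] - cx) / fx * pixel_coord[:, 2]
    y = (pixel_coord[:, 1] - cy) / fy * pixel_coord[:, 2]
    z = pixel_coord[:, 2]
    return np.stack([x, y, z], axis=1)
```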

mks0601 commented 2 years ago

Yes. That function is implemented in utils.transformations.pixel2cam.

Joyako commented 2 years ago

Thanks, I will close it.