How to calculate the global root (or camera) 3D translation?

mks0601 / Hand4Whole_RELEASE

Official PyTorch implementation of "Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation", CVPRW 2022 (Oral.)

MIT License


How to calculate the global root (or camera) 3D translation? #50

Open RockyXu66 opened 2 years ago

RockyXu66 commented 2 years ago

Hi. I have two questions.

The only output in demo.py is cam_trans. I'm a little confused about the focal length (in what unit?) and camera_3d_size (why is it used?) in the config.py file. How can I get the body's global root (or, equivalently, camera) 3D translation in the real world, in meters?
I found that cam_param is the output of body_rotation_net and that cam_trans is calculated by the get_camera_trans function. Could you explain what they actually mean?

mks0601 commented 2 years ago

Please look at Fig. 6 and Section 1 of the supplementary material. The 3D global translation d can be calculated from the focal lengths along the x- and y-axes (unit: pixels), A_real (the xy area of the real human, in meter x meter), and A_image (the xy area of the human in the image, in pixel x pixel). The (x, y) components of cam_param are used directly as the (x, y) of cam_trans, which means they are the (x, y) of the 3D global translation. Directly calculating the z of cam_trans could be ambiguous. Hence, we define an initial z of the 3D global translation and refine it following Fig. 5 of the paper. camera_3d_size is used to calculate A_real.
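
For readers following along, the relation described above can be sketched in a few lines. This is a minimal illustration, not the repository's code. It assumes a pinhole camera, where an area A_real at distance d projects to A_image = f_x * f_y * A_real / d^2, so d = sqrt(f_x * f_y * A_real / A_image). The helper names `initial_depth` and `camera_translation` are hypothetical, the numeric values are illustrative rather than read from config.py, and treating the refinement as a sigmoid-bounded scale factor on the initial z is an assumption made only for this example.

```python
import math


def initial_depth(focal_x, focal_y, area_real, area_image):
    """Distance in meters at which area_real (m^2) projects to area_image (px^2).

    Pinhole relation: area_image = focal_x * focal_y * area_real / d**2.
    Focal lengths are in pixels.
    """
    return math.sqrt(focal_x * focal_y * area_real / area_image)


def camera_translation(cam_param, k):
    """Assemble (x, y, z) from a 3-element cam_param and an initial depth k.

    x and y are passed through unchanged, as described in the thread.
    ASSUMPTION: the third element is squashed by a sigmoid and used as a
    multiplicative correction of k; the real refinement may differ.
    """
    t_x, t_y, raw = cam_param
    gamma = 1.0 / (1.0 + math.exp(-raw))  # correction factor in (0, 1)
    return (t_x, t_y, k * gamma)


# Illustrative values only (not taken from config.py):
focal = (5000.0, 5000.0)        # pixels
camera_3d_size = 2.5            # meters; A_real assumed to be its square
input_shape = (256, 192)        # pixels; A_image assumed to be the crop area

k = initial_depth(focal[0], focal[1],
                  camera_3d_size ** 2,
                  input_shape[0] * input_shape[1])
cam_trans = camera_translation((0.05, -0.10, 0.0), k)
print(k, cam_trans)
```

Because the focal lengths are in pixels and A_real is in square meters, the resulting depth is in meters, which is what makes cam_trans interpretable as a metric translation under these assumptions.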

linjing7 commented 1 year ago

Hi @mks0601, which paper do you mean? I can't find the related information in the supplementary material of Hand4Whole.

mks0601 commented 1 year ago

Sorry, I forgot to add the link: https://arxiv.org/abs/1907.11346

linjing7 commented 1 year ago

Okay, thank you very much.