mks0601 / Hand4Whole_RELEASE

Official PyTorch implementation of "Accurate 3D Hand Pose Estimation for Whole-Body 3D Human Mesh Estimation", CVPRW 2022 (Oral.)
MIT License

About process_db_coord #75

Closed: 124518 closed this issue 1 year ago

124518 commented 1 year ago

Hello, your work is really good, and I have a question about data pre-processing. https://github.com/mks0601/Hand4Whole_RELEASE/blob/afcbdf448e5ceeaad6da33776b68c496b4a3edde/data/Human36M/Human36M.py#L149 Before this line, you have already read joint_img and joint_cam from the JSON, and this line processes joint_img and joint_cam again. Why is this done, and would it corrupt the original data?

mks0601 commented 1 year ago

You can read this function to see why I do that: https://github.com/mks0601/Hand4Whole_RELEASE/blob/afcbdf448e5ceeaad6da33776b68c496b4a3edde/common/utils/preprocessing.py#L163

124518 commented 1 year ago

Thank you very much for your answer! I have another question: in your work, you get tz through the function below, but other methods get tz through 2f/(res * s), which is the weak-perspective formulation. I would like to know the difference between them. Thank you very much if you can answer. https://github.com/mks0601/Hand4Whole_RELEASE/blob/afcbdf448e5ceeaad6da33776b68c496b4a3edde/main/model.py#L36

mks0601 commented 1 year ago

What do res and s mean in 2f/(res * s)?

124518 commented 1 year ago

f = 5000 denotes the predefined focal length. s is the scale parameter; in your code I think it is cam_param[:,2], but I am not sure. res denotes the resolution of the resized crop image.

mks0601 commented 1 year ago

I do not think there is a significant difference.
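To see why the two formulations end up close, here is a minimal sketch and not the repository's code. It assumes that t_z is the product of k_value and a positive correction factor gamma, and that k_value has the form sqrt(fx * fy * A_real / A_img) from the supplementary material cited later in this thread. The crop resolution (256), the scale s (0.8), and all function names are hypothetical values chosen for illustration.

```python
import math


def tz_weak_perspective(f, res, s):
    """Depth from the weak-perspective formula quoted above: 2f / (res * s)."""
    return 2.0 * f / (res * s)


def k_value(fx, fy, real_size, img_h, img_w):
    """Assumed form of k: sqrt(fx * fy * A_real / A_img), with A_real = real_size^2."""
    return math.sqrt(fx * fy * real_size * real_size / (img_h * img_w))


def tz_from_gamma(k, gamma):
    """Assumed form of the depth used in the thread: t_z = k * gamma."""
    return k * gamma


f = 5000.0   # predefined focal length in pixels, as stated in the thread
res = 256    # hypothetical square crop resolution
s = 0.8      # hypothetical weak-perspective scale

# With a 2 m x 2 m real area and a square crop, k reduces to 2f / res.
k = k_value(f, f, 2.0, res, res)

tz_a = tz_weak_perspective(f, res, s)
tz_b = tz_from_gamma(k, 1.0 / s)  # gamma plays the role of 1 / s

print(k, tz_a, tz_b)
```

Under these assumptions the two expressions give the same depth when gamma equals 1/s, so the difference is mainly in how the network output is parameterized.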

124518 commented 1 year ago

Thank you for your reply, but I can't understand k_value. Why is k_value calculated with this formula? https://github.com/mks0601/Hand4Whole_RELEASE/blob/afcbdf448e5ceeaad6da33776b68c496b4a3edde/main/model.py#L40

124518 commented 1 year ago

I also note that you use output_hm_shape when normalizing. What role does output_hm_shape play here? Looking forward to your answer.

mks0601 commented 1 year ago

Please see Sec. 1 of the supplementary material of https://arxiv.org/pdf/1907.11346.pdf for the k value. As you said, output_hm_shape is just for normalizing coordinates.
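For readers without the supplement at hand, the idea behind k can be sketched from the pinhole camera model. This is a summary under stated assumptions, not a quotation of the paper: the subject is assumed to occupy a fixed real-world area A_real at depth d, A_img is its area in the image in pixels, and alpha_x, alpha_y are the focal lengths in pixel units. Each side of the area shrinks by a factor of alpha / d under projection, so:

```latex
A_{img} = \alpha_x \alpha_y \frac{A_{real}}{d^{2}}
\quad\Longrightarrow\quad
d = \sqrt{\alpha_x \alpha_y \frac{A_{real}}{A_{img}}} = k
```

In other words, k is a coarse depth estimate obtained from the ratio of the assumed real area to the image area, which the network's predicted correction factor then adjusts.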

124518 commented 1 year ago

OK. Thank you very much for your reply!

124518 commented 1 year ago

Hello, I read your paper carefully and have two questions I would like to ask you. In the formula below, what are the values of f and px, and do you assume that f is 5000 (pixels)? Looking forward to your reply. [image]

mks0601 commented 1 year ago

Hi, as in-the-wild images do not provide focal lengths, I assume a focal length of 5000, following previous works. Actually, alpha_x is what we need, which is the focal length in pixel units, and that is assumed to be 5000.

124518 commented 1 year ago

Hello, in the paper, the actual area of the human body is preset as 2*2 (m), but the preset value in your code is 2.5. Why is this? Is this related to the detected body frame? Is gamma the correction factor in the paper? Thank you very much for your reply! [image]

mks0601 commented 1 year ago

It doesn't matter. The k_value would be proportionally increased.
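The proportionality can be checked numerically. The sketch below is illustrative and not the repository's code: it assumes k has the form sqrt(fx * fy * size^2 / (H * W)) and that t_z = k * gamma. The crop size (256 x 192), the target depth, and the function name compute_k are hypothetical.

```python
import math


def compute_k(focal, real_size, img_h, img_w):
    """Assumed form of k: sqrt(fx * fy * real_size^2 / (img_h * img_w))."""
    return math.sqrt(focal[0] * focal[1] * real_size ** 2 / (img_h * img_w))


focal = (5000.0, 5000.0)  # focal length in pixels, as stated in the thread
img_h, img_w = 256, 192   # hypothetical input crop size

k_paper = compute_k(focal, 2.0, img_h, img_w)  # 2 m preset from the paper
k_code = compute_k(focal, 2.5, img_h, img_w)   # 2.5 preset from the code

# k scales linearly with the preset size.
ratio = k_code / k_paper

# For the same target depth, the learned gamma shrinks by the same factor.
target_tz = 10.0  # hypothetical depth in meters
gamma_paper = target_tz / k_paper
gamma_code = target_tz / k_code

print(ratio, gamma_paper / gamma_code)
```

Since gamma is learned, a larger preset size only changes the scale of the factor the network has to output for a given depth.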

124518 commented 1 year ago

OK. Thank you very much!