hxwork / H2ONet_Pytorch

[CVPR 2023] H2ONet: Hand-Occlusion-and-Orientation-aware Network for Real-time 3D Hand Mesh Reconstruction, Pytorch implementation.
MIT License

About the 3D coordinates of the key points of the hand #6

Closed. KennyBlocker closed this issue 7 months ago.

KennyBlocker commented 7 months ago

Hello, I am not familiar with hand pose estimation. Is it possible to use your method with a single RGB image and the camera's intrinsic parameters to obtain the 3D coordinates of the hand keypoints in the camera coordinate system? Thank you.

hxwork commented 7 months ago

You will also need the 3D coordinates of the root joint in the camera coordinate system, since our method does not estimate the 3D position of the root joint. You may refer to the I2L-MeshNet GitHub repo to see how a 3D hand-mesh reconstruction method can be tested on in-the-wild images. RootNet may also be useful for you.
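To illustrate the point about the root joint, here is a minimal sketch of how root-relative joints could be moved into the camera coordinate system once the root position is known. It assumes a pinhole camera with intrinsic matrix `K`, a root pixel location and depth supplied by some external source (for example RootNet or a depth sensor), and joint coordinates in the same metric unit as the depth. The function names and the 21-joint layout are hypothetical and are not part of H2ONet.

```python
import numpy as np

def backproject_pixel(u, v, depth, K):
    """Lift pixel (u, v) with known depth to a 3D point in camera coordinates.

    Assumes an undistorted pinhole model with intrinsics K (3x3).
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth], dtype=np.float64)

def root_relative_to_camera(joints_rel, root_cam):
    """Shift root-relative joints (J, 3) by the root position (3,) in camera coordinates."""
    joints_rel = np.asarray(joints_rel, dtype=np.float64)
    root_cam = np.asarray(root_cam, dtype=np.float64).reshape(3)
    return joints_rel + root_cam  # broadcasts the root offset over all joints

# Example with made-up values: root seen at pixel (420, 240), 0.5 m from the camera.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
root_cam = backproject_pixel(420.0, 240.0, 0.5, K)
joints_cam = root_relative_to_camera(np.zeros((21, 3)), root_cam)
```

The units of the root-relative prediction and of the root depth must match (for example, both in metres) before the two are added.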

KennyBlocker commented 7 months ago

Thanks for your reply! It's really helpful.

KennyBlocker commented 7 months ago

One more question. I'm considering transforming the estimated hand pose into the world coordinate system. Suppose I use a depth camera and can therefore measure the actual 3D coordinates of some keypoints in the camera coordinate system, while other keypoints are occluded by objects and cannot be measured. Can I obtain the complete hand pose in the camera coordinate system by simply translating the estimated (root-relative) coordinates by the offset between the depth-camera measurements of the visible keypoints and the corresponding estimated coordinates?

hxwork commented 7 months ago

My answer may not be very helpful, since this question is quite specific to your own research. First, to transform coordinates from the camera coordinate system to the world coordinate system, you need the camera's extrinsic parameters. Second, your idea assumes that the estimated pose of the visible parts is accurate enough to align with the corresponding parts captured by the depth camera. Otherwise, the computed translation will be wrong and will directly lead to failure cases.
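The translation-only alignment discussed here, together with a check of the accuracy premise, could look roughly like the sketch below. It assumes the estimated joints and the depth measurements use the same unit and joint ordering, takes the translation as the mean offset over the visible keypoints (the least-squares solution for a pure translation), and flags the result as unreliable when any visible keypoint still disagrees by more than a threshold. The function name and the 2 cm threshold are hypothetical choices for illustration.

```python
import numpy as np

def align_by_translation(joints_est, joints_depth, visible, max_residual=0.02):
    """Translate estimated joints so the visible ones match depth measurements.

    joints_est:   (J, 3) estimated joints, e.g. root-relative.
    joints_depth: (J, 3) measured joints in camera coordinates; rows for
                  occluded joints are ignored and may contain NaN.
    visible:      (J,) boolean mask of joints measured by the depth camera.
    max_residual: hypothetical tolerance (same unit as the joints).
    """
    joints_est = np.asarray(joints_est, dtype=np.float64)
    joints_depth = np.asarray(joints_depth, dtype=np.float64)
    visible = np.asarray(visible, dtype=bool)
    if not visible.any():
        raise ValueError("at least one visible keypoint is required")

    # Per-joint offsets on the visible subset; their mean is the
    # least-squares translation.
    diff = joints_depth[visible] - joints_est[visible]
    t = diff.mean(axis=0)

    # What a pure translation cannot explain: large values mean the
    # estimated pose of the visible part does not match the measurements.
    residuals = np.linalg.norm(diff - t, axis=1)
    reliable = bool(residuals.max() <= max_residual)

    return joints_est + t, residuals, reliable
```

If `reliable` comes back `False`, the visible part of the estimated pose differs from the depth data in shape or orientation, so the translated positions of the occluded joints should not be trusted either.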

KennyBlocker commented 7 months ago

The intrinsic and extrinsic parameters of the camera are known, and the precise coordinates of the visible parts of the hand in the camera coordinate system can be obtained. I think this method works. Many thanks!
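For the final step, from camera coordinates to world coordinates, a minimal sketch follows. It assumes the extrinsics are given in the common world-to-camera convention `X_cam = R @ X_world + t`; if a calibration stores the inverse (camera-to-world) transform, the formula changes accordingly. The function name is hypothetical.

```python
import numpy as np

def camera_to_world(points_cam, R, t):
    """Map (N, 3) camera-space points to world space.

    Assumes world-to-camera extrinsics: X_cam = R @ X_world + t,
    so X_world = R.T @ (X_cam - t).
    """
    points_cam = np.asarray(points_cam, dtype=np.float64)
    R = np.asarray(R, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64).reshape(3)
    # Row-vector form of R.T @ (x - t): (x - t) @ R
    return (points_cam - t) @ R
```

A quick sanity check is to project a few known world points into the camera with `R` and `t`, map them back with `camera_to_world`, and confirm they match the originals.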