oakink / OakInk

[CVPR 2022] OakInk: A Large-scale Knowledge Repository for Understanding Hand-Object Interaction
https://oakink.net
MIT License

hand annotation problems #4

Closed YorkWang-Go closed 1 year ago

YorkWang-Go commented 1 year ago

Dear authors,

Thanks for your awesome work and the dataset!

After studying your work, I have some questions about the hand annotation files and hope you can help me. Thanks a lot!

The file hand_param.pkl has fields hand_pose, hand_shape, hand_tsl and obj_transf. Can you please explain what hand_tsl and obj_transf stand for? What's more, if I want to use annotations from other datasets, how can I get the hand_tsl and obj_transf parameters, given joints coordinates, camera parameters, hand poses and hand shapes?

Looking forward to your reply!

Thank you!

lixiny commented 1 year ago

Hi, thank you for acknowledging our work!

The field hand_tsl is a 3x1 vector that translates the root-relative hand joints into the camera coordinate system. The hand vertices and joints produced by ManoLayer are usually expressed relative to the hand root joint; see ManoLayer here. Hence, if you want the hand vertices in camera space, you need to add hand_tsl.
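In case it helps, here is a minimal sketch of that step. The joint coordinates and translation below are made-up placeholder values; in practice the joints would come from a MANO layer and hand_tsl from hand_param.pkl:

```python
import numpy as np

# Hypothetical root-relative hand joints (J, 3); the root sits at the origin.
# In practice these would be the output of a MANO layer.
joints_rel = np.array([[0.00, 0.00, 0.00],
                       [0.02, 0.01, 0.00],
                       [0.04, 0.02, 0.01]])

# Hypothetical hand_tsl (3,) as stored in hand_param.pkl.
hand_tsl = np.array([0.05, -0.10, 0.60])

# Camera-space joints: add the translation to every joint.
joints_cam = joints_rel + hand_tsl

# The root joint now sits at hand_tsl in camera space.
```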

The field obj_transf is a 4x4 SE(3) transformation matrix (both rotation and translation) of the object. Similarly, the object vertices loaded from disk are expressed in the object's canonical space; we use obj_transf to transform them into camera space.
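A small sketch of applying that transform, with made-up vertices and an illustrative 90-degree rotation about z (not values from the dataset):

```python
import numpy as np

# Hypothetical canonical-space object vertices (V, 3).
verts_can = np.array([[0.0, 0.0, 0.0],
                      [0.1, 0.0, 0.0],
                      [0.0, 0.1, 0.0]])

# Hypothetical obj_transf: 90-degree rotation about z, plus a translation.
obj_transf = np.array([[0.0, -1.0, 0.0, 0.05],
                       [1.0,  0.0, 0.0, 0.00],
                       [0.0,  0.0, 1.0, 0.50],
                       [0.0,  0.0, 0.0, 1.00]])

# Split the SE(3) matrix into rotation and translation, then apply it.
R, t = obj_transf[:3, :3], obj_transf[:3, 3]
verts_cam = verts_can @ R.T + t  # camera-space vertices
```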

If you want to use other datasets for hand-object interaction, you need to parse their annotations into a consistent 3D representation.

Here, I can share my data loaders for F-PHAB, HO3D and DexYCB datasets.

Hope these help!

Lixin

YorkWang-Go commented 1 year ago

Hi, sincere thanks for your reply; these do help a lot!

YorkWang-Go commented 1 year ago

Hi, I have a follow-up question about using other datasets in Tink.

If I want to use another dataset, like DexYCB, to visualize hand-object interactions with Tink, is it workable to load the dataset using only the data loaders above? I noticed that Tink uses a pkl file called "raw_grasp", which contains four parameters: hand_pose (1x64), hand_shape (1x10), hand_tsl (1x3), and obj_transf (4x4). I also noticed that your DexYCB data loader has a function for getting obj_transf. So I wonder if I can obtain the corresponding four parameters using that data loader. I am confused about the coordinate conversion between Tink and other datasets like DexYCB: I never got correct visualization results using the parameters from the DexYCB dataset, and I suspect the coordinate systems are the cause.

Thanks for your help!

YorkWang-Go commented 1 year ago

I also found that in 'cal_contact_info.py' of Tink there is a function named "get_hand_parameter", which processes the hand data in the dataset. Do the MANO hand parameters from other datasets also need to be processed by this function beforehand? I couldn't get reasonable results when visualizing DexYCB data with Tink.

Sorry to have so many questions and thanks for your kindness!

lixiny commented 1 year ago

The coordinate systems used by the DexYCB dataset and by Tink are different.

For the image-based DexYCB and OakInk-Image datasets, we aim to acquire the hand pose (16 x 3), the hand translation (3 x 1, obtained via get_joints_3d[center_idx, :]), and the object transform (4 x 4) in the camera's coordinate system, in which we can project the hand and object vertices back to the image plane using the intrinsics K.
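As an illustration of that last step, here is a standard pinhole projection of a camera-space point using K. The intrinsics and the point are made-up values, not taken from either dataset:

```python
import numpy as np

# Hypothetical pinhole intrinsics K (fx = fy = 600, principal point 320, 240).
K = np.array([[600.0,   0.0, 320.0],
              [  0.0, 600.0, 240.0],
              [  0.0,   0.0,   1.0]])

# A camera-space 3D point, e.g. a hand joint after adding hand_tsl.
p_cam = np.array([0.05, -0.10, 0.60])

# Perspective projection: apply K, then divide by depth.
uv_hom = K @ p_cam
uv = uv_hom[:2] / uv_hom[2]  # pixel coordinates on the image plane
```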

For Tink, we aim to represent the hand mesh in the object's canonical coordinate system. However, the raw data (raw_grasp) used by Tink is still represented in the camera system. Hence, in the function get_hand_param, we transform the hand pose and hand translation from camera space into the object's canonical system.
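The math behind that conversion can be sketched as inverting the object's SE(3) transform. This is not the repository's actual get_hand_param implementation, just the underlying geometry with made-up values; note that in the full conversion the MANO global (wrist) rotation must also be rotated accordingly:

```python
import numpy as np

# Hypothetical obj_transf (object canonical -> camera): a 90-degree rotation
# about z plus a translation, purely for illustration.
obj_transf = np.array([[0.0, -1.0, 0.0, 0.05],
                       [1.0,  0.0, 0.0, 0.00],
                       [0.0,  0.0, 1.0, 0.50],
                       [0.0,  0.0, 0.0, 1.00]])
R, t = obj_transf[:3, :3], obj_transf[:3, 3]

# Hypothetical hand translation in camera space (the hand_tsl field).
hand_tsl_cam = np.array([0.05, -0.10, 0.60])

# Invert the SE(3) transform to express the hand translation in the
# object's canonical frame: x_can = R^T (x_cam - t).
hand_tsl_can = R.T @ (hand_tsl_cam - t)
```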

Hope this helps!

Lixin