facebookresearch / InterHand2.6M

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020

Function : transform_input_to_output_space #70

Closed linkinSimon closed 2 years ago

linkinSimon commented 2 years ago

Hi, @mks0601,

Thanks for sharing this amazing project!! May I ask some questions about the transform_input_to_output_space function?

  1. Why are bbox_3d_size = 400 and bbox_3d_size_root = 400 in (link src)? What is the meaning of these two values?

  2. Why is the ground-truth relative depth preprocessed into this domain, joint_coord[:,2] = (joint_coord[:,2] / (cfg.bbox_3d_size/2) + 1)/2. * cfg.output_hm_shape[0] (link src)? Is this good for convergence?

Is there an explanation of this in the paper or in this repository that I missed?

Many thanks for your time and help!!

mks0601 commented 2 years ago
  1. The size of the 3D bounding box around a hand is 400 mm.
  2. To map the depth into the heatmap space. Thanks!
linkinSimon commented 2 years ago

Oh, I got it.

  1. bbox_3d_size = 400 and bbox_3d_size_root = 400 assume a hand's relative depth lies within a 400 mm range, with the root at the middle (200 mm).

  2. joint_coord[:,2] = (joint_coord[:,2] / (cfg.bbox_3d_size/2) + 1)/2. * cfg.output_hm_shape[0] (link src) maps the relative depth into the 0-64 space, like the 2D heatmaps, with the root joint at the middle (32) (see the sketch below).
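
To make the mapping concrete, here is a small standalone sketch (not code from this repo; the constants follow the config defaults cfg.bbox_3d_size = 400 and cfg.output_hm_shape[0] = 64, and the function names are mine):

```python
import numpy as np

BBOX_3D_SIZE = 400.0   # cfg.bbox_3d_size: assumed depth range of one hand, in mm
HM_DEPTH = 64          # cfg.output_hm_shape[0]: depth resolution of the heatmap

def depth_to_heatmap(rel_depth_mm):
    # Root-relative depth in [-200, 200] mm -> heatmap depth index in [0, 64];
    # -200 mm -> 0, 0 mm (root) -> 32, +200 mm -> 64.
    return (rel_depth_mm / (BBOX_3D_SIZE / 2) + 1) / 2.0 * HM_DEPTH

def heatmap_to_depth(hm_z):
    # Inverse mapping: heatmap depth index back to mm relative to the root.
    return (hm_z / HM_DEPTH * 2 - 1) * (BBOX_3D_SIZE / 2)

rel_z = np.array([-200.0, 0.0, 200.0])
print(depth_to_heatmap(rel_z))                     # [ 0. 32. 64.]
print(heatmap_to_depth(depth_to_heatmap(rel_z)))   # [-200.   0.  200.]
```

With this normalization the depth axis is treated the same way as the spatial axes of the 2D heatmaps.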

Huge thanks for the explanation!!

ZYX-MLer commented 1 year ago
  • The size of the 3D bounding box around a hand is 400 mm.
  • To map the depth into the heatmap space. Thanks!

How did you decide the value of bbox_3d_size? Is it the largest value in the dataset? If I train the model on my own dataset, how should I set bbox_3d_size?

Thanks

Best wishes

mks0601 commented 1 year ago

Just based on prior knowledge about human scale. The value is not tied to any specific dataset.
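
If you want to sanity-check the default on a custom dataset, something like the following would tell you whether the 400 mm box is large enough (a hypothetical helper, not part of this repo; it assumes you have camera-space joint coordinates in mm):

```python
import numpy as np

def check_bbox_3d_size(joint_xyz_mm, root_idx, bbox_3d_size=400.0):
    # joint_xyz_mm: (N, J, 3) camera-space joint coordinates in mm (hypothetical input)
    # root_idx: index of the root joint (e.g. the wrist)
    rel_depth = joint_xyz_mm[:, :, 2] - joint_xyz_mm[:, root_idx:root_idx + 1, 2]
    max_abs = np.abs(rel_depth).max()
    print(f"max |root-relative depth| = {max_abs:.1f} mm, "
          f"bbox_3d_size/2 = {bbox_3d_size / 2:.0f} mm")
    # Depths beyond bbox_3d_size/2 would fall outside the heatmap after normalization.
    return max_abs < bbox_3d_size / 2
```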