akanazawa / hmr

Project page for End-to-end Recovery of Human Shape and Pose

3d joints label in the tfrecord #116

Closed: xjturobocon closed this issue 4 years ago

xjturobocon commented 4 years ago

Hey, I am confused about features['mosh/gt3d'] in the H36M tfrecord. I noticed that this gt3d is not equal to the 3D joint coordinates generated by the SMPL model from the ground-truth pose and shape parameters. Why don't we just use the latter as the ground-truth 3D joint positions? During training, the SMPL loss and the 3D joint loss are computed together; however, even with the ground-truth SMPL parameters, we can't reproduce the ground-truth 3D joint positions. So I think this inconsistency will degrade the performance of the model.

By the way, can you explain how you obtained features['mosh/gt3d']? I compared it with its counterpart in the camera coordinate system of the original dataset and found that they differ by a translation vector. What is the meaning of that translation? Looking forward to your reply.
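For anyone wanting to repeat this comparison, here is a minimal sketch of how one might check whether two joint sets differ only by a constant offset. The function name `constant_offset` and the toy coordinates are hypothetical; real values would have to be read from the tfrecord and from the original Human3.6M annotations, which is not shown here.

```python
def constant_offset(joints_a, joints_b, tol=1e-6):
    """Return the shared offset (b - a) if every joint pair differs by the
    same 3D vector within `tol`; return None otherwise.

    joints_a, joints_b: sequences of (x, y, z) tuples in matching joint order.
    """
    if not joints_a or len(joints_a) != len(joints_b):
        raise ValueError("joint lists must be non-empty and of equal length")

    # Per-joint difference vectors.
    offsets = [
        tuple(b - a for a, b in zip(pa, pb))
        for pa, pb in zip(joints_a, joints_b)
    ]

    # All differences must agree with the first one.
    first = offsets[0]
    for off in offsets[1:]:
        if any(abs(o - f) > tol for o, f in zip(off, first)):
            return None
    return first


# Toy data (made up for illustration, not taken from the dataset).
camera_joints = [(0.0, 0.0, 0.0), (0.1, 0.5, 0.0), (-0.1, 0.5, 0.2)]
shift = (0.25, -0.5, 3.0)
record_joints = [
    tuple(c + s for c, s in zip(joint, shift)) for joint in camera_joints
]

offset = constant_offset(camera_joints, record_joints)
print("shared offset:", offset)
```

If the function returns a vector rather than `None`, the two sets describe the same skeleton up to a shift of origin, so any root-relative quantity computed from them would be identical.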

Thanks.

akanazawa commented 4 years ago

Hi,

Good question. My memory is fading, but I think this is because mosh/gt3d is not the 3D joints of the SMPL skeleton, but the 3D joints as defined in Human3.6M. A more appropriate name would have been h36m/gt3d, but I abused the term mosh to mean any ground truth with 3D. The only reason we don't use the latter is to align ourselves with the Human3.6M evaluation protocol. As you may have noticed, the 3D joints in the SMPL skeleton are quite different from those of Human3.6M, so it would not be an apples-to-apples comparison, and since we evaluate on this dataset we just used its ground-truth 3D joints as the target.

mosh/gt3d is obtained from the original Human3.6M dataset's annotations.
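To illustrate why the joint definition matters for evaluation, here is a rough sketch of a root-aligned mean per-joint position error, the kind of metric commonly reported on Human3.6M. It is not the code used in this repository; the function name `mpjpe` and the choice of index 0 as the root joint are assumptions made for the example.

```python
import math


def mpjpe(pred, target, root=0):
    """Mean per-joint position error after subtracting the root joint.

    pred, target: sequences of (x, y, z) tuples in the SAME joint order.
    root: index of the joint used as the origin (assumed to be 0 here).
    """
    if not pred or len(pred) != len(target):
        raise ValueError("joint lists must be non-empty and of equal length")

    pred_root, target_root = pred[root], target[root]
    total = 0.0
    for p, t in zip(pred, target):
        # Compare joint i of the prediction with joint i of the target,
        # each expressed relative to its own root.
        squared = sum(
            ((pi - pr) - (ti - tr)) ** 2
            for pi, pr, ti, tr in zip(p, pred_root, t, target_root)
        )
        total += math.sqrt(squared)
    return total / len(pred)


# Toy two-joint skeleton (made-up numbers).
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0)]
error = mpjpe(pred, target)
print("MPJPE:", error)
```

Because the error is computed joint by joint, index i must refer to the same anatomical location in both sets. Also note that a constant translation between the two sets cancels out once the root is subtracted.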

Hope this helps and sorry for the late response!

A