akanazawa / hmr

Project page for End-to-end Recovery of Human Shape and Pose

3D Pose Performance on Human3.6M #44

Closed Yuliang-Zou closed 6 years ago

Yuliang-Zou commented 6 years ago

Hi @akanazawa , I wonder how you got the numbers in table 1 and table 2.

I used the ground truth bounding box provided by Human3.6M to crop out the person, then used the util code to scale and pad it before passing it to the network with the provided pre-trained model. But the mean reconstruction error is around 77 mm (Protocol 2).

I wonder what could be wrong. I only use the video sequence from camera 60457274, and I evaluate one frame out of every five.
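For context, the Protocol 2 "reconstruction error" is the mean per-joint position error after rigidly aligning the prediction to the ground truth via Procrustes analysis (PA-MPJPE). A minimal sketch of that metric, written here for illustration and not taken from the repo's evaluation code:

```python
import numpy as np

def pa_mpjpe(pred, gt):
    """Mean per-joint error after Procrustes alignment.
    pred, gt: (J, 3) joint positions in the same joint order."""
    mu_p, mu_g = pred.mean(0), gt.mean(0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal similarity transform (Umeyama): maximize trace(R @ p.T @ g).
    U, S, Vt = np.linalg.svd(p.T @ g)
    D = np.eye(3)
    D[2, 2] = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ D @ U.T
    s = (S * np.diag(D)).sum() / (p ** 2).sum()
    aligned = s * p @ R.T + mu_g
    return np.linalg.norm(aligned - gt, axis=1).mean()
```

With this alignment, any global rotation, scale, and translation of the prediction is factored out before the error is measured, so only pose/shape differences remain.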

Thanks.

akanazawa commented 6 years ago

Hi,

The entire evaluation code is available in the repo and reproduces the Table 2 numbers with the public model (there is a slight difference because the public model was re-trained from this repo: MPJPE-PA is slightly higher, but MPJPE is lower). See the instructions [here](https://github.com/akanazawa/hmr/blob/master/doc/train.md#evaluation).

I suspect the issue is due to differences in preprocessing. AFAIK H3.6M doesn't give you a bounding box; I had to estimate it from the 2D keypoints. Perhaps the source you got the bounding box from treats things differently (e.g. too much margin).
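Estimating the box from 2D keypoints can be sketched as below. This is a hypothetical illustration of the idea (tight box over visible joints, plus the center/scale the demo pipeline expects); the function names and the optional `margin` parameter are my own, not the repo's API:

```python
import numpy as np

def bbox_from_kps(kps, vis, margin=0.0):
    """Tight bounding box from visible 2D keypoints.
    kps: (K, 2) array of (x, y); vis: (K,) boolean visibility mask.
    margin is a fraction of the box size added on each side."""
    pts = kps[vis]
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    pad = margin * (hi - lo)
    return lo - pad, hi + pad

def center_scale(lo, hi, target=150.0):
    """Box center and the factor that rescales the person so its
    longer side becomes `target` px (150 px, as in the HMR demo)."""
    center = 0.5 * (lo + hi)
    scale = target / np.max(hi - lo)
    return center, scale
```

A box with extra margin shrinks the person relative to the 224x224 input, which is the kind of preprocessing mismatch that can inflate the error.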

Best,

Angjoo

Yuliang-Zou commented 5 years ago

I actually used the provided ground truth mask to generate a tight bounding box, and then scaled the image so that the long edge of the box is 150px (basically the demo code, but with the json input replaced by the corresponding center and scale parameters).
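The resize-then-crop step described above can be sketched as follows. This is a self-contained, dependency-free approximation (nearest-neighbor resize, zero padding) of what a scale-and-crop utility does, not the repo's actual implementation:

```python
import numpy as np

def scale_and_crop(img, scale, center, crop_size=224):
    """Resize `img` by `scale` (nearest-neighbor, to avoid extra deps),
    then take a crop_size x crop_size window around the scaled `center`
    (x, y), zero-padding where the window leaves the image."""
    h, w = img.shape[:2]
    nh, nw = int(round(h * scale)), int(round(w * scale))
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = img[ys][:, xs]
    cy, cx = int(round(center[1] * scale)), int(round(center[0] * scale))
    half = crop_size // 2
    out = np.zeros((crop_size, crop_size) + img.shape[2:], img.dtype)
    y0, y1 = cy - half, cy + half
    x0, x1 = cx - half, cx + half
    sy0, sy1 = max(y0, 0), min(y1, nh)
    sx0, sx1 = max(x0, 0), min(x1, nw)
    out[sy0 - y0:sy1 - y0, sx0 - x0:sx1 - x0] = resized[sy0:sy1, sx0:sx1]
    return out
```

Small differences here (interpolation, rounding of the center, padding value) are exactly the kind of preprocessing detail that can shift the final numbers.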

I think this box should be tight (I visualized it to validate this point), but the PA-MPJPE (~77 mm) is still far from the reported number.