facebookresearch / InterHand2.6M

Official PyTorch implementation of "InterHand2.6M: A Dataset and Baseline for 3D Interacting Hand Pose Estimation from a Single RGB Image", ECCV 2020

About Train/Test settings #62

Closed zqq-judy closed 3 years ago

zqq-judy commented 3 years ago

Hello! For the results in Table 3 of the paper (training set: SH+IH (ours), SH MPJPE: 12.16, IH MPJPE: 16.02), did you use all the images (1361K) in the training set, including the gray images? And did you test on all 849K images? I ask because I see gray images in the dataset, e.g., camera ids starting with cam41. Thanks for your reply.

mks0601 commented 3 years ago

Yes. All images, including the gray ones, are used for both training and testing.

zqq-judy commented 3 years ago

On the v1.0 test set (all images, 5 fps), I used your pre-trained model snapshot_20.pth.tar (5 fps, v0.0), but the results are not good:

MPJPE for all hand sequences: 78.63
MPJPE for single hand sequences: 88.65
MPJPE for interacting hand sequences: 69.90

Why is this? Is it because of the dataset version?

mks0601 commented 3 years ago

I think so. I'll release a model trained on v1.0 soon.

mks0601 commented 3 years ago

Meanwhile, you can test on human_annot. The performance on human_annot is very similar to that reported in the paper.

zqq-judy commented 3 years ago

The results on human_annot (v1.0) are:

MPJPE for all hand sequences: 74.74
MPJPE for single hand sequences: 77.17
MPJPE for interacting hand sequences: 73.92

mks0601 commented 3 years ago

I think something went wrong on your side. My human_annot performance (v1.0) is very similar to that of the paper; I just ran the test about 5 minutes ago.

zqq-judy commented 3 years ago

I re-downloaded the code from GitHub. The test set is human_annot (v1.0, downloaded in April), the pre-trained model is the one trained on InterHand2.6M 5 fps (v0.0), and the RootNet results are the "Output on InterHand2.6M" download. I only changed the path the dataset is loaded from; the rest of the code is unchanged. But I still get the result in my comment above (MPJPE for all hand sequences: 74.74).

I am a little confused.

mks0601 commented 3 years ago

Could you change trans_test in main/config.py to gt and test again? This user also got good results on human_annot: https://github.com/facebookresearch/InterHand2.6M/issues/59#issuecomment-849185485
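
For reference, a minimal sketch of the suggested change (assuming the released config sets trans_test = 'rootnet' by default):

```python
# main/config.py (excerpt, sketch)
# 'rootnet' places each predicted hand in 3D using RootNet's estimated root
# depth; 'gt' uses the ground-truth root instead, which removes any RootNet
# error (e.g., a mis-scaled bbox) from the reported MPJPE.
trans_test = 'gt'  # assumed default: 'rootnet'
```

If MPJPE recovers with trans_test = 'gt', the problem is in the RootNet inputs rather than in the pose network itself.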

mks0601 commented 3 years ago

I found the reason: the bboxes in the RootNet results are not downsized :( Could you change this line to bbox = np.array(rootnet_result[str(aid)]['bbox'], dtype=np.float32) / 2.? I'll also update the RootNet results soon.
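
In context, the fixed line would look roughly like this (a sketch; the surrounding loader code and the structure of rootnet_result are assumed from the comment above):

```python
import numpy as np

# rootnet_result maps annotation id -> {'bbox': [x, y, w, h], ...}, loaded
# from the released RootNet output JSON. Those bboxes were saved at the
# original image resolution, while the released images are downsized by 2x,
# so the bbox coordinates must be halved to match.
bbox = np.array(rootnet_result[str(aid)]['bbox'], dtype=np.float32) / 2.
```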

mks0601 commented 3 years ago

[screenshot: MPJPE results after downsizing the bbox]

This is what I got after downsizing the bbox following the comment above.

zqq-judy commented 3 years ago

Thank you for your patience in helping me solve the problem. I now get results similar to those in the paper. Thank you very much.

mks0601 commented 3 years ago

You can download the RootNet results from here and use the bboxes from them without downsizing, i.e., you do not have to change that line if you download the RootNet results again. Thanks!

mks0601 commented 3 years ago

The model trained on v1.0 is available here.

zqq-judy commented 3 years ago

Ok. Thanks.