QitaoZhao / ContextAware-PoseFormer

This repository is the official implementation of our paper "A Single 2D Pose With Context is Worth Hundreds for 3D Human Pose Estimation".

Questions about the HRNet-detected 2D keypoints in the provided files "h36m_train.pkl" and "h36m_validation.pkl" #4

Closed wangzhidi closed 9 months ago

wangzhidi commented 10 months ago

Thanks for your contributions to 3D HPE! Could you please tell me how you obtained "joints_2d_hrnet" in "h36m_train.pkl" and "h36m_validation.pkl"? Is there any code I can refer to?

QitaoZhao commented 10 months ago

Please refer to https://github.com/QitaoZhao/ContextAware-PoseFormer?tab=readme-ov-file#dataset-preparation for the code to prepare the data. Also check here (link), where I attached my pre-processed data in case you don't want to run the code yourself.

zerowing-ex commented 10 months ago

I have the same issue, and I'm also curious how you obtained the HRNet-detected 2D keypoints stored in 'joints_2d_hrnet' in h36m_train/validation.pkl. There doesn't seem to be relevant code in your repo.

For example, an image from Human3.6M is usually cropped to a suitable size (288×384), so the input to HRNet-W32/48 has shape (1, 3, 384, 288) and the output heatmaps have shape (1, 17, 96, 72). From these heatmaps you can decode the 2D pose, but the decoded keypoints follow the COCO format, which differs from the Human3.6M format. How did you address this? Even after converting to the Human3.6M format, there is still a significant difference from the 2D keypoints you provided.

[attached image]

Note:
- green points: 2D ground truth
- red points: predicted by HRNet
- orange points: the provided HRNet keypoints in h36m_train/validation.pkl
- green rect: the bbox provided in h36m_train/validation.pkl
- red rect: the bbox of the cropped image (288×384)
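
For anyone hitting the same thing, here is a minimal sketch of the decode-and-remap step described above. The argmax decoding and the COCO-to-H36M joint mapping follow one common convention (pelvis, spine, thorax, and head synthesized from midpoints); the exact convention used for the provided .pkl files may differ, and all names here are illustrative, not from the repo:

```python
import numpy as np

def heatmaps_to_coords(heatmaps, stride=4):
    """Decode HRNet heatmaps of shape (1, 17, 96, 72) into (17, 2) pixel
    coordinates in the 288x384 crop via a simple per-joint argmax."""
    hm = heatmaps[0]                        # (17, 96, 72)
    num_joints, h, w = hm.shape
    coords = np.zeros((num_joints, 2))
    for j in range(num_joints):
        idx = np.argmax(hm[j])
        coords[j] = [idx % w, idx // w]     # (x, y) in heatmap space
    return coords * stride                  # back to crop resolution

def coco_to_h36m(coco):
    """Remap 17 COCO keypoints (17, 2) to the 17-joint H36M skeleton.
    H36M joints with no COCO counterpart are synthesized from midpoints;
    conventions vary across repos, so treat this mapping as an assumption."""
    h36m = np.zeros_like(coco)
    h36m[0]  = (coco[11] + coco[12]) / 2    # pelvis  = mid-hip
    h36m[1]  = coco[12]                     # right hip
    h36m[2]  = coco[14]                     # right knee
    h36m[3]  = coco[16]                     # right ankle
    h36m[4]  = coco[11]                     # left hip
    h36m[5]  = coco[13]                     # left knee
    h36m[6]  = coco[15]                     # left ankle
    h36m[8]  = (coco[5] + coco[6]) / 2      # thorax  = mid-shoulder
    h36m[10] = (coco[1] + coco[2]) / 2      # head   ~= mid-eye (rough proxy)
    h36m[9]  = (h36m[8] + h36m[10]) / 2     # neck, interpolated
    h36m[7]  = (h36m[0] + h36m[8]) / 2      # spine  = pelvis-thorax midpoint
    h36m[11], h36m[12], h36m[13] = coco[5], coco[7], coco[9]    # left arm
    h36m[14], h36m[15], h36m[16] = coco[6], coco[8], coco[10]   # right arm
    return h36m
```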

zerowing-ex commented 10 months ago

I have solved this problem. Thanks again for your excellent work!

QitaoZhao commented 9 months ago

Sorry for the late reply. We do not use the 2D keypoints from HRNet to report our results (we only use the HRNet backbone). joints_2d_hrnet was used for our initial experiments. We did not run HRNet ourselves; we only used pre-processed keypoints from other repos. I can't locate the source right now, but I will post it here as soon as I find it.

zerowing-ex commented 9 months ago

Thanks!

QitaoZhao commented 9 months ago

@zerowing-ex @wangzhidi Please look at https://github.com/Nicholasli1995/EvoSkeleton/blob/master/docs/TRAINING.md; to the best of my recollection, our pre-processed HRNet keypoints were adopted from twoDPose_HRN_train.npy and twoDPose_HRN_test.npy.
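
If it helps, those files are pickled NumPy objects and can be inspected with a plain NumPy load. A quick sketch (the local file path is assumed, and the internal key layout is whatever EvoSkeleton stores, so inspect before relying on it):

```python
import numpy as np

# allow_pickle is required because the file wraps a Python object.
data = np.load("twoDPose_HRN_train.npy", allow_pickle=True)
if data.dtype == object and data.shape == ():
    data = data.item()  # unwrap a 0-d object array holding, e.g., a dict
print(type(data))
```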

glee623 commented 6 months ago

Thank you for your awesome work!

I thought the output keypoints from HRNet were used to get the pose feature P, but in the code you use 'joints_2d_cpn' for the coordinate embedding. Also, you said you did not use the 2D keypoints from HRNet.

Would you explain more about this? Thanks.
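
(For reference, the coordinate embedding in PoseFormer-style models is typically just a learned linear projection of the (x, y) joint coordinates into per-joint tokens, regardless of which 2D detector produced them. A minimal sketch with illustrative names and dimensions, not the repo's exact code:)

```python
import torch
import torch.nn as nn

class CoordEmbedding(nn.Module):
    """Illustrative sketch: embed a (B, 17, 2) 2D pose into per-joint tokens,
    as PoseFormer-style models do with detected keypoints such as
    joints_2d_cpn. Dimensions here are assumptions, not the repo's values."""
    def __init__(self, num_joints=17, embed_dim=32):
        super().__init__()
        self.joint_embed = nn.Linear(2, embed_dim)  # (x, y) -> token
        self.pos_embed = nn.Parameter(torch.zeros(1, num_joints, embed_dim))

    def forward(self, joints_2d):                   # (B, 17, 2)
        x = self.joint_embed(joints_2d)             # (B, 17, embed_dim)
        return x + self.pos_embed                   # add learned joint order
```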

zerowing-ex commented 6 months ago

Yes, you're right.