hongsukchoi / Pose2Mesh_RELEASE

Official Pytorch implementation of "Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose", ECCV 2020
MIT License

Queries in regards to shape parameters and results #69

Open jerrinbright opened 1 year ago

jerrinbright commented 1 year ago

Hey! Thank you for your work! I have a few queries about it.

1.) If I understood correctly, your algorithm takes an input image, finds the 2D pose using PoseNet, and then sends the 2D pose to MeshNet, which estimates the 3D mesh of the person. And from your paper, I understood that you don't use any shape parameters; rather, MeshNet automatically learns to fit the shape based on the 2D pose. Am I correct?

2.) I actually have the shape parameters and the pose (17+2 joints) of the person. Do you know of any approach I could use to find the mesh from these 17+2 joints and shape parameters? Most of the algorithms I looked into use more than 20 joint positions, so I am not able to use their models.

3.) When I tried Pose2Mesh, the recovered shape tends to be really bad, likely because my inputs are low resolution and affected by motion blur. I have attached an image from the Pose2Mesh demo below: [screenshot attached: Screenshot from 2023-06-19 13-43-37]

I am guessing the model is not able to fit 2D poses like this. Do you think giving the image along with the 2D pose as input would help solve these problems with MeshNet and help it generalize better?

Let me know when you find the time. Thank you!

hongsukchoi commented 1 year ago

Hi

1) No. The 2D pose is estimated by off-the-shelf estimators like OpenPose and HRNet. Pose2Mesh itself does not use images as input. PoseNet of Pose2Mesh lifts the 2D pose to a 3D pose, and MeshNet reconstructs a 3D mesh from the 3D pose, as shown in the paper. If you use a dataset like SURREAL that has data with sufficient shape variance, Pose2Mesh will recover a 3D shape to some extent, without using explicit shape parameters.
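The two-stage flow described here can be sketched as follows. The `posenet_lift` and `meshnet_reconstruct` stand-ins below are hypothetical placeholders (random linear maps, not the real learned networks); the sketch only shows the data shapes and how the stages chain, assuming a 17-joint input skeleton and the 6890-vertex SMPL mesh:

```python
import numpy as np

J = 17          # number of input 2D joints (assumed skeleton size)
V = 6890        # SMPL mesh vertex count

rng = np.random.default_rng(0)

def posenet_lift(pose2d):
    """Stand-in for PoseNet: lifts a (J, 2) 2D pose to a (J, 3) 3D pose."""
    W = rng.standard_normal((2, 3)) * 0.01
    return pose2d @ W

def meshnet_reconstruct(pose2d, pose3d):
    """Stand-in for MeshNet: consumes the concatenated 2D+3D pose
    and regresses a (V, 3) mesh."""
    feat = np.concatenate([pose2d, pose3d], axis=1).ravel()  # (J*5,)
    W = rng.standard_normal((feat.size, V * 3)) * 0.001
    return (feat @ W).reshape(V, 3)

pose2d = rng.standard_normal((J, 2))        # off-the-shelf 2D detection (e.g. HRNet output)
pose3d = posenet_lift(pose2d)               # stage 1: 2D pose -> 3D pose
mesh = meshnet_reconstruct(pose2d, pose3d)  # stage 2: pose -> mesh
print(pose3d.shape, mesh.shape)             # (17, 3) (6890, 3)
```

The point of the sketch is that the image never enters the pipeline; only the 2D pose does.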

2) Based on the image you showed, I think you are using 19 COCO joints. SMPLify-X can optimize SMPL pose parameters to the 19 COCO joints if you modify some code. Then you can use the resulting pose parameters together with your shape parameters to decode a mesh. Unfortunately, as far as I know, there are not many approaches that reconstruct a 3D mesh directly from joints.
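The optimization idea behind that kind of fitting can be illustrated in miniature. This is not SMPLify-X: the linear "joint model" `A` below is a hypothetical stand-in for the real SMPL kinematics plus camera projection, and the loop is plain gradient descent on the squared joint error:

```python
import numpy as np

rng = np.random.default_rng(0)
J, P = 19, 72                                # 19 COCO joints, 72 SMPL pose parameters
A = rng.standard_normal((J * 2, P)) * 0.1    # stand-in joint regressor (hypothetical)
target = rng.standard_normal(J * 2)          # detected 2D joints, flattened

theta = np.zeros(P)                          # pose parameters to optimize
lr = 0.5
for _ in range(1000):
    residual = A @ theta - target            # model joints minus target joints
    grad = A.T @ residual                    # gradient of 0.5 * ||residual||^2
    theta -= lr * grad

err = np.linalg.norm(A @ theta - target)
print(f"final joint error: {err:.4f}")
```

The real SMPLify-X objective adds pose and shape priors and a robust reprojection loss, but the structure is the same: iteratively adjust parameters until the model's joints match the detected ones.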

3) Since Pose2Mesh does not use any image as input, the poor result may be coming from the 2D pose. In my observation, the distance between the hip joints matters, and your input 2D pose shows a very narrow hip distance. It's a limitation of Pose2Mesh.

jerrinbright commented 1 year ago

Thank you so much for your reply!

Just some followups...

1.) Just to confirm: MeshNet takes the concatenated 2D and 3D pose as input, processes it with spectral graph convolutions, and outputs the 3D mesh?

2.) In regards to training, you used ground-truth meshes generated with SMPLify-X by fitting SMPL parameters, right? And those parameters are the ones in the SMPL parameter JSON file for COCO (mentioned in the README), right?

3.) Also, is there any particular reason for not using the shape parameters? I ask because I think they would help the model generalize the shape of the mesh better.
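Regarding the spectral graph convolution in (1), here is a minimal sketch of a Chebyshev-style spectral graph convolution of the kind such a MeshNet would stack. The 4-node chain graph and all shapes are toy assumptions for illustration, not the actual mesh graph or layer sizes:

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized graph Laplacian L = I - D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.where(deg > 0, deg ** -0.5, 0.0)
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def cheb_conv(x, adj, weights):
    """x: (N, F_in) node features; weights: (K, F_in, F_out) Chebyshev coefficients.
    Computes sum_k T_k(L_scaled) @ x @ W_k via the Chebyshev recurrence,
    with L rescaled assuming lambda_max ~ 2 for the normalized Laplacian."""
    L_scaled = normalized_laplacian(adj) - np.eye(len(adj))
    Tx_prev, Tx = x, L_scaled @ x
    out = Tx_prev @ weights[0] + Tx @ weights[1]
    for k in range(2, len(weights)):
        Tx_prev, Tx = Tx, 2 * L_scaled @ Tx - Tx_prev  # T_k = 2*L*T_{k-1} - T_{k-2}
        out += Tx @ weights[k]
    return out

# Toy 4-node chain graph; 5-dim features (e.g. 2D + 3D pose concatenated per node).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5))
W = rng.standard_normal((3, 5, 8)) * 0.1   # K=3 Chebyshev order, F_in=5, F_out=8
out = cheb_conv(x, adj, W)
print(out.shape)                           # (4, 8)
```

The K-th order filter aggregates information from each node's K-hop neighborhood, which is why stacking such layers over a skeleton or mesh graph can propagate pose information across the body.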

Let me know when you find the time. Thank you!