Open dmetehan opened 3 years ago
Good question! 1. About the 3D-to-2D projection: it is presented here. The detailed function is here. To get the projected 2D coordinates, just use 'pj2d' here. They are normalized (-1~1) coordinates on the input image (not the original image).
Don't hesitate to let me know your questions. Best.
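Since 'pj2d' is normalized to (-1~1) on the input image, mapping it back to pixel coordinates is a simple linear rescale. A minimal sketch under that assumption; the helper name and the square 512x512 input size are mine, not the repo's:

```python
import numpy as np

def normed_to_pixel(pj2d, img_h, img_w):
    # hypothetical helper: map normalized (-1~1) joints to pixel coordinates
    # on the (padded) input image of size img_h x img_w
    pj2d = np.asarray(pj2d, dtype=np.float64)
    px = (pj2d[..., 0] + 1.0) / 2.0 * img_w
    py = (pj2d[..., 1] + 1.0) / 2.0 * img_h
    return np.stack([px, py], axis=-1)

# (0, 0) is the image center; (-1, -1) the top-left; (1, 1) the bottom-right
pts = normed_to_pixel(np.array([[0.0, 0.0], [-1.0, -1.0], [1.0, 1.0]]), 512, 512)
```

Note this maps onto the padded input image; going to the original image would additionally require undoing the crop/pad.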
Thank you very much for the detailed info. Is there a way to map j3d_smpl24 onto the image space?
Of course. It is very simple. Add a line after this line,
pj2d_smpl24 = proj.batch_orth_proj(j3d_smpl24, params_dict['cam'], mode='2d')[:,:,:2]
'pj2d_smpl24' is what you want. Please add it to the 'outputs' dict for further use.
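For reference, batch_orth_proj with mode='2d' is (as I understand it) a weak-perspective projection: scale the x/y of each 3D joint and add a 2D translation, then keep only the first two channels. A minimal NumPy sketch, assuming cam = (scale, tx, ty); the function name is mine, and whether translation is applied before or after scaling differs between implementations (this sketch applies it after):

```python
import numpy as np

def orth_proj_2d(j3d, cam):
    # weak-perspective projection sketch: j3d (B, N, 3), cam (B, 3) = (s, tx, ty)
    # returns normalized (B, N, 2) coordinates, like taking [:, :, :2] of the output
    s = cam[:, None, 0:1]   # (B, 1, 1) scale
    t = cam[:, None, 1:3]   # (B, 1, 2) translation
    return j3d[:, :, :2] * s + t

j3d = np.zeros((1, 24, 3))          # 24 SMPL joints, all at the origin
cam = np.array([[1.0, 0.1, -0.2]])  # unit scale, small translation
pj2d = orth_proj_2d(j3d, cam)       # every joint lands at (0.1, -0.2)
```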
@Arthur151 Hi dear Yu Sun. I'm also trying to get SMPL 3D keypoints in image space. I followed your instructions and got 2D data with 45 entries. I used mode='3d', but I'm still getting 2D data. An example is below.
[[0.0744452346641, -0.012984120034], [-0.23053123445656, 0.00345313533], ... up to 45 entries]
I want to ask: how do I get 3D keypoints in image space?
My other question: you mentioned in the reply above that in SMPL space the body center is the pelvis. So what is the body center in image space?
Finally, is there any way to get SPIN keypoints from your repository?
Good question!
- About the 3D-to-2D projection. It is presented here. The detailed function is here. To get the projected 2D coordinates, just use 'pj2d' here. They are normalized (-1~1) coordinates on the input image (not the original image).
- About 'cam'. We adopt a 3-dim camera parameter. We don't directly use the estimated scale value. Instead, we use $1.1^{scale}$ to ensure the scale value is always positive. These camera parameters are used to project the estimated 3D joints or body vertices back to the 2D image via weak-perspective projection.
- About 'pose': these are the 72-dim SMPL pose parameters, i.e. the 3-dim axis-angle rotations of the 24 SMPL joints 'j3d_smpl24'.
- About 'j3d_smpl24' or 'verts'. They are in standard SMPL space, not image space. In standard SMPL space, the body center (near pelvis) is located at the origin.
Don't hesitate to let me know your questions. Best.
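To illustrate the $1.1^{scale}$ trick above: whatever raw value the network predicts, even a negative one, the resulting scale used for projection is strictly positive. A tiny sketch (the raw values are just illustrative network outputs):

```python
# map an unconstrained network output to a strictly positive scale
for raw_scale in (-5.156, 0.0, 5.51):
    s = 1.1 ** raw_scale
    assert s > 0  # holds for any real-valued raw_scale
```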
Thanks for your reply. In point 4 of that reply you mentioned that in standard SMPL space, the body center (near the pelvis) is located at the origin. I was asking about this body center.
Furthermore, I changed keep_dim=True, but I'm still getting the same results as I mentioned above.
Thanks for the clarification. Now I can get x, y, and z data by removing [:,:,:2], but one thing confuses me: there are 45 entries; the array size is (1, 45, 3) for one frame, but I think it should be (1, 24, 3).
To get the back-projected 3D joints in SMPL format, I recommend replacing this line, which is
pj3d = proj.batch_orth_proj(j3d_op25, params_dict['cam'], mode='2d')
with
pj3d = proj.batch_orth_proj( j3d_smpl24, params_dict['cam'], keep_dim=True)
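As I read it, keep_dim=True simply keeps the untouched z coordinate alongside the projected x/y instead of dropping it. A minimal sketch under that assumption (my own re-implementation, not the repo's code):

```python
import numpy as np

def orth_proj(j3d, cam, keep_dim=False):
    # j3d: (B, N, 3); cam: (B, 3) = (scale, tx, ty)
    xy = j3d[:, :, :2] * cam[:, None, 0:1] + cam[:, None, 1:3]
    if keep_dim:
        # keep the (unprojected) depth as a third channel
        return np.concatenate([xy, j3d[:, :, 2:]], axis=-1)
    return xy

j3d = np.ones((1, 24, 3))
cam = np.ones((1, 3))
flat = orth_proj(j3d, cam)                  # shape (1, 24, 2)
full = orth_proj(j3d, cam, keep_dim=True)   # shape (1, 24, 3)
```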
@Arthur151 Thanks for your help. I'm following your suggestions, but I'm still getting (1, 45, 3). In fact, I can slice the array to get the first 24 entries, as you also did here with j3d_smpl24[:,:24].
But my question is: why are we getting 45 entries? I'm unable to figure it out.
These 3D body joints are derived from the estimated SMPL body mesh. Each body mesh contains 6890 vertices, from which we can regress the 24 body joints you need. The regression process is quite simple: because these vertices have stable semantic locations, each joint can easily be regressed from them. For instance, to calculate the elbow keypoint, we average the vertices near the elbow. In this way, theoretically, we can derive any keypoint you want from the body mesh. There are some extra keypoints we regress from the SMPL body mesh, which makes it 45 instead of 24.
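The vertex-to-joint regression described above can be sketched as a sparse matrix multiply: a (n_joints, 6890) weight matrix whose rows average a handful of vertices around each joint. The vertex indices below are random placeholders, not the real SMPL regressor weights:

```python
import numpy as np

n_verts, n_joints = 6890, 24
rng = np.random.default_rng(0)

# hypothetical regressor: each joint is the mean of 8 "nearby" vertices
J_regressor = np.zeros((n_joints, n_verts))
for j in range(n_joints):
    near = rng.choice(n_verts, size=8, replace=False)  # stand-in for real indices
    J_regressor[j, near] = 1.0 / 8

verts = rng.standard_normal((n_verts, 3))  # a dummy body mesh
joints = J_regressor @ verts               # (24, 3) regressed joints
```

A regressor with more rows (e.g. 45) yields the extra keypoints the same way, which is why slicing [:, :24] recovers the standard SMPL joints.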
I would like to visualize the body landmarks projected onto the 2D image; however, I have some problems understanding how the values are stored. When I run the demo script I get a .npz file that I can load with numpy. Inside it I found f->results, where two dictionaries are stored, one for each person in the image. Inside each dictionary I can see 'cam' (3,), 'pose' (72,), 'betas' (10,), 'j3d_smpl24' (24, 3), 'j3d_op25' (25, 3), and 'verts' (6890, 3) (I put their corresponding shapes in parentheses).

From your paper I thought the 'pose' variable would be of length 132 (22 landmarks x 6D). Anyway, I assume you have saved 24 landmarks in 3D, which makes an array of length 72. From the 'cam' variable (tx, ty = cam[1:]) I calculated the center for each person trivially, since those numbers are normalized between -1 and 1, just as you mentioned in your paper. However, when it comes to visualizing the 'pose', 'j3d_smpl24', or 'verts' variables, I ran into some problems. Could you explain how each of these variables stores its data? Apparently they are not normalized between -1 and 1.

Also, I'm having some problems understanding the first number in the 'cam' variable, which should correspond to scale according to the paper. The paper says this variable "reflects the size and depth of the human body to some extent." How can this scale be used to visualize pose points? In my example I get 5.51 for one person and -5.156 for the other. Would you also explain what a negative scale represents?