Closed Shubhendu-Jena closed 3 years ago
The x- and y-axes of the positional pose in Pose2Pose are defined in image space. Therefore, the positional pose can be used to extract a feature vector from the feature map.
Thank you for the prompt response. However, there are a few things that are still unclear to me. For the x- and y-axes of the positional pose in Pose2Pose to be defined in image space, they have to be supervised by ground truth that is also in image space. In the annotations you've provided with I2L-MeshNet, is targets['fit_joint_img'] aligned with the image space, and could I therefore obtain aggregated joint features in the same way, using bilinear interpolation? Hope the above statements make sense.
Yes, targets['fit_joint_img'] is aligned with the image space. You can visualize it for debugging purposes.
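For context, "aggregated joint features via bilinear interpolation" amounts to sampling the feature map at each joint's continuous (x, y) location. Below is a minimal numpy sketch of that sampling step; the function name is made up for illustration, and the actual Pose2Pose code would do this batched on the GPU (e.g. with torch's grid_sample):

```python
import numpy as np

def bilinear_sample(feat_map, x, y):
    """Bilinearly interpolate a (C, H, W) feature map at a continuous (x, y).

    x, y are in pixel coordinates of the feature map; out-of-range
    coordinates are clamped to the border.
    """
    C, H, W = feat_map.shape
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    # interpolate along x on the top and bottom rows, then along y
    top = (1 - wx) * feat_map[:, y0, x0] + wx * feat_map[:, y0, x1]
    bot = (1 - wx) * feat_map[:, y1, x0] + wx * feat_map[:, y1, x1]
    return (1 - wy) * top + wy * bot

# toy check: the feature value equals the x coordinate everywhere,
# so sampling at x = 3.5 should return 3.5 in every channel
fm = np.tile(np.arange(8, dtype=np.float64), (2, 8, 1))  # (C=2, H=8, W=8)
vec = bilinear_sample(fm, 3.5, 2.0)
print(vec)  # [3.5 3.5]
```

Because targets['fit_joint_img'] is aligned with the image space, the same sampling applies to ground-truth joints after rescaling them from heatmap resolution to feature-map resolution.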
Thank you for the response. I tried to shade the image area corresponding to targets['fit_mesh_img'] and got this: For some reason, the ground-truth mesh seems to be inverted. I'd be grateful if you could help me understand why this is the case.
There is no visualization problem on my side. Could you provide your mesh visualization code?
I didn't do mesh visualization as such. I just took the x and y coordinates from targets['fit_mesh_img'] and set the corresponding locations in the augmented input image to 0. The saved image then looks like the one above. The purpose of this was to verify that the x and y coordinates of targets['fit_mesh_img'] are aligned with the augmented image.
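A sanity check along these lines can be sketched as follows. The helper name is hypothetical, and it assumes the vertex x/y have already been rescaled from heatmap resolution to input-image pixels:

```python
import numpy as np

def mark_vertices(img, verts_xy):
    """Blacken the pixel nearest to each projected mesh vertex.

    img:      (H, W, 3) uint8 image; a modified copy is returned
    verts_xy: (N, 2) vertex x/y already scaled to input-image pixels
    """
    out = img.copy()
    H, W = img.shape[:2]
    # round to nearest pixel and clamp to the image bounds
    xs = np.clip(np.round(verts_xy[:, 0]).astype(int), 0, W - 1)
    ys = np.clip(np.round(verts_xy[:, 1]).astype(int), 0, H - 1)
    out[ys, xs] = 0  # note: rows are indexed by y, columns by x
    return out
```

If the mesh appears inverted with this kind of check, a common culprit is swapping the x/y indexing order (image arrays are indexed [row, column], i.e. [y, x]).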
Please see here. The current code overlays fit_mesh_img onto the image correctly. Please see demo.py.
Hi. I tried projecting the mesh the way it is done in demo.py. I get the following results:
Is there still a problem or is the projection meant to be slightly off?
Please run the demo code and check that the visualized mesh is correct. If the demo code runs correctly, then your mesh projection or some other function is probably not working properly.
Yes, I figured out the mistake. Apologies for so many questions, and thank you for the prompt responses. Closing the issue now.
@Shubhendu-Jena can you please share your code? I am trying to visualize the ground truth on the images, but the result seems to be very off.
```python
mesh_lixel_img = targets['fit_mesh_img'][0].cpu().numpy()  # 6890 x 3

# restore mesh_lixel_img to original image space and continuous depth space
mesh_lixel_img[:, 0] = mesh_lixel_img[:, 0] / cfg.output_hm_shape[2] * cfg.input_img_shape[1]
mesh_lixel_img[:, 1] = mesh_lixel_img[:, 1] / cfg.output_hm_shape[1] * cfg.input_img_shape[0]
mesh_lixel_img[:, 2] = (mesh_lixel_img[:, 2] / cfg.output_hm_shape[0] * 2. - 1) * (cfg.bbox_3d_size / 2)

raw_img = (255 * img.copy())[..., ::-1]
mesh_img = vis_mesh(raw_img.copy(), mesh_lixel_img)
```
EDIT: This works for full-body humans but not for crops (zoomed-in images). I guess this is by design.
Hi again,
Thanks for your work. I wanted to project the mesh onto the image. However, since the ground-truth meshes (targets['fit_mesh_cam']) are root-relative, I noticed that in demo.py you use the root depth obtained from RootNet to make sure the projected mesh aligns with the image. In your other work, Pose2Pose, for positional pose-guided pooling you obtain the joint features using bilinear interpolation on the image feature map. To do this, the predicted joints have to be aligned with the image. How did you manage this? I am essentially trying to project the ground-truth meshes onto the image without using the root depth from RootNet. Is this possible?
Thanks in advance
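For reference, a root-relative mesh cannot be perspectively projected on its own, because the global depth offset is missing; demo.py fills it in with RootNet's root depth. A minimal sketch of that projection step, assuming a known root position and camera intrinsics (all names here are illustrative, not the repo's actual API):

```python
import numpy as np

def perspective_project(mesh_cam_rel, root_xyz, focal, princpt):
    """Project a root-relative mesh into the image plane.

    mesh_cam_rel: (N, 3) root-relative vertex coordinates in camera space
    root_xyz:     (3,) root joint position in camera space
                  (e.g. from RootNet or a ground-truth fit)
    focal:        (fx, fy) focal lengths in pixels
    princpt:      (cx, cy) principal point in pixels
    """
    mesh_cam = mesh_cam_rel + root_xyz  # restore absolute camera coordinates
    # standard pinhole projection: u = f * X / Z + c
    x = mesh_cam[:, 0] / mesh_cam[:, 2] * focal[0] + princpt[0]
    y = mesh_cam[:, 1] / mesh_cam[:, 2] * focal[1] + princpt[1]
    return np.stack([x, y], axis=1)

# toy check: a vertex at the root itself projects to the principal point
verts = np.zeros((1, 3))
uv = perspective_project(verts, np.array([0., 0., 3000.]), (1500, 1500), (128, 128))
print(uv)  # [[128. 128.]]
```

Without some absolute root depth (from RootNet, a ground-truth fit, or a dataset annotation), the scale of the projected mesh in the image is undetermined.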