_img: (x, y) in pixel space, z in root-relative discretized depth space
_cam: (x, y, z) in root-relative 3D coordinates
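As a rough illustration of how the two spaces relate (a minimal sketch, not the exact code in this repo; the function names, depth range, and number of bins below are assumptions):

```python
import numpy as np

def cam2img(joint_cam_abs, root_depth, focal, princpt,
            depth_half_range=1.0, depth_bins=64):
    # joint_cam_abs: (J, 3) joints in absolute camera coordinates (meters)
    # focal = (fx, fy), princpt = (cx, cy): camera intrinsics
    # x, y: pinhole projection into pixel space
    x = joint_cam_abs[:, 0] / joint_cam_abs[:, 2] * focal[0] + princpt[0]
    y = joint_cam_abs[:, 1] / joint_cam_abs[:, 2] * focal[1] + princpt[1]
    # z: depth relative to the root joint, discretized into depth_bins bins
    z_rel = joint_cam_abs[:, 2] - root_depth
    z = (z_rel / depth_half_range + 1.) / 2. * (depth_bins - 1)
    return np.stack([x, y, z], axis=1)   # "_img"-style coordinates

def cam_rootrel(joint_cam_abs, root_idx=0):
    # "_cam"-style coordinates: simply subtract the root joint
    return joint_cam_abs - joint_cam_abs[root_idx, None]
```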
Thank you for the response, I understand it now. You also mention this in your paper, in Section 3.3 (Final 3D human pose and mesh). However, I am still finding it difficult to understand why the lixel stage can't be directly supervised with _cam, i.e. (x, y, z) in root-relative 3D coordinates. I'd be grateful if you could help me understand.
Also, I noticed that for the param stage, root-relative 3D coordinates have already been computed by the following:

```python
root_joint_cam = joint_coord_cam[:, self.root_joint_idx, None, :]
mesh_coord_cam = mesh_coord_cam - root_joint_cam
joint_coord_cam = joint_coord_cam - root_joint_cam
```
However, in the evaluate function, the following has been done again:

```python
pose_coord_out_h36m = pose_coord_out_h36m - pose_coord_out_h36m[self.h36m_root_joint_idx, None]  # root-relative
```
Isn't this redundant? If not, could you please tell me why it has been done?
Q1. Why can't the lixel stage be directly supervised with _cam?
A1. My network predicts lixel-based 1D heatmaps in a fully convolutional way. The 1D heatmaps of the x- and y-axis are defined in image space, so their supervision targets must also be in image (pixel) space, i.e. _img rather than _cam.
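To make A1 concrete, here is a minimal sketch (assumed shapes and names, not the repository's exact code) of a soft-argmax over a 1D heatmap defined along an image axis; its output is a position in pixel/bin space, which is why the targets have to be _img:

```python
import torch
import torch.nn.functional as F

def soft_argmax_1d(heatmap1d):
    # heatmap1d: (batch, num_joints, length) scores along one image axis
    prob = F.softmax(heatmap1d, dim=2)                      # per-joint distribution
    coords = torch.arange(heatmap1d.shape[2], dtype=torch.float32,
                          device=heatmap1d.device)
    return (prob * coords).sum(dim=2)                       # expected bin position

# The output is a position along the image axis (a pixel/bin index), so the
# loss has to compare it with _img targets, not with _cam coordinates.
pred_x = soft_argmax_1d(torch.randn(2, 29, 64))             # values in [0, 63]
```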
Q2. Different joint set during the evaluation.
A2. I followed the evaluation protocols of previous works, which use the H3.6M joint set for the evaluation, so the root-relative subtraction is applied again to those joints.
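In other words (a hedged sketch; joint_regressor_h36m and the shapes below are assumptions, not the repository's exact variables), the evaluation joints are regressed from the output mesh with an H3.6M joint regressor and only then made root-relative, which is why the subtraction in evaluate is not redundant:

```python
import numpy as np

def mpjpe_h36m(mesh_out_cam, mesh_gt_cam, joint_regressor_h36m, root_idx=0):
    # mesh_*_cam: (V, 3) mesh vertices in root-relative camera space
    # joint_regressor_h36m: (17, V) regressor from vertices to H3.6M joints
    joint_out = joint_regressor_h36m @ mesh_out_cam
    joint_gt = joint_regressor_h36m @ mesh_gt_cam
    # The regressed H3.6M joints are new arrays with their own root, so they
    # are made root-relative here, independently of the earlier subtraction.
    joint_out = joint_out - joint_out[root_idx, None]
    joint_gt = joint_gt - joint_gt[root_idx, None]
    return np.sqrt(((joint_out - joint_gt) ** 2).sum(1)).mean()
```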
Ah, alright, got it. Thank you so much for the answers!
Hi,
Thank you for the good work. I have a small doubt regarding the ground-truth annotations: what is the difference between targets['orig_joint_img'] and targets['orig_joint_cam']? You use targets['orig_joint_cam'] for supervision during the parameter regression stage of the model. Why can't targets['orig_joint_img'] be used instead? Looking forward to your response.
Thanks in advance