akanazawa / hmr

Project page for End-to-end Recovery of Human Shape and Pose

Predicted parameters of the weak perspective projection #142

Open longbowzhang opened 4 years ago

longbowzhang commented 4 years ago

Hi @akanazawa, sorry to bother you. I am confused about the predicted parameters of the weak perspective projection.

  1. As you mentioned, the scale s that HMR recovers is essentially focal_length / z, but the following line https://github.com/akanazawa/hmr/blob/bce0ef9b90bd36871d2aff8688b2682170cd365a/src/util/renderer.py#L247 suggests that 0.5 * img_size also comes into play. Why?

  2. This line of code https://github.com/akanazawa/hmr/blob/bce0ef9b90bd36871d2aff8688b2682170cd365a/src/util/renderer.py#L249 suggests that verts and trans, where trans = np.hstack([cam_pos, tz]), are in the same space, but which space is it?

Thus, could you elaborate a little bit on the parameters of this weak perspective projection?

Thanks in advance.

jszgz commented 3 years ago

Hello, do you know how to use mpi_inf_3dhp_to_tfrecords.py to convert the mpi_inf_3dhp dataset? It failed for me because the code expects jpg images as input, but the dataset I downloaded consists of videos. Do I need to use ffmpeg and write code to convert the avi files to jpg?

nnop commented 3 weeks ago

In case someone comes across this issue: for the 1st question, the keypoints are normalized to [-1, 1] during data preprocessing. https://github.com/akanazawa/hmr/blob/f149abeb0a7e2a3412eb68274a94a9232f7cb667/src/data_loader.py#L320-L325 So the predicted s has to be rescaled by 0.5 * img_size to apply to the original image. That makes tz = f / (0.5 * img_size * cam_s). This is a subtle detail.

For the 2nd question, verts and trans are both in the camera frame, which is not consistent with the paper's equation.
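
To make the relation above concrete, here is a minimal NumPy sketch, not the repository's actual code. It assumes the normalization maps pixel coordinates to [-1, 1] as 2 * (x / img_size) - 1, that the principal point is the image center, and that the focal length is a fixed constant (500 here is only an illustrative value). The function names are hypothetical.

```python
import numpy as np


def cam_to_translation(cam, img_size, flength=500.0):
    """Turn a weak-perspective camera [s, tx, ty] into a 3D translation.

    Assumption: s was learned against keypoints normalized to [-1, 1],
    so it is rescaled by 0.5 * img_size before being compared to f / z.
    """
    cam_s, cam_pos = cam[0], cam[1:]
    tz = flength / (0.5 * img_size * cam_s)
    return np.hstack([cam_pos, tz])


def weak_perspective_pixels(points, cam, img_size):
    """Project with x_norm = s * (X + t), then map [-1, 1] back to pixels."""
    cam_s, cam_pos = cam[0], cam[1:]
    xy_norm = cam_s * (points[:, :2] + cam_pos)
    return 0.5 * img_size * (xy_norm + 1.0)


def perspective_pixels(points, trans, img_size, flength=500.0):
    """Shift the points by trans (camera frame), then apply a pinhole projection."""
    shifted = points + trans
    principal_pt = 0.5 * img_size  # assumed: image center
    return flength * shifted[:, :2] / shifted[:, 2:3] + principal_pt


cam = np.array([0.9, 0.1, -0.2])   # illustrative [s, tx, ty]
img_size = 224
points = np.array([[0.3, -0.5, 0.0],
                   [-0.2, 0.4, 0.0]])  # points on the z = 0 plane

trans = cam_to_translation(cam, img_size)
uv_weak = weak_perspective_pixels(points, cam, img_size)
uv_persp = perspective_pixels(points, trans, img_size)
```

For points with z = 0 the two projections give the same pixel coordinates, which is where the 0.5 * img_size factor in tz comes from. For points off that plane they differ slightly, since weak perspective ignores per-point depth.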