Closed: EAST-J closed this issue 1 year ago
Hi! Thanks for the feedback.

annots.npy. Other coordinates (including the camera T stored in extri.yml) in the dataset are all in meters.

We use the pose parameter to transform a T-pose SMPL to pose space (without any global translation or rotation), whereas in SMPL the global rotation parameters are stored as the first row of the pose parameter. We store the global rotation R and translation T separately from pose.
This is because we use EasyMocap for mocap and what I just described is their convention (or you can say ours since we're from the same lab).
So, to use the vanilla smpl_layer(pose, shape, th) formulation, you would need to convert our R into the first row of pose. Or simply transform the result of the vanilla smpl_layer(pose, shape, th) with our R to match the results in the vertices folder.

Thank you for your immediate reply.
Regarding "you would need to convert our R to the first row of pose": does this mean I only need to concatenate pose and Rh, like
new_pose = np.concatenate((pose, Rh), axis=-1)
or should I do something else to change the pose params?
R comes in as a 3x3 rotation matrix. So you need to first convert the matrix notation to angle-axis (Rh); maybe consider cv2.Rodrigues. The first row of our pose is filled with zeros, so you need to fill the first row with Rh instead of concatenating: pose[0] = Rh
Hi, sorry to bother you again. I am a little confused about the appearance code $l_i$. As stated in the paper, the latent code is used to encode the state of the human appearance in frame $i$. Inspired by DeepSDF, this embedding can be optimized during training in the auto-decoder fashion. But I wonder how $l_i$ works at inference time?
For the original ICCV paper, we used the appearance code of the closest pose to the novel pose being rendered, which is sometimes simplified to the appearance code of the last training frame when rendering continuously (i.e. when we train on the first 60 frames and render the 60-120th frames).
In the extended version, we replace the appearance code with pose vector (trivially extensible to novel poses).
Regarding "trivially extensible to novel poses": but when we train on 60 frames with nn.Embedding(60, 128) and then render the next 30 frames, how should we set the latent index?
@EAST-J We only use the pose vector as the latent embedding in the extended version. By nn.Embedding I believe you are referring to the original implementation. The use of these latent embeddings (in the original implementation) is explained in my previous comment.

Sorry for the confusion between "the original paper" and "the original implementation". This repo contains implementations for both the original paper and the extended version.
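Putting the earlier comment into code: when rendering continuously past the training frames, the "closest pose" rule simplifies to reusing the last training frame's code. A minimal sketch (the names num_train_frames, get_latent_index, and the 128-dim code size are illustrative assumptions, not the repo's actual API):

```python
import torch
import torch.nn as nn

num_train_frames = 60  # e.g. trained on the first 60 frames
appearance_codes = nn.Embedding(num_train_frames, 128)

def get_latent_index(frame_idx: int) -> int:
    # For frames beyond training, fall back to the last training
    # frame's appearance code when rendering continuously.
    return min(frame_idx, num_train_frames - 1)

idx = torch.tensor([get_latent_index(75)])  # frame 75 maps to index 59
l_i = appearance_codes(idx)                 # latent code of shape (1, 128)
```

For non-continuous novel poses, the index of the training frame with the closest pose would be used instead of the simple clamp.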
Hi, thanks for your great work! I have some questions about the dataset: if I run smpl_layer(pose, shape, th), will I get the same result as in the vertices folder?
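Following the maintainer's answer, matching the vertices folder requires applying the stored global R and T to the vanilla smpl_layer output. A sketch under placeholder values (the verts array, R, and T below are hypothetical stand-ins, not real dataset contents):

```python
import numpy as np

# Placeholder for the output of smpl_layer(pose, shape, th): SMPL vertices
# in the local (un-rotated, un-translated) frame.
verts = np.random.rand(6890, 3)

R = np.eye(3)                   # placeholder 3x3 global rotation from the dataset
T = np.array([0.0, 0.0, 1.0])  # placeholder global translation (meters)

# Transform into world space; this should then match the vertices folder.
verts_world = verts @ R.T + T
```

Applying R on the right as R.T is the row-vector form of rotating each vertex by R.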