WenjiaWang0312 / Zolly

[ICCV2023 oral] Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction

Datasets annotation description #5

Closed pixelite1201 closed 6 months ago

pixelite1201 commented 6 months ago

Hello again,

Thanks for releasing annotations for the different datasets. I am trying to project the SMPL model from the HuMMan dataset onto the image using the SMPL parameters and camera parameters provided in the train.npz file, but the model doesn't align. Do you have a brief description of what the different parameters in the .npz files correspond to? That would be really helpful.

Thanks

WenjiaWang0312 commented 6 months ago

Hi there, I think I have arranged all the datasets in the same format. You can project the SMPL joints or vertices onto the image using the camera intrinsic matrix K; just remember to add the transl to the body. Can you show your demo code? Or I can provide a demo in the coming days.
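
Roughly, the projection I mean looks like this (a minimal sketch; `verts` here are SMPL outputs without transl applied, and the helper name is just for illustration):

    import numpy as np

    def project_to_image(verts, transl, K):
        # verts:  (N, 3) SMPL vertices/joints, body frame (no transl applied yet)
        # transl: (3,)   SMPL transl, already expressed in camera space
        # K:      (3, 3) camera intrinsic matrix
        pts = np.asarray(verts) + np.asarray(transl)[None]   # move the body into camera space
        uvw = pts @ np.asarray(K).T                           # apply the intrinsics
        return uvw[:, :2] / uvw[:, 2:3]                       # perspective divide -> pixel coords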

pixelite1201 commented 6 months ago
    import os
    import numpy as np
    import torch

    # data = np.load('train.npz', allow_pickle=True); i indexes one sample.
    # base_img_dir / img_name and the helpers get_smpl_vertices / visualize
    # (SMPL forward pass + mesh overlay) come from the rest of my script.
    img_width, img_height = data['image_size'][i]
    image_path = os.path.join(base_img_dir, img_name)
    CAM_INT = data['cam_params'].item()['K'][i]             # 3x3 intrinsics
    global_orient = data['smpl'].item()['global_orient'][i]
    body_pose = data['smpl'].item()['body_pose'][i]
    pose = np.concatenate((global_orient, body_pose), -1)   # full axis-angle pose
    beta = data['smpl'].item()['betas'][i]
    body_transl = data['smpl'].item()['transl'][i]
    cam_t = data['cam_params'].item()['T'][i]               # extrinsics (read but not used for projection)
    cam_rotmat = data['cam_params'].item()['R'][i]
    vertices3d, joints3d_ = get_smpl_vertices(pose, beta, body_transl, 'neutral', openpose=True)
    visualize(image_path, torch.tensor(vertices3d), CAM_INT[0][0], smpl_model_neutral.faces)

This is how I read the data. I assumed vertices3d is in camera space and can be directly projected using the intrinsic matrix. I also experimented with cam_t, but with no success. The result I get for the first image in train.npz looks like this. It seems there is some problem with the translation. Let me know if I misunderstood something.

Screenshot 2024-02-22 172252

FYI, the same code works with MTP and PDHuman, although with PDHuman I found that the alignment is wrong in certain cases, like the following image. Also, I didn't find any camera rotation matrix in the PDHuman train.npz file.

Screenshot 2024-02-22 172845

WenjiaWang0312 commented 6 months ago

Hi there. I have already transformed the 6DoF pose of the human body, which means the global_orient and transl parameters are already in camera space, so you don't need the camera extrinsics. The first image, the one you can overlay correctly, is captured by camera 0, and in the HuMMan dataset that camera's center is exactly the origin of the world coordinate system. So you should just drop R and T.
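
To be concrete, with the variables from your snippet (vertices3d already includes transl, since it was passed to the SMPL forward pass), the overlay should only need K, roughly like this:

    import numpy as np

    # vertices3d is already in this camera's frame (global_orient/transl are
    # expressed per view), so project with the intrinsics only.
    uvw = np.asarray(vertices3d) @ np.asarray(CAM_INT).T
    uv = uvw[:, :2] / uvw[:, 2:3]     # pixel coordinates
    # Do NOT additionally apply cam_rotmat / cam_t here; that would transform
    # the body twice.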

PDHuman is not perfectly annotated, since the human meshes come from RenderPeople and we had to retarget them to the SMPL kinematic structure. We will release a better dataset based on Synbody soon.

pixelite1201 commented 6 months ago

Though I read the camera extrinsics from the .npz file, I didn't use them for projection. The result I showed comes from getting the vertices from the SMPL forward pass and then projecting them using the camera intrinsics.

WenjiaWang0312 commented 6 months ago

Sorry, I will check it and reply to you before the end of Saturday.

pixelite1201 commented 6 months ago

Just to help with the debugging: I found that the actual img_width and img_height are 640x360, but in the intrinsic matrix cx and cy are 959 and 553.

pixelite1201 commented 6 months ago

Hello, did you figure out the problem?

WenjiaWang0312 commented 6 months ago

That's weird, since I visualized all the images and didn't find any mis-overlay. Are you using my provided annotations? I printed all the cx and cy values and they are all close to half the image width/height, which looks correct.

[screenshot of the printed cx/cy values]

I think you are using the original annotations. I have actually resized all the images and the annotations accordingly.
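
For reference, the intrinsics scale with the image when it is resized; a minimal sketch (the helper is illustrative, and it assumes the original HuMMan resolution is 1920x1080, which would explain cx ~959 and cy ~553 in the un-resized annotation):

    import numpy as np

    def resize_intrinsics(K, orig_wh, new_wh):
        # Rescale a 3x3 intrinsic matrix for an image resized from orig_wh to new_wh.
        sx, sy = new_wh[0] / orig_wh[0], new_wh[1] / orig_wh[1]
        K = np.asarray(K, dtype=float).copy()
        K[0, 0] *= sx   # fx
        K[0, 2] *= sx   # cx
        K[1, 1] *= sy   # fy
        K[1, 2] *= sy   # cy
        return K

    # resize_intrinsics(K, (1920, 1080), (640, 360)) maps cx ~959 -> ~320 and
    # cy ~553 -> ~184, i.e. close to half of the resized width/height.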

WenjiaWang0312 commented 6 months ago

[two overlay screenshots]

Different views can be correctly overlaid.

pixelite1201 commented 6 months ago

Oh, I was using data['cam_params'].item()['K'][i]. It seems the values in data['K'] are different from those. It might be better to document the different annotations provided in the file. Thanks for helping with the issue.
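
For anyone hitting the same confusion, a quick way to check which intrinsics belong to the resized images is to inspect the file directly (just a sketch; which of the two K entries actually matches the resized images is the part that needs confirming):

    import numpy as np

    data = np.load('train.npz', allow_pickle=True)
    print(data.files)                           # top-level annotation keys
    print(data['image_size'][0])                # resized width/height, e.g. 640x360
    print(data['K'][0])                         # intrinsics; cx/cy should be ~half of the above?
    print(data['cam_params'].item()['K'][0])    # the per-camera K I was using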