hongsukchoi / Pose2Mesh_RELEASE

Official Pytorch implementation of "Pose2Mesh: Graph Convolutional Network for 3D Human Pose and Mesh Recovery from a 2D Human Pose", ECCV 2020
MIT License

Visualization result on 3DPW dataset #43

Open Cakin-Kwong opened 2 years ago

Cakin-Kwong commented 2 years ago

Following #32, I'm trying to visualize the ground truth and the test results on 3DPW using your render_mesh function. It seems that even the ground truth mesh does not fit the image well. I wonder if this is normal, since the ground truth is obtained with SMPLify-X. Or should I follow the pipeline in demo/run.py and let project_net learn the camera parameters?

        # get camera parameters
        project_net = models.project_net.get_model(crop_size=virtual_crop_size).cuda()
        joint_input = coco_joint_img
        out = optimize_cam_param(project_net, joint_input, crop_size=virtual_crop_size)

        # vis mesh
        color = colorsys.hsv_to_rgb(np.random.rand(), 0.5, 1.0)
        orig_img = render(out, orig_height, orig_width, orig_img, mesh_model.face, color)
        cv2.imwrite(output_path + f'{img_name[:-4]}_mesh_{idx}.png', orig_img)
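
For context, the optimize_cam_param step in demo/run.py essentially fits a virtual camera so that the projected mesh/joints line up with the 2D input pose. Below is only a rough sketch of that idea as a weak-perspective fit (scale plus 2D translation); the function name, argument names, and optimizer settings are illustrative, not the repo's actual code.

import torch

def fit_weak_perspective_cam(joints_3d, joints_2d, iters=100, lr=1.0):
    """joints_3d: (J, 3) root-relative 3D joints; joints_2d: (J, 2) pixel coordinates."""
    scale = torch.ones(1, requires_grad=True)
    trans = torch.zeros(2, requires_grad=True)
    optim = torch.optim.Adam([scale, trans], lr=lr)
    for _ in range(iters):
        proj = scale * joints_3d[:, :2] + trans      # orthographic projection + pixel shift
        loss = torch.nn.functional.l1_loss(proj, joints_2d)
        optim.zero_grad()
        loss.backward()
        optim.step()
    return scale.detach(), trans.detach()
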
import numpy as np
import trimesh
import pyrender

def render_mesh(img, mesh, face, cam_param):
    # mesh
    mesh = trimesh.Trimesh(mesh, face)
    # rotate 180 deg about the x-axis: OpenCV camera coordinates (y down, z forward)
    # to the OpenGL convention that pyrender expects (y up, z backward)
    rot = trimesh.transformations.rotation_matrix(np.radians(180), [1, 0, 0])
    mesh.apply_transform(rot)
    material = pyrender.MetallicRoughnessMaterial(metallicFactor=0.0, alphaMode='OPAQUE', baseColorFactor=(1.0, 1.0, 0.9, 1.0))
    mesh = pyrender.Mesh.from_trimesh(mesh, material=material, smooth=False)
    scene = pyrender.Scene(ambient_light=(0.3, 0.3, 0.3))
    scene.add(mesh, 'mesh')

    focal, princpt = cam_param['focal'], cam_param['princpt']
    camera = pyrender.IntrinsicsCamera(fx=focal[0], fy=focal[1], cx=princpt[0], cy=princpt[1])
    scene.add(camera)

    # renderer
    renderer = pyrender.OffscreenRenderer(viewport_width=img.shape[1], viewport_height=img.shape[0], point_size=1.0)

    # light
    light = pyrender.DirectionalLight(color=[1.0, 1.0, 1.0], intensity=0.8)
    light_pose = np.eye(4)
    light_pose[:3, 3] = np.array([0, -1, 1])
    scene.add(light, pose=light_pose)
    light_pose[:3, 3] = np.array([0, 1, 1])
    scene.add(light, pose=light_pose)
    light_pose[:3, 3] = np.array([1, 1, 2])
    scene.add(light, pose=light_pose)

    # render
    rgb, depth = renderer.render(scene, flags=pyrender.RenderFlags.RGBA)
    rgb = rgb[:,:,:3].astype(np.float32)
    valid_mask = (depth > 0)[:,:,None]

    # save to image
    img = rgb * valid_mask + img * (1-valid_mask)
    return img
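
For the ground-truth renders, the camera parameters can come directly from the 3DPW annotations instead of project_net. A hypothetical call is shown below; the pickle field name 'cam_intrinsics' and the variables seq_data, gt_vertices (V x 3 SMPL vertices in camera coordinates, meters), and smpl_faces are placeholders rather than the actual loading code.

import cv2
import numpy as np

img = cv2.imread('image_00001.jpg').astype(np.float32)
K = seq_data['cam_intrinsics']                      # 3x3 intrinsic matrix of the sequence
cam_param = {'focal': (K[0, 0], K[1, 1]),           # fx, fy
             'princpt': (K[0, 2], K[1, 2])}         # cx, cy
out = render_mesh(img, gt_vertices, smpl_faces, cam_param)
cv2.imwrite('image_00001_gt_mesh.png', out)
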

image_00001 ground truth result

image_00001 test result

hongsukchoi commented 2 years ago

It seems like a translation error. GT meshes sometimes do have subtle pose errors, but most of them are well aligned, especially for the translation.

Could you share your full visualization code?
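
A quick way to check whether it really is a translation error is to project the GT 3D joints with the intrinsics and look at the mean pixel offset against the GT 2D joints. A minimal sketch, assuming the 3D joints are in camera coordinates and the 2D joints are in pixels of the same frame (gt_joints_3d, gt_joints_2d, and cam_param are placeholders):

import numpy as np

def project_points(pts_3d, focal, princpt):
    """Perspective projection of (N, 3) camera-space points to pixel coordinates."""
    x = pts_3d[:, 0] / pts_3d[:, 2] * focal[0] + princpt[0]
    y = pts_3d[:, 1] / pts_3d[:, 2] * focal[1] + princpt[1]
    return np.stack([x, y], axis=1)

proj_2d = project_points(gt_joints_3d, cam_param['focal'], cam_param['princpt'])
offset = np.linalg.norm(proj_2d - gt_joints_2d, axis=1).mean()
print(f'mean reprojection offset: {offset:.1f} px')  # a large offset points to a translation error
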

Cakin-Kwong commented 2 years ago

> It seems like a translation error. GT meshes sometimes do have subtle pose errors, but most of them are well aligned, especially for the translation.
>
> Could you share your full visualization code?

Thanks for your reply. I visualized all of the test GT meshes. Only some of them (mostly from downtown_arguing_00) have translation errors; the others are well aligned.