mmatl / pyrender

Easy-to-use glTF 2.0-compliant OpenGL renderer for visualization of 3D scenes.
http://pyrender.readthedocs.io/
MIT License

Converting camera matrices from OpenCV / Pytorch3D #228

Open benjiebob opened 2 years ago

benjiebob commented 2 years ago

Hi there, excellent library. I've been banging my head against the wall for a few days now with this problem so thought it might be a good time to beg for help! :)

[Image: pyrender_error]

I have a camera_matrix and run OpenCV's cv2.solvePnP(points3d, points2d, camera_matrix, distCoeffs=None) to obtain extrinsics R, t. Using similar code to this, I can render my SMPL mesh vertices and it looks fine (see Fig A)
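
A note for readers following along: cv2.solvePnP returns a success flag, a rotation vector (rvec) and a translation vector (tvec), not a rotation matrix, so R has to be recovered first (normally with cv2.Rodrigues). Below is a minimal NumPy-only sketch of that step and of packing R, t into a 4x4 world-to-camera matrix. The helper names rodrigues_to_matrix and extrinsics_to_matrix are made up for illustration and are not part of OpenCV or pyrender.

```python
import numpy as np


def rodrigues_to_matrix(rvec):
    """Convert an axis-angle rotation vector to a 3x3 rotation matrix.

    Hypothetical helper that mirrors what cv2.Rodrigues does for a 3-vector.
    """
    rvec = np.asarray(rvec, dtype=np.float64).reshape(3)
    theta = np.linalg.norm(rvec)  # rotation angle in radians
    if theta < 1e-12:
        return np.eye(3)  # no rotation
    k = rvec / theta  # unit rotation axis
    # Skew-symmetric cross-product matrix of the axis.
    K = np.array([
        [0.0, -k[2], k[1]],
        [k[2], 0.0, -k[0]],
        [-k[1], k[0], 0.0],
    ])
    # Rodrigues' rotation formula.
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def extrinsics_to_matrix(rvec, tvec):
    """Pack (rvec, tvec) into a 4x4 world-to-camera matrix (OpenCV convention)."""
    E = np.eye(4)
    E[:3, :3] = rodrigues_to_matrix(rvec)
    E[:3, 3] = np.asarray(tvec, dtype=np.float64).reshape(3)
    return E
```

With this, a world point X maps to OpenCV camera coordinates as E[:3, :3] @ X + E[:3, 3].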

For debugging purposes, I've loaded these parameters into Pytorch3D using their cameras_from_opencv_projection(R, t, camera_matrix, image_size) method and am able to correctly render the mesh (see Fig B).

Now is where the fun begins...

I've been trying for a few days to figure out how to render this mesh correctly using Pyrender. I've constructed a rendering function (similar to this one):

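import numpy as np
import pyrender
import trimesh


def render(image, vertices, faces, camera_pose, camera_matrix):
    # image: float RGB background in [0, 1], shape (H, W, 3).
    material = pyrender.MetallicRoughnessMaterial(
        metallicFactor=0.2, alphaMode="OPAQUE", baseColorFactor=(0.8, 0.3, 0.3, 1.0)
    )

    # process=False keeps the vertex order and count unchanged.
    mesh = trimesh.Trimesh(vertices, faces, process=False)
    mesh = pyrender.Mesh.from_trimesh(mesh, material=material)

    scene = pyrender.Scene(ambient_light=(0.5, 0.5, 0.5))
    scene.add(mesh, "mesh")

    # Pinhole intrinsics taken from the 3x3 OpenCV camera matrix.
    fx, fy = camera_matrix[0, 0], camera_matrix[1, 1]
    cx, cy = camera_matrix[0, 2], camera_matrix[1, 2]
    camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)
    scene.add(camera, pose=camera_pose)

    # There is some lighting stuff here that I'll omit for space.

    # Assumption: an offscreen renderer sized to the background image.
    renderer = pyrender.OffscreenRenderer(
        viewport_width=image.shape[1], viewport_height=image.shape[0]
    )
    color, rend_depth = renderer.render(scene, flags=pyrender.RenderFlags.RGBA)
    renderer.delete()

    # Composite the rendered mesh over the input image wherever depth is valid.
    color = color.astype(np.float32) / 255.0
    valid_mask = (rend_depth > 0)[:, :, None].astype(np.float32)
    output_img = color[:, :, :3] * valid_mask + (1 - valid_mask) * image

    return output_img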

Based on this answer, I've tried:

As you can see, the Pyrender mesh still doesn't line up with the others.

From here, I've tried a whole bunch of things, including fiddling with the principal point and starting with the PyTorch3D matrix and rotating it 180° about the X axis (as suggested here), but nothing has worked.

I'd be super grateful if someone could help me solve this. In case it's helpful, see the following descriptions for the camera setup in each of the libraries:

Thanks! Ben

AndersonDaniel commented 1 year ago

A bit late to the party, but what worked for me was negating rows 1 and 2 of camera_pose (the Y and Z rows) and then inverting the whole matrix. This does transpose R like you tried, but it also corrects t for the new R.

In short -

camera_pose[[1, 2]] *= -1
camera_pose = np.linalg.inv(camera_pose)
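
The two lines above can be wrapped into a small helper, sketched here under the assumption that camera_pose starts out as the 4x4 OpenCV world-to-camera matrix built from R and t. OpenCV's camera looks down +Z with +Y pointing down, while pyrender follows the OpenGL convention (camera looks down -Z, +Y up) and expects a camera-to-world pose. The function name opencv_to_pyrender_pose is made up for illustration and is not a pyrender API.

```python
import numpy as np


def opencv_to_pyrender_pose(R, t):
    """Sketch: OpenCV world-to-camera extrinsics (R, t) -> pyrender camera pose."""
    R = np.asarray(R, dtype=np.float64).reshape(3, 3)
    t = np.asarray(t, dtype=np.float64).reshape(3)

    # 4x4 world-to-camera matrix in the OpenCV convention.
    world_to_cam = np.eye(4)
    world_to_cam[:3, :3] = R
    world_to_cam[:3, 3] = t

    # Negate the Y and Z rows so camera-space coordinates follow OpenGL axes.
    world_to_cam[[1, 2]] *= -1

    # pyrender wants the camera's pose in the world, i.e. camera-to-world.
    return np.linalg.inv(world_to_cam)


# Sanity check with made-up extrinsics: a 30 degree rotation about Y.
angle = np.deg2rad(30.0)
R = np.array([
    [np.cos(angle), 0.0, np.sin(angle)],
    [0.0, 1.0, 0.0],
    [-np.sin(angle), 0.0, np.cos(angle)],
])
t = np.array([0.1, -0.2, 2.0])
pose = opencv_to_pyrender_pose(R, t)

point_world = np.array([0.3, 0.4, 0.5])
p_cv = R @ point_world + t  # OpenCV camera coordinates
p_gl = (np.linalg.inv(pose) @ np.append(point_world, 1.0))[:3]  # OpenGL camera coordinates
assert np.allclose(p_gl, p_cv * np.array([1.0, -1.0, -1.0]))
```

The check confirms that a world point lands at the same place in both camera frames, up to the sign flip on Y and Z. The resulting pose is what would be passed as pose= when adding the camera to the scene.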

oneThousand1000 commented 1 year ago

Hi! I also spent a few days trying to get the correct pyrender camera pose. I think the problem is caused by pyrender's y and z axes being flipped relative to OpenCV, so you need to flip the translation and the rotation on the y and z axes. I posted an example here: https://github.com/mmatl/pyrender/issues/249

conallwang commented 1 year ago

A bit late to the party, but what worked for me was negating rows 1 and 2 of camera_pose (the Y and Z rows) and then inverting the whole matrix. This does transpose R like you tried, but it also corrects t for the new R.

In short -

camera_pose[[1, 2]] *= -1
camera_pose = np.linalg.inv(camera_pose)

Thanks! This solution also works for me.

Chris10M commented 4 months ago

Hi,

The transformation works. Can I please know the intuition behind the inversion of the camera_pose?

camera_pose[[1, 2]] *= -1
camera_pose = np.linalg.inv(camera_pose)

AndersonDaniel commented 4 months ago

Hey @Chris10M, I found this by a bit of trial and error, but the general intuition is this: OP mentioned transposing R because of pre- vs post-multiplication of the transformation matrix, and I felt that t ought to be adjusted to match the transposed rotation matrix.
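
One way to make that intuition concrete, as a sketch: assume camera_pose starts out as the 4x4 OpenCV world-to-camera matrix E built from R and t, and write the row negation as a left-multiplication by F. The symbols E, F, D and P are introduced here for illustration and do not appear elsewhere in the thread.

```latex
\begin{align*}
E &= \begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix}, \qquad
F = \operatorname{diag}(1, -1, -1, 1), \qquad
D = \operatorname{diag}(1, -1, -1) \\[4pt]
P &= (F E)^{-1} = E^{-1} F^{-1} = E^{-1} F
   = \begin{bmatrix} R^\top & -R^\top t \\ 0^\top & 1 \end{bmatrix} F
   = \begin{bmatrix} R^\top D & -R^\top t \\ 0^\top & 1 \end{bmatrix}
\end{align*}
```

Read this way, the rotation block of the final pose P is R transposed with its Y and Z columns negated (the OpenCV to OpenGL axis flip, which is the same as a 180° rotation about X), and the translation block is the camera centre in world coordinates, -Rᵀt. Transposing R alone leaves t expressed in the camera frame, which would explain why the mesh ends up misplaced.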

hua-zi commented 4 months ago

[Image]

pyrender Creating Cameras

My guess is this: