alicevision / Meshroom

3D Reconstruction Software
http://alicevision.org

Camera rotation and translation in Meshroom-2023 #2332

Closed yuliaar closed 2 weeks ago

yuliaar commented 8 months ago

The photogrammetry pipeline outputs the camera rotations and translations recovered from the input images, either in cameras.sfm or as their inverses in the _KRt.txt files. Previously I successfully used camera transforms recovered by Meshroom-2021 for 2D rendering with pyrender and similar OpenGL-based renderers; it only required an additional 180 degree rotation around X applied to the Meshroom rotation matrix. Reading up on the changes in Meshroom-2023, it looked like this version would align its coordinate frame with the OpenGL convention, so I expected that the only difference would be that the rotation around X no longer needs to be added during rendering. However, that is not the case at all, and I cannot figure out how to apply the camera transform produced by Meshroom 2023 to the mesh produced by Meshroom 2023 to achieve a correct 2D rendering.

In this issue, which is unrelated to my question, someone says:

"I think the reference frame convention has been changed between two releases, so if you used different versions, it could explain it. Maybe @servantftechnicolor has some better insight."

This is exactly what I would like to know: how have the camera transform matrices changed, so that I can amend my rendering code accordingly?

The render function below works correctly with the texturedMesh.obj and _KRt.txt files produced by Meshroom 2021, but fails with the files produced by Meshroom 2023 (it also fails without the additional rotation Rx).

import numpy as np
import pyrender

def render(pmesh, f, cx, cy, rotmat, tvec):
    # 180 degree rotation around X, to go from the vision convention
    # (Y down, Z forward) to the OpenGL convention (Y up, Z backward)
    Rx = np.array([[1,  0,  0],
                   [0, -1,  0],
                   [0,  0, -1]])

    # build the homogeneous world-to-camera matrix from R and t
    real_pose = np.eye(4, dtype=float)
    real_pose[:3, :3] = Rx @ rotmat
    real_pose[:3, 3] = Rx @ tvec

    # invert to get the camera-to-world pose expected by pyrender
    real_pose = np.linalg.inv(real_pose)

    # set up the camera and a point light co-located with it
    real_cam = pyrender.IntrinsicsCamera(f, f, cx, cy)
    light = pyrender.PointLight(color=np.ones(3), intensity=10.0)
    scene = pyrender.Scene()
    scene.add(pmesh)
    scene.add(real_cam, pose=real_pose)
    scene.add(light, pose=real_pose)

    # render; the viewport size assumes the principal point is at the image center
    r = pyrender.OffscreenRenderer(viewport_width=cx*2, viewport_height=cy*2)
    color, depth = r.render(scene)

    return color
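
For reference, a minimal usage sketch (the mesh loading step and the numeric values are hypothetical placeholders; in practice f, cx, cy, rotmat, and tvec come from the _KRt.txt file):

    import trimesh

    tm = trimesh.load('texturedMesh.obj', force='mesh')  # load the Meshroom mesh
    pmesh = pyrender.Mesh.from_trimesh(tm)
    R = np.eye(3)      # placeholder rotation from _KRt.txt
    t = np.zeros(3)    # placeholder translation from _KRt.txt
    color = render(pmesh, f=1200.0, cx=960.0, cy=540.0, rotmat=R, tvec=t)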

yuliaar commented 1 month ago

It would be great to get any reply to this!

servantftechnicolor commented 1 month ago

Hello.

.sfm contains, historically, for each pose, the rotation avworld_R_avcamera and the position avworld_t_avcamera.

The associated homogeneous matrix avcamera_T_avworld is

    avcamera_T_avworld = [ avworld_R_avcamera.transpose()   -avworld_R_avcamera.transpose() * avworld_t_avcamera ]
                         [               0                                          1                            ]

or, writing avcamera_R_avworld = avworld_R_avcamera.transpose(),

    avcamera_T_avworld = [ avcamera_R_avworld   -avcamera_R_avworld * avworld_t_avcamera ]
                         [          0                               1                    ]
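
A small numpy sketch of that construction (variable names follow the notation above; this is my reading of the convention, not code taken from AliceVision):

    import numpy as np

    def make_avcamera_T_avworld(avworld_R_avcamera, avworld_t_avcamera):
        # world-to-camera homogeneous matrix from the pose stored in .sfm
        R = avworld_R_avcamera.T          # avcamera_R_avworld
        T = np.eye(4)
        T[:3, :3] = R
        T[:3, 3] = -R @ avworld_t_avcamera
        return T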

But this uses the classical computer vision coordinate system, which is X to the right, Z to the back, Y to the bottom.

The OpenGL convention is X to the right, Z to the front, Y to the top, so the change of basis is a 180 degree rotation around the X axis, or, in matrix form:

    gl_T_av = [ 1  0  0  0 ]
              [ 0 -1  0  0 ]
              [ 0  0 -1  0 ]
              [ 0  0  0  1 ]
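
In numpy form (a quick sketch; note that this matrix is its own inverse):

    import numpy as np

    gl_T_av = np.diag([1.0, -1.0, -1.0, 1.0])
    assert np.allclose(gl_T_av @ gl_T_av, np.eye(4))  # its own inverse: av_T_gl == gl_T_av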

Now, the difference between 2021 and 2023 is that in 2021 the .obj was exported in the vision coordinate system, while in 2023 the mesh is exported in the graphics coordinate system.

So you now have to multiply the transformation on both sides:

    GLcamera_T_GLworld = gl_T_av * avcamera_T_avworld * av_T_gl
                       = gl_T_av * avcamera_T_avworld * gl_T_av

(av_T_gl equals gl_T_av, since that matrix is its own inverse).
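
Or, continuing the numpy sketch from above:

    avcamera_T_avworld = make_avcamera_T_avworld(avworld_R_avcamera, avworld_t_avcamera)
    GLcamera_T_GLworld = gl_T_av @ avcamera_T_avworld @ gl_T_av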

TL;DR: comparing with your code, I think the difference is that you have to post-multiply your rotation by Rx as well (Rx @ rotmat @ Rx).
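
Applied to the pose construction in the render function above, that fix would look like this (a sketch for Meshroom 2023 outputs; only the rotation line changes, since conjugating the full 4x4 by gl_T_av leaves the translation column at Rx @ tvec):

    # Meshroom 2023: the mesh is already in the OpenGL frame, so conjugate R by Rx
    real_pose = np.eye(4, dtype=float)
    real_pose[:3, :3] = Rx @ rotmat @ Rx
    real_pose[:3, 3] = Rx @ tvec
    real_pose = np.linalg.inv(real_pose)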

yuliaar commented 2 weeks ago

Thanks, that has solved it.