nianticlabs / ace

[CVPR 2023 - Highlight] Accelerated Coordinate Encoding (ACE): Learning to Relocalize in Minutes using RGB and Poses
https://nianticlabs.github.io/ace

About _convert_cv_to_gl(pose) #31

Closed CherryXChen closed 9 months ago

CherryXChen commented 9 months ago

Hi, I wonder why you convert a pose from the OpenCV convention to the OpenGL convention using this matrix.

    @staticmethod
    def _convert_cv_to_gl(pose):
        """
        Convert a pose from OpenCV to OpenGL convention (and vice versa).

        @param pose: 4x4 camera pose.
        @return: 4x4 camera pose.
        """
        gl_to_cv = np.array([[1, -1, -1, 1], [-1, 1, 1, -1], [-1, 1, 1, -1], [1, 1, 1, 1]])
        return gl_to_cv * pose

In other places, you just apply

        scene_coordinates[:, 1] = -scene_coordinates[:, 1]
        scene_coordinates[:, 2] = -scene_coordinates[:, 2]

Thanks!

ebrach commented 9 months ago

Hi,

the first function converts poses between the conventions, whereas the second code snippet converts 3D points between conventions.

Best, Eric

CherryXChen commented 9 months ago

Thank you! But I am still confused, since NeRFStudio converts poses from OpenCV to OpenGL in the same style as your scene coordinate transformation. Why do you multiply by a matrix like that? Is there any reference I could learn from? :)

            c2w[0:3, 1:3] *= -1
            if self.config.assume_colmap_world_coordinate_convention:
                # world coordinate transform: map colmap gravity guess (-y) to nerfstudio convention (+z)
                c2w = c2w[np.array([0, 2, 1, 3]), :]
                c2w[2, :] *= -1

@ebrach @tcavallari

ebrach commented 9 months ago

While the information you seek is available online, it can be difficult to find, so I attach a brief explanation here.

I think the procedure for points is relatively clear: To convert from OpenGL to OpenCV convention we flip the y and z axes. Now assume you have points in the OpenCV convention, but get a pose in OpenGL convention. This is what we do:

1) Convert points from OpenCV convention to OpenGL convention (flipping axes)
2) Apply the OpenGL pose
3) Convert points back from OpenGL convention to OpenCV convention (flipping axes again)
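The three steps above can be sketched in NumPy as follows. This is only an illustration, not code from the ACE repository: the helper name `apply_gl_pose_to_cv_points` and the constant `FLIP_YZ` are hypothetical, and the pose is assumed to be a 4x4 rigid transform with points stored as an Nx3 array.

```python
import numpy as np

# Flipping y and z converts a 3D point between the OpenCV and OpenGL conventions.
FLIP_YZ = np.array([1.0, -1.0, -1.0])


def apply_gl_pose_to_cv_points(pose_gl, points_cv):
    """Apply a 4x4 OpenGL-convention pose to Nx3 OpenCV-convention points."""
    # 1) OpenCV -> OpenGL: flip the y and z axes of the points.
    points_gl = points_cv * FLIP_YZ
    # 2) Apply the OpenGL pose (rotation, then translation).
    rotation, translation = pose_gl[:3, :3], pose_gl[:3, 3]
    transformed_gl = points_gl @ rotation.T + translation
    # 3) OpenGL -> OpenCV: flip the y and z axes again.
    return transformed_gl * FLIP_YZ
```

The result stays in the OpenCV convention throughout, so the caller never has to handle OpenGL-convention points directly.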

Instead of applying the flipping of axes to the points, it can be directly applied to the pose. This is what our code does. Note that axes have to be flipped twice. Firstly, columns 2&3 have to be flipped. Secondly, rows 2&3 have to be flipped. This is what the conversion matrix in our code does. Note that the conversion matrix is applied via element-wise product, not matrix product.
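As a rough check of this equivalence, the sketch below compares the element-wise sign mask with the same flip written as a matrix product, `FLIP @ pose @ FLIP`. The names `SIGN_MASK`, `FLIP`, `convert_elementwise`, `convert_matrix_product`, and the example pose are hypothetical and chosen for illustration. The two forms differ only in the bottom row of the mask, which does not matter under the assumption that the pose is a rigid transform with bottom row `[0, 0, 0, 1]`.

```python
import numpy as np

# Sign mask as in the snippet quoted at the top of this thread.
SIGN_MASK = np.array([[1, -1, -1, 1], [-1, 1, 1, -1], [-1, 1, 1, -1], [1, 1, 1, 1]])
# The same axis flip written as a matrix: negate y and z.
FLIP = np.diag([1.0, -1.0, -1.0, 1.0])


def convert_elementwise(pose):
    """Flip rows 2&3 and columns 2&3 via an element-wise product."""
    return SIGN_MASK * pose


def convert_matrix_product(pose):
    """Same conversion as a matrix product: FLIP on the left flips rows, on the right flips columns."""
    return FLIP @ pose @ FLIP


# Hypothetical example pose: rotation about z plus a translation.
angle = 0.4
example_pose = np.array([
    [np.cos(angle), -np.sin(angle), 0.0, 0.5],
    [np.sin(angle), np.cos(angle), 0.0, -1.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 0.0, 1.0],
])
poses_agree = np.allclose(
    convert_elementwise(example_pose), convert_matrix_product(example_pose)
)
```

Because every entry of the mask is +1 or -1, applying the conversion twice returns the original pose, which is why one function can serve both directions.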

Hope that helps! Eric