ActiveVisionLab / nope-nerf

(CVPR 2023) NoPe-NeRF: Optimising Neural Radiance Field with No Pose Prior
https://nope-nerf.active.vision/
MIT License

questions about the intrinsic matrix (camera mat) and the coordinate transformation? #14

Open MagicTZ opened 1 year ago

MagicTZ commented 1 year ago

Hi, thanks for your work.

While running the code and processing data, I noticed that the intrinsic matrix self.K differs from the usual form of an intrinsic matrix. What is the purpose of writing it this way?

        if customized_focal:
            focal_gt = np.load(os.path.join(load_dir, 'intrinsics.npz'))['K'].astype(np.float32)
            if resize_factor is None:
                resize_factor = 1
            fx = focal_gt[0, 0] / resize_factor
            fy = focal_gt[1, 1] / resize_factor
        else:
            fx, fy = focal, focal
        fx = fx / focal_crop_factor
        fy = fy / focal_crop_factor

        _, _, h, w = imgs.shape
        self.H, self.W, self.focal = h, w, fx

        self.K = np.array([[2*fx/w, 0, 0, 0], 
            [0, -2*fy/h, 0, 0],
            [0, 0, -1, 0],
            [0, 0, 0, 1]]).astype(np.float32)
        ids = np.arange(imgs.shape[0])

I also have a question about coordinate systems: is it necessary to normalize pixel coordinates to the range (-1, 1) when using this intrinsic matrix?

For example, if I want to compute the relative pose between two images, I'm not sure which approach is correct. The first is to normalize the extracted image feature points to (-1, 1) and then use the intrinsic matrix self.K above to solve for the Essential Matrix and obtain R, T. The second is to use pixel coordinates directly with the usual intrinsic matrix (below). Does the second approach miss a transformation or introduce a scale issue?

 usual_K = np.array([[fx, 0, w//2, 0], 
            [0, fy, h//2, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]])

bianwenjing commented 1 year ago

Hi, you can write the intrinsic matrix in different ways depending on the range of pixel sampling. The intrinsic matrix we use maps points to (-1, 1). If you want to map points to (0, W) and (0, H), you can write the matrix as np.array([[fx, 0, w//2, 0], [0, -fy, h//2, 0], [0, 0, -1, 0], [0, 0, 0, 1]]).
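
To make the relationship between the two ranges concrete, here is a minimal sketch of projecting a camera-space point with the (-1, 1) matrix quoted in the question and then rescaling the result to pixel units. It is not taken from the repository. The helper names (ndc_intrinsics, project_ndc, ndc_to_pixel, pixel_to_ndc) and the numeric values are hypothetical. It assumes the camera looks along -z with y pointing up, which is what the -1 entry and the negated fy in the matrix suggest, and that the principal point sits at the image centre.

```python
import numpy as np

def ndc_intrinsics(fx, fy, w, h):
    """4x4 matrix in the same form as self.K in the question (float64 here)."""
    return np.array([[2 * fx / w, 0, 0, 0],
                     [0, -2 * fy / h, 0, 0],
                     [0, 0, -1, 0],
                     [0, 0, 0, 1]], dtype=np.float64)

def project_ndc(K, p_cam):
    """Project a camera-space point to (-1, 1) image coordinates.

    Assumption: the point is in front of the camera, i.e. p_cam[2] < 0.
    """
    x, y, z, _ = K @ np.append(p_cam, 1.0)
    return np.array([x / z, y / z])

def ndc_to_pixel(uv, w, h):
    """Rescale (-1, 1) coordinates to (0, W) and (0, H)."""
    return (uv + 1.0) * 0.5 * np.array([w, h], dtype=np.float64)

def pixel_to_ndc(px, w, h):
    """Inverse of ndc_to_pixel: rescale pixel coordinates to (-1, 1)."""
    return px / np.array([w, h], dtype=np.float64) * 2.0 - 1.0

# Hypothetical values chosen only for illustration.
fx = fy = 500.0
w, h = 640, 480
K = ndc_intrinsics(fx, fy, w, h)

p_cam = np.array([0.2, 0.1, -2.0])  # right of centre, above centre, 2 units ahead
uv = project_ndc(K, p_cam)          # coordinates in (-1, 1)
px = ndc_to_pixel(uv, w, h)         # same point in pixel units

# Cross-check against the textbook pinhole formula after flipping y and z
# to a z-forward, y-down convention.
u_ref = fx * p_cam[0] / -p_cam[2] + w / 2
v_ref = fy * -p_cam[1] / -p_cam[2] + h / 2
print(uv, px, (u_ref, v_ref))
```

Under these assumptions the two representations differ only by a fixed affine rescaling of the image coordinates, so feature points should be expressed in whichever range matches the intrinsic matrix being used with them.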