dcharatan / pixelsplat

[CVPR 2024 Oral, Best Paper Runner-Up] Code for "pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction" by David Charatan, Sizhe Lester Li, Andrea Tagliasacchi, and Vincent Sitzmann
http://davidcharatan.com/pixelsplat/
MIT License

Epipolar error on my own dataset #68

Closed Langwenchong closed 5 months ago

Langwenchong commented 5 months ago

[Image: epipolar line visualization]

Hello, I am trying to use your pre-trained model to reconstruct the n3dv dataset. Following the camera parameter instructions in your README.md, I converted the dataset's camera extrinsics and intrinsics into the format your project requires. The camera parameters in the n3dv dataset are stored in a format similar to LLFF: [image]

So I extracted the 3x4 camera extrinsic matrix and appended a row [0, 0, 0, 1] at the bottom. For the intrinsics, I used the normalized format:

if os.path.exists(camera_file):
    poses_bounds = np.load(camera_file)

    pose = np.array(poses_bounds[3]).astype(np.float32)

    near = pose[15]
    far = pose[16]
    data_matrix = np.zeros((3, 5))
    pose_slice = pose[:15]
    # print(pose_slice)
    for i, val in enumerate(pose_slice):
        row = i//5
        col = i % 5
        data_matrix[row, col] = val
    # print(data_matrix)

    R = data_matrix[:3, :3]
    t = data_matrix[:3, 3]

    extrinsics = np.zeros((4, 4), dtype=np.float32)
    extrinsics[:3, :3] = R
    extrinsics[:3, 3] = t
    extrinsics[3, 3] = 1.0

    print("Camera Extrinsics Matrix:")
    print(extrinsics)

    height = data_matrix[0, 4]
    width = data_matrix[1, 4]
    focal = data_matrix[2, 4]

    intrinsics = np.array([
        [focal/width, 0, 0.5],
        [0, focal/height, 0.5],
        [0, 0, 1]
    ], dtype=np.float32)
    # print(intrinsics)

    camera_params = {
        'extrinsics': extrinsics,
        'intrinsics': intrinsics,
        'near': near,
        'far': far
    }
    print(camera_params)
    np.savez(os.path.join(output_folder, 'camera_params.npz'), **camera_params)

However, the sampling positions after the encoder appear to be wrong. Could you suggest some strategies for troubleshooting this issue? Thanks!

dcharatan commented 5 months ago

Based on your epipolar line visualizations, I suspect that the poses aren't being converted correctly. You can adapt the code here to convert LLFF data to pixelSplat's format. Once the poses have been fixed, any highlighted point in the left image should be under the corresponding line in the right image.
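
For illustration only, here is a minimal sketch of what such a conversion might look like. It is not the linked code. It assumes the n3dv poses follow the standard LLFF convention, where each `poses_bounds.npy` row holds a 3x5 camera-to-world block whose rotation columns are ordered [down, right, backwards], and that the target format is an OpenCV-style camera-to-world matrix (+X right, +Y down, +Z forward) with normalized intrinsics. The function name `llff_row_to_opencv` is hypothetical.

```python
import numpy as np


def llff_row_to_opencv(row):
    """Convert one 17-value LLFF poses_bounds row to an OpenCV-style pose.

    Returns (extrinsics, intrinsics, near, far), where extrinsics is a 4x4
    camera-to-world matrix and intrinsics is a normalized 3x3 matrix.
    Assumption: the row follows the standard LLFF layout.
    """
    row = np.asarray(row, dtype=np.float64)
    pose = row[:15].reshape(3, 5)  # [R | t | (height, width, focal)]
    near, far = float(row[15]), float(row[16])

    # LLFF stores the rotation columns as [down, right, backwards].
    down, right, backwards = pose[:, 0], pose[:, 1], pose[:, 2]

    # OpenCV-style columns are [right, down, forward], so swap the first
    # two columns and negate the third. The translation is unchanged.
    extrinsics = np.eye(4)
    extrinsics[:3, 0] = right
    extrinsics[:3, 1] = down
    extrinsics[:3, 2] = -backwards
    extrinsics[:3, 3] = pose[:, 3]

    # Normalize the focal length by image width (x) and height (y).
    height, width, focal = pose[:, 4]
    intrinsics = np.array([
        [focal / width, 0.0, 0.5],
        [0.0, focal / height, 0.5],
        [0.0, 0.0, 1.0],
    ])
    return extrinsics.astype(np.float32), intrinsics.astype(np.float32), near, far
```

Compared with the snippet in the question, the only change to the extrinsics is the column reordering and sign flip. Copying the LLFF rotation block unchanged leaves the camera axes in LLFF's order, which would be consistent with epipolar lines that miss their highlighted points. If the dataset deviates from the LLFF convention, the axis mapping above would need to be adjusted accordingly.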

Langwenchong commented 5 months ago

Thank you so much for your help!