nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0
9.52k stars 1.3k forks source link

Bugreport of data parser #1039

Closed thekaka closed 1 year ago

thekaka commented 1 year ago

https://github.com/nerfstudio-project/nerfstudio/blob/a578ac8d1520c79ff0aff7c0e7d51d25feae2afa/nerfstudio/data/dataparsers/phototourism_dataparser.py#L99

image

"_id" in line 99 referred to the camera_id according to the load function of line 84. In line 99, the key of "imgs" is image_id. Is it a bug that use camera_id to query dict with key of image_id?

jake-austin commented 1 year ago

If I recall correctly, the keys should be the same for both, has this line caused an error for you?

thekaka commented 1 year ago

Yes. It is probably correct for colmap default process. However, I changed colmap pipeline, and an error happens.

I suggest iterate images instead of cameras, using image id to link the camera. I did it, and it works out.

jake-austin commented 1 year ago

@thekaka Can you give me the command you're running and the full stack trace so I can try and replicate this? I seemed to recall during testing that they had the same keys, I guess I remembered incorrectly then :/

jake-austin commented 1 year ago

Yes. It is probably correct for colmap default process. However, I changed colmap pipeline, and an error happens.

I suggest iterate images instead of cameras, using image id to link the camera. I did it, and it works out.

I can probably put up a quick PR for this, but any chance you can tell me what you've changed? Like did you change something in the colmap utilities file added for parsing the phototourism datasets?

thekaka commented 1 year ago

Yes. It is probably correct for colmap default process. However, I changed colmap pipeline, and an error happens. I suggest iterate images instead of cameras, using image id to link the camera. I did it, and it works out.

I can probably put up a quick PR for this, but any chance you can tell me what you've changed? Like did you change something in the colmap utilities file added for parsing the phototourism datasets?

Sure, here is the modification, please review~

        with CONSOLE.status(f"[bold green]Reading phototourism images and poses for {split} split...") as _:
            cams = read_cameras_binary(self.data / "dense/sparse/cameras.bin")
            imgs = read_images_binary(self.data / "dense/sparse/images.bin")

        poses = []
        fxs = []
        fys = []
        cxs = []
        cys = []
        image_filenames = []

        flip = torch.eye(3)
        flip[0, 0] = -1.0
        flip = flip.double()

        for image_id, img in imgs.items():  #change here
            if image_id not in select_imgs: #change here
                continue

            cam = cams[img.camera_id] #change here
            assert cam.model == "PINHOLE", "Only pinhole (perspective) camera model is supported at the moment"

            pose = torch.cat([torch.tensor(img.qvec2rotmat()), torch.tensor(img.tvec.reshape(3, 1))], dim=1)
            pose = torch.cat([pose, torch.tensor([[0.0, 0.0, 0.0, 1.0]])], dim=0)
            poses.append(torch.linalg.inv(pose))
            fxs.append(torch.tensor(cam.params[0]))
            fys.append(torch.tensor(cam.params[1]))
            cxs.append(torch.tensor(cam.params[2]))
            cys.append(torch.tensor(cam.params[3]))

            image_filenames.append(self.data / "dense/images" / img.name)
thekaka commented 1 year ago

There is an assumption that image to camera is one-to-one relation, but actually it is many to one, so I suggested that iterating images instead of cameras.

I tried the original codes with my own dataset, and it turns out the loaded images is less than real size.

jake-austin commented 1 year ago

Yeah that 100% makes sense that there will possibly be more images than cameras bc there's no guarantee that sfm can get a good camera pose for each image, but doesn't that mean it's a problem if we get our _id's from a set of images, since then when we iterate over our _id's, we might very well get a key error if no camera exists for that image's _id?

thekaka commented 1 year ago

cam = cams[img.camera_id] This line means getting the camera with img's attributes of camera_id instead of using image_id directly.

tancik commented 1 year ago

Closing as this looks stale. Feel free to reopen if needed.