nianticlabs / simplerecon

[ECCV 2022] SimpleRecon: 3D Reconstruction Without 3D Convolutions

pointcloud fusion produces mess #9

Closed jaymefosa closed 2 years ago

jaymefosa commented 2 years ago

Running on the provided house scene with the provided pc_fusion flags doesn't produce the expected results:

[image]

mohammed-amr commented 2 years ago

This looks like bad intrinsics being loaded, since the code was tested and works on ScanNetv2. I'll take a look at it.

mohammed-amr commented 2 years ago

Hello hello, this is indeed an intrinsics problem.

Our pc_fusion.py script (incorrectly) assumes that intrinsics under K_full_depth_b44 are for 640 by 480 resolution. This is fine for ScanNetv2 but not a correct assumption for anything else. I've changed this to read and scale intrinsics correctly, and pushed a fix.
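For readers following along, here is a minimal sketch of what "scaling intrinsics" means here. It is not the code from the fix: `scale_intrinsics` is a hypothetical helper, the matrix values are made up for illustration, and it assumes a plain pinhole model with no half-pixel offset correction. Plain lists are used to keep it dependency-free; the same arithmetic applies to numpy or torch tensors.

```python
def scale_intrinsics(K, src_size, dst_size):
    """Rescale a pinhole intrinsics matrix (3x3 or 4x4, nested lists)
    from src_size (w, h) to dst_size (w, h).

    Hypothetical helper for illustration, not the repository's code.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / src_w  # horizontal scale factor
    sy = dst_h / src_h  # vertical scale factor

    K_scaled = [list(row) for row in K]  # copy so the input is untouched
    K_scaled[0] = [v * sx for v in K_scaled[0]]  # row 0 holds fx, skew, cx
    K_scaled[1] = [v * sy for v in K_scaled[1]]  # row 1 holds fy, cy
    return K_scaled


# Illustrative values only: intrinsics for a 720x540 image.
K_native = [
    [600.0, 0.0, 360.0, 0.0],
    [0.0, 600.0, 270.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

K_640 = scale_intrinsics(K_native, (720, 540), (640, 480))
print(K_640[0][2], K_640[1][2])  # principal point lands near (320, 240)
```

If intrinsics stored for one resolution are used as if they were for another, the back-projected points end up in the wrong place, which is consistent with the scrambled fusion result reported above.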

You should get an output that looks like this with dense frames:

[image: snapshot02]

and like this with default DVMVS keyframes:

[image: snapshot03]

I'd recommend the default keyframes to save time.

I'll let you close the issue as complete if satisfied.

jaymefosa commented 2 years ago

@mohammed-amr this looks fantastic! Thanks so much for taking a look at this

mohammed-amr commented 2 years ago

Thank you for bug hunting!

jaymefosa commented 2 years ago

Another hopefully quick question: the fix uses 640x480 (w, h), but the original size of the VDR data seems to be 720x540. Does the model receive a flag somewhere to change the depth prediction output size?

mrharicot commented 2 years ago

The model was trained with 640x480 images. We either resize them offline with ImageMagick as part of preprocessing, or resize them on the fly in the dataloader: https://github.com/nianticlabs/simplerecon/blob/2e380aa88f01e13e5d97e3a6515b2ed590182e0b/utils/generic_utils.py#L204

mohammed-amr commented 2 years ago

Indeed, the VDR dataset class will try to load a color image at that size if one has been precached:

    def get_color_filepath(self, scan_id, frame_id):
        """ returns the filepath for a frame's color file at the dataset's 
            configured RGB resolution.

            Args: 
                scan_id: the scan this file belongs to.
                frame_id: id for the frame.

            Returns:
                Either the filepath for a precached RGB file at the size 
                required, or if that doesn't exist, the full size RGB frame 
                from the dataset.

        """
        scene_path = os.path.join(self.dataset_path,
                                self.get_sub_folder_dir(self.split), scan_id)

        cached_resized_path = os.path.join(scene_path, 
                                    f"frame.{self.image_width}_{frame_id}.jpg")

        # check if we have cached resized images on disk first
        if os.path.exists(cached_resized_path):
            return cached_resized_path

        # otherwise fall back to the full-size image from the dataset
        return os.path.join(scene_path, 
                        f"frame_{frame_id}.jpg")

The parent class will resize if the image isn't the right size:

    def load_color(self, scan_id, frame_id):
        """ Loads a frame's RGB file, resizes it to configured RGB size.

            Args: 
                scan_id: the scan this file belongs to.
                frame_id: id for the frame.

            Returns:
                image: tensor of the resized RGB image at self.image_height and
                self.image_width resolution.

        """

        color_filepath = self.get_color_filepath(scan_id, frame_id)
        image = read_image_file(
                            color_filepath, 
                            height=self.image_height, width=self.image_width,
                            resampling_mode=self.image_resampling_mode,
                            disable_warning=self.disable_resize_warning,
                        )

        return image
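
To make the cache-then-fallback lookup easy to try outside the dataset class, here is a small standalone sketch. `resolve_color_path` is a hypothetical free function that mirrors the filename patterns in `get_color_filepath` above; the frame ids and the temporary directory are made up for the demo.

```python
import os
import tempfile


def resolve_color_path(scene_path, frame_id, image_width):
    """Return the precached resized frame if it exists, else the full-size one.

    Hypothetical standalone version of the lookup shown above.
    """
    cached_resized_path = os.path.join(
        scene_path, f"frame.{image_width}_{frame_id}.jpg"
    )
    if os.path.exists(cached_resized_path):
        return cached_resized_path
    return os.path.join(scene_path, f"frame_{frame_id}.jpg")


with tempfile.TemporaryDirectory() as scene:
    # Frame 000001 only has the full-size image on disk.
    open(os.path.join(scene, "frame_000001.jpg"), "wb").close()
    # Frame 000002 has both the full-size and a precached 640-wide image.
    open(os.path.join(scene, "frame_000002.jpg"), "wb").close()
    open(os.path.join(scene, "frame.640_000002.jpg"), "wb").close()

    fallback = os.path.basename(resolve_color_path(scene, "000001", 640))
    cached = os.path.basename(resolve_color_path(scene, "000002", 640))

print(fallback)  # frame_000001.jpg
print(cached)    # frame.640_000002.jpg
```

In the fallback case the returned file is still at its original size, so the resize in `load_color` is what brings it to the configured resolution.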