autonomousvision / monosdf

[NeurIPS'22] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
MIT License

How to improve image quality of MonoSDF rendering #77

Open small-zeng opened 1 year ago

small-zeng commented 1 year ago

I have a multi-room scenario, using 400 images for reconstruction with MonoSDF. The rendered novel viewpoints only achieve a PSNR of 21. How can I improve this?

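(For reference, PSNR here is presumably the standard peak signal-to-noise ratio between rendered and ground-truth images; a minimal PyTorch sketch for tensors with values in [0, 1]:)

    import torch

    def psnr(pred: torch.Tensor, gt: torch.Tensor) -> float:
        """Peak signal-to-noise ratio in dB, assuming pixel values in [0, 1]."""
        mse = torch.mean((pred - gt) ** 2)
        return (-10.0 * torch.log10(mse)).item()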

small-zeng commented 1 year ago

Do I need to adjust the number of layers in the rendering network or the number of sampling points in the configuration file?

niujinshuchong commented 1 year ago

Hi, the mesh looks reasonable. Did you use per_image_code in your training?

small-zeng commented 1 year ago

> Hi, the mesh looks reasonable. Did you use per_image_code in your training?

Thank you for your response. I disabled per_image_code during training because it seemed ineffective; it appeared to be merely an input derived from the image index. However, my test set is rendered from a separate set of random viewpoints. Would this have an impact on the results? What is the actual role of per_image_code, and how does it function?

small-zeng commented 1 year ago
    # If enabled, look up this image's learned appearance embedding and
    # broadcast it across all sampled points before feeding the MLP.
    if self.per_image_code:
        image_code = self.embeddings[indices].expand(rendering_input.shape[0], -1)
        rendering_input = torch.cat([rendering_input, image_code], dim=-1)

    x = rendering_input

The "image_code" here seems to be just an index input, which would only improve the training views and not enhance the test views.

niujinshuchong commented 1 year ago

Hi, the per-image code was proposed in the NeRF-in-the-Wild paper and can model large appearance variance. It's true that it can't improve the test views, since we don't have a per-image code for the test views.
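(For context, here is a minimal, hypothetical sketch of how such a per-image appearance code is typically wired in, NeRF-W style: one learnable embedding per training image, concatenated to the rendering MLP input. This is an illustration under assumptions, not MonoSDF's exact code; the class name and test-time fallback are invented for the example.)

    import torch
    import torch.nn as nn

    class RenderingNetWithImageCode(nn.Module):
        # Hypothetical rendering head; only the appearance-code plumbing matters here.
        def __init__(self, n_images: int, feat_dim: int, code_dim: int = 32):
            super().__init__()
            # One learnable appearance code per training image (NeRF-W style).
            self.embeddings = nn.Parameter(torch.zeros(n_images, code_dim))
            self.mlp = nn.Sequential(
                nn.Linear(feat_dim + code_dim, 256), nn.ReLU(),
                nn.Linear(256, 3), nn.Sigmoid(),
            )

        def forward(self, rendering_input, indices=None):
            if indices is None:
                # Test view: no per-image code was trained for it, so fall back
                # to e.g. the mean training code (a zero code is another option).
                code = self.embeddings.mean(dim=0, keepdim=True)
            else:
                code = self.embeddings[indices]
            code = code.expand(rendering_input.shape[0], -1)
            return self.mlp(torch.cat([rendering_input, code], dim=-1))

(At evaluation time, NeRF-in-the-Wild instead optimizes a code on part of each held-out image; the mean-code fallback above is just one simple option, which is why the code cannot improve unseen test views out of the box.)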

small-zeng commented 1 year ago

> Hi, the per-image code was proposed in the NeRF-in-the-Wild paper and can model large appearance variance. It's true that it can't improve the test views, since we don't have a per-image code for the test views.

Thank you. Are there limitations when using larger multi-room scenes, such as the network forgetting earlier parts of the scene? How should we go about solving this?

niujinshuchong commented 1 year ago

Hi, in this repo we sample rays from a single image at each iteration, since we use a monocular depth loss and the rays for that loss must come from the same image. If the scene is big, the model might suffer from forgetting issues. It might be better to adapt the sampler to use rays from multiple images, e.g. 16, per iteration.
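(One possible adaptation, shown as an illustrative sketch rather than code from this repo; the function name and image-size arguments are assumptions. The key point is that rays stay grouped per image, so the per-image scale-and-shift fit of the monocular depth loss never mixes pixels across images:)

    import torch

    def sample_multi_image_ray_batch(num_train_images: int,
                                     image_h: int, image_w: int,
                                     n_images: int = 16,
                                     rays_per_image: int = 64):
        """Sample rays from `n_images` random training images per iteration.

        Rays are returned grouped by image so a monocular depth loss can
        still fit its per-image scale and shift within a single image.
        """
        picked = torch.randperm(num_train_images)[:n_images]
        batch = []
        for img_idx in picked.tolist():
            pixel_ids = torch.randint(0, image_h * image_w, (rays_per_image,))
            batch.append((img_idx, pixel_ids))
        return batch

    # Training-loop sketch: compute the depth loss per group, then average.
    # for img_idx, pixel_ids in sample_multi_image_ray_batch(400, 384, 384):
    #     ...render these rays, fit scale/shift against the monocular prior...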