autonomousvision / monosdf

[NeurIPS'22] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction

How to improve the image quality of MonoSDF rendering #77

Open small-zeng opened 10 months ago

small-zeng commented 10 months ago

I have a multi-room scene reconstructed with MonoSDF from 400 images. Renderings at novel viewpoints only achieve a PSNR of 21. How can I improve this?


small-zeng commented 10 months ago

Do I need to adjust the number of layers in the rendering network or the number of sampling points in the configuration file?

niujinshuchong commented 10 months ago

Hi, the mesh looks reasonable. Did you use per_image_code in your training?

small-zeng commented 10 months ago

> Hi, the mesh looks reasonable. Did you use per_image_code in your training?

Thank you for your response. I disabled per_image_code during training because it seemed ineffective; it appeared to take only an image index as input. My test set, however, is rendered from a separately sampled set of random viewpoints. Would this affect the results? What is the actual role of per_image_code, and how does it work?

small-zeng commented 10 months ago
    # From the rendering network's forward pass: when per_image_code is
    # enabled, a learned per-image embedding (looked up by image index)
    # is concatenated to the rendering network's input.
    if self.per_image_code:
        image_code = self.embeddings[indices].expand(rendering_input.shape[0], -1)
        rendering_input = torch.cat([rendering_input, image_code], dim=-1)

    x = rendering_input

The "image_code" here seems to be just an index input, which would only improve the training views and not enhance the test views.

niujinshuchong commented 10 months ago

Hi, the per-image code was proposed in the NeRF-in-the-Wild paper and can model large appearance variations between images. It's true that it can't improve the test views, since we don't have per-image codes for them.
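
For reference, here is a minimal sketch of how such a per-image appearance code is typically wired up. This is illustrative PyTorch, not MonoSDF's actual module; the names `ColorNetWithImageCode`, `feat_dim`, `code_dim`, and `n_images` are assumptions:

    import torch
    import torch.nn as nn

    class ColorNetWithImageCode(nn.Module):
        """Toy rendering head with a learned per-image appearance code."""

        def __init__(self, feat_dim=256, code_dim=32, n_images=400):
            super().__init__()
            # One learnable latent vector per training image. The image index
            # is only used to look the vector up; the vector itself is optimized
            # jointly with the network, so it can absorb per-image exposure or
            # lighting changes.
            self.embeddings = nn.Parameter(torch.zeros(n_images, code_dim))
            self.mlp = nn.Sequential(
                nn.Linear(feat_dim + code_dim, 256), nn.ReLU(),
                nn.Linear(256, 3), nn.Sigmoid(),
            )

        def forward(self, rendering_input, image_index):
            # Expand the single image's code to all rays sampled from that image.
            code = self.embeddings[image_index].expand(rendering_input.shape[0], -1)
            return self.mlp(torch.cat([rendering_input, code], dim=-1))

Because these codes exist only for training images, a novel test view has no code of its own, which is why it cannot raise test-view PSNR; NeRF-in-the-Wild evaluates by optimizing a fresh code on one half of each test image and scoring the other half.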

small-zeng commented 10 months ago

> Hi, the per-image code was proposed in the NeRF-in-the-Wild paper and can model large appearance variations between images. It's true that it can't improve the test views, since we don't have per-image codes for them.

Thank you. Are there limitations when using larger multi-room scenes, such as network forgetting issues? How should we go about solving this?

niujinshuchong commented 10 months ago

Hi, in this repo we sample rays from a single image at each iteration, because the monocular depth loss requires all rays in the batch to come from the same image. If the scene is big, the model might suffer from forgetting. It might be better to adapt the code to sample rays from multiple images per iteration, e.g. 16, as sketched below.
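
A hedged sketch of that adaptation: sample rays from several images per iteration, but keep the rays grouped by image so the scale/shift alignment of the monocular depth loss is still solved within each image. `dataset.sample_rays` and `render_fn` are hypothetical interfaces, not MonoSDF's API:

    import torch

    def scale_shift_aligned_depth_loss(pred_depth, mono_depth):
        """Least-squares align predicted depth to the monocular depth prior
        (scale + shift), then penalize the residual. The alignment is only
        meaningful within a single image, which is why rays must be grouped
        per image before computing this loss."""
        p = torch.stack([pred_depth, torch.ones_like(pred_depth)], dim=-1)  # (N, 2)
        sol = torch.linalg.lstsq(p, mono_depth.unsqueeze(-1)).solution      # (2, 1)
        aligned = (p @ sol).squeeze(-1)
        return ((aligned - mono_depth) ** 2).mean()

    def multi_image_step(dataset, render_fn, n_images=16, rays_per_image=64):
        """One training step that samples rays from several images (here 16)
        instead of one, spreading gradients over the whole scene to mitigate
        forgetting in large multi-room captures."""
        image_ids = torch.randperm(len(dataset))[:n_images]
        depth_losses = []
        for idx in image_ids:
            # dataset.sample_rays is a hypothetical helper returning rays and
            # the corresponding monocular depth prior for image `idx`.
            rays, mono_depth = dataset.sample_rays(idx, rays_per_image)
            pred_depth = render_fn(rays, image_index=idx)
            depth_losses.append(scale_shift_aligned_depth_loss(pred_depth, mono_depth))
        return torch.stack(depth_losses).mean()

Averaging the per-image depth losses keeps the depth supervision well-posed, while losses that don't depend on image grouping (e.g. RGB and eikonal terms) can still be computed over the whole mixed batch.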