Open3DVLab / NeuRodin

[NeurIPS'24] NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction
https://open3dvlab.github.io/NeuRodin/
Apache License 2.0

NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction

NeurIPS 2024

[**Project Page**](https://open3dvlab.github.io/NeuRodin/) | [**Arxiv**](https://arxiv.org/abs/2408.10178)

[Yifan Wang](https://github.com/yyfz)¹,², [Di Huang](https://dihuang.me/)¹, [Weicai Ye](https://ywcmaike.github.io/)¹,³, [Guofeng Zhang](http://www.cad.zju.edu.cn/home/gfzhang/)³, [Wanli Ouyang](https://wlouyang.github.io/)¹, [Tong He](http://tonghe90.github.io/)¹

**¹Shanghai AI Laboratory**    **²Shanghai Jiao Tong University**    **³State Key Lab of CAD&CG, Zhejiang University**


This is the official implementation of the paper "NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction".

Signed Distance Function (SDF)-based volume rendering has demonstrated significant capabilities in surface reconstruction. Although promising, SDF-based methods often fail to capture detailed geometric structures, resulting in visible defects. By comparing SDF-based volume rendering to density-based volume rendering, we identify two main factors within the SDF-based approach that degrade surface quality: SDF-to-density representation and geometric regularization. These factors introduce challenges that hinder the optimization of the SDF field. To address these issues, we introduce NeuRodin, a novel two-stage neural surface reconstruction framework that not only achieves high-fidelity surface reconstruction but also retains the flexible optimization characteristics of density-based methods. NeuRodin incorporates innovative strategies that facilitate transformation of arbitrary topologies and reduce artifacts associated with density bias. Extensive evaluations on the Tanks and Temples and ScanNet++ datasets demonstrate the superiority of NeuRodin, showing strong reconstruction capabilities for both indoor and outdoor environments using solely posed RGB captures. All codes and models will be made public upon acceptance.

Results

[Figures: two result images of the Ballroom scene]

Installation

The steps below assume that Conda and an NVIDIA GPU with a compatible CUDA toolkit are already available on your system.

Create and Activate Conda Environment

First, create a new Conda environment named neurodin with Python 3.8:

conda create -n neurodin python==3.8 -y

Activate the newly created environment:

conda activate neurodin

Install PyTorch, TorchVision, and Torchaudio with CUDA 11.3 support. Other PyTorch and CUDA versions may also work; we additionally tested 1.13.1+cu116.

pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
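
After this step, a quick sanity check can confirm that the installed PyTorch build matches the CUDA wheels selected above. The sketch below is not part of this repository: `describe_torch_install` is a hypothetical helper name, and it relies only on standard `torch` attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`).

```python
import importlib.util


def describe_torch_install():
    """Return a small report on the installed PyTorch build (hypothetical helper)."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        # CUDA wheels carry a suffix such as "+cu113" in the version string.
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as `False`, the tiny-cuda-nn step that follows is likely to fail, so it is worth resolving the driver or wheel mismatch first.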

Install the tiny-cuda-nn bindings for PyTorch:

pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch

Install SDFStudio from the repository root, along with command-line tab completion:

pip install -e .
ns-install-cli

To install PyMCubes for extracting meshes, use the following command:

pip install PyMCubes==0.1.4

Data Preparation

Tanks and Temples

For the training set, please download the camera poses and image set from the Tanks and Temples website, then follow the Neuralangelo data processing guide to preprocess the data.

The data should be arranged as:

- <data_root>
    - <scene_name>
        - images
        - transforms.json
        ...

For the advanced set, we directly use the data preprocessed by Vis-MVSNet.

The data should be arranged as:

- <data_root>
    - <scene_name>
        - cams
        - images

ScanNet++

Go to the ScanNet++ website and download the DSLR dataset, then use the conversion script from SDFStudio to convert it to the SDFStudio format.

The data should be arranged as:

- <data_root>
    - <scene_name>
        - *.png
        - meta_data.json
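
Before launching a long training run, it can help to confirm that a scene directory matches one of the layouts listed in this section. The following is a minimal sketch, not part of the repository: the layout labels (`tnt-training`, `tnt-advanced`, `scannetpp`) and the function name `missing_entries` are hypothetical, and the check only looks at the top-level entries shown in the directory trees above.

```python
from pathlib import Path

# Expected top-level entries per layout, copied from the directory trees above.
# The dictionary keys are hypothetical labels used only in this sketch.
EXPECTED_LAYOUTS = {
    "tnt-training": ["images", "transforms.json"],
    "tnt-advanced": ["cams", "images"],
    "scannetpp": ["*.png", "meta_data.json"],
}


def missing_entries(scene_dir, layout):
    """Return the expected entries that are absent from scene_dir."""
    scene_dir = Path(scene_dir)
    missing = []
    for pattern in EXPECTED_LAYOUTS[layout]:
        # glob handles both literal names and wildcard patterns such as *.png.
        if not any(scene_dir.glob(pattern)):
            missing.append(pattern)
    return missing


if __name__ == "__main__":
    # An empty list means every expected entry was found.
    print(missing_entries("data/21d970d8de", "scannetpp"))
```

The check is deliberately shallow: it does not validate the contents of `transforms.json` or `meta_data.json`, only that the files and folders exist.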

Custom Data

For custom data, we suggest organizing it in the COLMAP format, following the instructions provided in the Neuralangelo data processing guide. Alternatively, you can write a custom dataparser script for NeRFStudio/SDFStudio based on your own data structure.

Running

We provide configs for room-level indoor scenes (ScanNet++) and for large-scale indoor and outdoor scenes (Tanks and Temples).

Example for Tanks and Temples

Training Set

Outdoor scene:

# Stage 1
ns-train neurodin-stage1-outdoor-large --experiment_name neurodin-Barn-stage1 --pipeline.datamanager.eval_camera_res_scale_factor 0.5 tnt-data --data <path-to-tnt> --scene_name Barn

# Stage 2
ns-train neurodin-stage2-outdoor-large --experiment_name neurodin-Barn-stage2 --pipeline.datamanager.eval_camera_res_scale_factor 0.5 --trainer.load_dir <path-to-stage1-checkpoints-dir> tnt-data --data <path-to-tnt> --scene_name Barn

Indoor scene:

# Stage 1
ns-train neurodin-stage1-indoor-large --experiment_name neurodin-Meetingroom-stage1 --pipeline.datamanager.eval_camera_res_scale_factor 0.5 tnt-data --data <path-to-tnt> --scene_name Meetingroom

# Stage 2
ns-train neurodin-stage2-indoor-large --experiment_name neurodin-Meetingroom-stage2 --pipeline.datamanager.eval_camera_res_scale_factor 0.5 --trainer.load_dir <path-to-stage1-checkpoints-dir> tnt-data --data <path-to-tnt> --scene_name Meetingroom

Advanced Set

Indoor scene:

# Stage 1
ns-train neurodin-stage1-indoor-large --experiment_name neurodin-Ballroom-stage1 --pipeline.datamanager.eval_camera_res_scale_factor 0.5 tnt-advance-data --data <path-to-tnt> --scene_name Ballroom

# Stage 2
ns-train neurodin-stage2-indoor-large --experiment_name neurodin-Ballroom-stage2 --pipeline.datamanager.eval_camera_res_scale_factor 0.5 --trainer.load_dir <path-to-stage1-checkpoints-dir> tnt-advance-data --data <path-to-tnt> --scene_name Ballroom
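
The Tanks and Temples commands above all follow one pattern: the method name encodes the stage and the indoor/outdoor setting, the experiment name encodes the scene and stage, and Stage 2 additionally passes `--trainer.load_dir`. As a sketch of that pattern, the hypothetical helper below (`build_tnt_commands` is not part of the repository) assembles both commands for a scene, using only the flags that appear in the examples above.

```python
import shlex


def build_tnt_commands(scene, setting, dataparser, data_root, stage1_ckpt_dir):
    """Build the Stage 1 and Stage 2 ns-train argument lists for one scene.

    setting:    "outdoor" or "indoor"
    dataparser: "tnt-data" (training set) or "tnt-advance-data" (advanced set)
    """
    commands = []
    for stage in (1, 2):
        args = [
            "ns-train", f"neurodin-stage{stage}-{setting}-large",
            "--experiment_name", f"neurodin-{scene}-stage{stage}",
            "--pipeline.datamanager.eval_camera_res_scale_factor", "0.5",
        ]
        if stage == 2:
            # Stage 2 loads the checkpoints written by Stage 1.
            args += ["--trainer.load_dir", str(stage1_ckpt_dir)]
        # Dataparser arguments come after the method-level flags.
        args += [dataparser, "--data", str(data_root), "--scene_name", scene]
        commands.append(args)
    return commands


if __name__ == "__main__":
    # Placeholder paths; substitute your own data and checkpoint directories.
    for cmd in build_tnt_commands(
        "Barn", "outdoor", "tnt-data",
        "<path-to-tnt>", "<path-to-stage1-checkpoints-dir>",
    ):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run` once Stage 1 has finished and its checkpoint directory is known.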

Example for ScanNet++

# Stage 1
ns-train neurodin-stage1-indoor-small --experiment_name neurodin-21d970d8de-stage1 --pipeline.datamanager.camera_res_scale_factor 0.5 sdfstudio-data --data data/21d970d8de --scale_factor 0.8 

# Stage 2
ns-train neurodin-stage2-indoor-small --experiment_name neurodin-21d970d8de-stage2 --trainer.load_dir <path-to-stage1-checkpoints-dir> --pipeline.datamanager.camera_res_scale_factor 0.5 sdfstudio-data --data data/21d970d8de --scale_factor 0.8

Evaluation

We recommend using zoo/extract_surface.py (adapted from Neuralangelo) to extract the mesh. It is faster than ns-extract-mesh in SDFStudio because it does not need to load all of the images.

python zoo/extract_surface.py --conf <path-to-config> --resolution 2048
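
To get a feel for what `--resolution 2048` implies, the sketch below estimates the size of a dense grid of SDF samples at a given resolution. It rests on two assumptions that are not stated in this README: that the resolution is applied along each axis of a cubic volume, and that each sample is stored as one float32. How `zoo/extract_surface.py` actually batches its queries is not described here.

```python
def dense_grid_cost(resolution, bytes_per_sample=4):
    """Return (sample_count, GiB) for a dense resolution^3 grid of SDF values."""
    samples = resolution ** 3
    gib = samples * bytes_per_sample / 2 ** 30
    return samples, gib


if __name__ == "__main__":
    for res in (512, 1024, 2048):
        samples, gib = dense_grid_cost(res)
        print(f"resolution {res}: {samples:,} samples, {gib:.1f} GiB as float32")
```

Under these assumptions, doubling the resolution multiplies the number of SDF evaluations by eight, so lowering `--resolution` is a natural first step when extraction is too slow or runs out of memory.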

Citation

If you find our work useful in your research, please consider citing:

@article{wang2024neurodin,
  title={NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction},
  author={Wang, Yifan and Huang, Di and Ye, Weicai and Zhang, Guofeng and Ouyang, Wanli and He, Tong},
  journal={arXiv preprint arXiv:2408.10178},
  year={2024}
}

Acknowledgement

This codebase is modified from SDFStudio, NeRFStudio and Neuralangelo. Thanks to all of these great projects.