MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance

This is the official repository for the PyTorch implementation of the paper "MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance".

Paper | Project

Setup

Prerequisites

We test our repo on a single NVIDIA RTX 3090 Ti. Please decrease the target batch size if your GPU has less memory.

Environment
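
A typical setup uses an isolated Python environment with a CUDA-enabled PyTorch build; the sketch below is an assumption (environment name, Python version, and the presence of a requirements file are placeholders, not the tested configuration):

# Create and activate an isolated environment (name and Python version are assumptions).
conda create -n monopatch_nerf python=3.9
conda activate monopatch_nerf

# Install a CUDA-enabled PyTorch build, then the repo's Python dependencies.
pip install torch torchvision
pip install -r requirements.txt   # assumes the repo ships a requirements file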

Dataset

Custom Dataset:
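
For a custom capture, camera poses are typically estimated with COLMAP before training; the commands below are a hedged sketch of a standard COLMAP sparse reconstruction (the folder layout under ${DATA_DIR}/${scene} is an assumption, and converting the result into the loader's expected format is not covered here):

# Standard COLMAP sparse reconstruction over the scene's images (paths are assumptions).
colmap feature_extractor --database_path "${DATA_DIR}/${scene}/database.db" \
                         --image_path "${DATA_DIR}/${scene}/images"
colmap exhaustive_matcher --database_path "${DATA_DIR}/${scene}/database.db"
mkdir -p "${DATA_DIR}/${scene}/sparse"
colmap mapper --database_path "${DATA_DIR}/${scene}/database.db" \
              --image_path "${DATA_DIR}/${scene}/images" \
              --output_path "${DATA_DIR}/${scene}/sparse"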

Usage

Training

python scripts/train.py --data_path "${DATA_DIR}/${scene}" \
                --output_path "${OUTPUT_DIR}" \
                --experiment_name "${scene}"

The default setting uses all proposed components. Run python scripts/train.py -h for more options and instructions.
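
For example, to train every scene under a dataset directory in sequence (the paths below are placeholders to set for your own setup):

DATA_DIR=/path/to/dataset      # placeholder: directory containing one folder per scene
OUTPUT_DIR=/path/to/outputs    # placeholder: where checkpoints and renders will be written
for scene in $(ls "${DATA_DIR}"); do
    python scripts/train.py --data_path "${DATA_DIR}/${scene}" \
                            --output_path "${OUTPUT_DIR}" \
                            --experiment_name "${scene}"
done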

Rendering

Render all input views with checkpoints:

python scripts/eval.py --model_checkpoint_file "${OUTPUT_DIR}/${scene}/checkpoints/model/model_steps_${num_iters}.ckpt" \
                       --grid_checkpoint_file "${OUTPUT_DIR}/${scene}/checkpoints/grid/grid_steps_${num_iters}.ckpt" \
                       --data_path "${DATA_DIR}/${scene}/" \
                       --output_path "${OUTPUT_DIR}/${scene}/output" \
                       --full True
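
Here ${num_iters} is the training step count encoded in the checkpoint filename. For example (the scene name and step value below are placeholder assumptions, not defaults of this repo):

scene=courtyard    # placeholder ETH3D scene name
num_iters=50000    # placeholder: step suffix of the checkpoint you want to render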

Point Cloud Fusion

Fuse point clouds with input views' poses and depths:

python scripts/fusion.py --output_path "${OUTPUT_DIR}/${scene}/output" \
                         --min_views 2 \
                         --threshold 2.0

The fused point cloud is written to ${OUTPUT_DIR}/${scene}/output/results/fused.ply. We use a loose threshold and a low minimum view count for the sparsely captured ETH3D scenes. For denser captures, min_views can be larger and the fusion threshold smaller, e.g., --min_views=5 and --threshold=0.5 for Tanks and Temples scenes. A COLMAP sparse folder can be specified to accelerate fusion for denser views, e.g., --sparse_path ${SPARSE_DIR}/${scene}/sparse.
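
For instance, fusing a denser Tanks and Temples scene with a COLMAP sparse model available might look like this (flag values taken from the guidance above):

python scripts/fusion.py --output_path "${OUTPUT_DIR}/${scene}/output" \
                         --min_views 5 \
                         --threshold 0.5 \
                         --sparse_path "${SPARSE_DIR}/${scene}/sparse"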

Evaluation

Install ETH3D's point cloud evaluation program (a build sketch is given at the end of this section), download the ground-truth point clouds, set the corresponding path eth3d_evaluation_bin in scripts/report.py, and run the evaluation of the rendered RGB images and fused point clouds:

python scripts/report.py --input_path "${DATA_DIR}/${scene}" \
                         --output_path "${OUTPUT_DIR}/${scene}/output" \
                         --gt_path "${GT_DIR}/${scene}/dslr_scan_eval"

The results are written to ${OUTPUT_DIR}/${scene}/output/results/results.json, containing PSNR, SSIM, and LPIPS for novel view synthesis, and F1, precision, and recall for the point cloud evaluation.
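
For reference, the ETH3D evaluation program can typically be built from the official multi-view-evaluation repository; the URL and CMake steps below are assumptions about that tool's standard build, not instructions confirmed by this repo:

git clone https://github.com/ETH3D/multi-view-evaluation.git
cd multi-view-evaluation
mkdir build && cd build
cmake .. && make
# Point eth3d_evaluation_bin in scripts/report.py at the built evaluation binary.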

Citation

If you find this project helpful for your research, please consider citing the following BibTeX entry.

@article{wu2024monopatchnerf,
  title={MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance},
  author={Wu, Yuqun and Lee, Jae Yong and Zou, Chuhang and Wang, Shenlong and Hoiem, Derek},
  journal={arXiv preprint arXiv:2404.08252},
  year={2024}
}

If you find the QFF representation helpful for your research, please consider citing the following BibTeX entry.

@article{lee2022qff,
  title={Qff: Quantized fourier features for neural field representations},
  author={Lee, Jae Yong and Wu, Yuqun and Zou, Chuhang and Wang, Shenlong and Hoiem, Derek},
  journal={arXiv preprint arXiv:2212.00914},
  year={2022}
}