This is the official repository for the PyTorch implementation of the paper "MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance".
We tested this repo on a single NVIDIA RTX 3090 Ti. Please decrease the target batch size if your GPU has less memory.
git clone https://github.com/yuqunw/monopatch_nerf.git
conda create -n monopatchnerf python=3.9
conda activate monopatchnerf
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
pip install -r requirements.txt
pip install git+https://github.com/leejaeyong7/OmnidataModels
pip install git+http://github.com/leejaeyong7/ADE20KSegmenter
Note that we only provide an API, and all the models and weights are entirely attributed to the original authors.
python scripts/preprocess_eth3d.py -i ${image_folder} -o ${output_folder} -s ${sparse_folder}
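Before training, it can help to confirm that the active environment matches the versions pinned in the conda install command. The following is a minimal sketch, not part of this repo: the helper name `check_versions` is hypothetical, and it assumes the packages expose their metadata under the names `torch`, `torchvision`, and `torchaudio`.

```python
from importlib import metadata

# Versions pinned by the conda install command in this README.
EXPECTED = {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"}


def check_versions(expected):
    """Return {package: problem} for packages that are missing or mismatched.

    Hypothetical helper for illustration; it is not provided by the repo.
    """
    problems = {}
    for name, wanted in expected.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Ignore local build suffixes such as "+cu116" when comparing.
        if found.split("+")[0] != wanted:
            problems[name] = f"found {found}, expected {wanted}"
    return problems


if __name__ == "__main__":
    for name, problem in check_versions(EXPECTED).items():
        print(f"{name}: {problem}")
```

If the script prints nothing, all three packages were found at the pinned versions.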
python scripts/train.py --data_path "${DATA_DIR}/${scene}" \
--output_path "${OUTPUT_DIR}" \
--experiment_name "${scene}"
The default setting uses all proposed components. Run python train.py -h for more options and instructions.
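To train several scenes in a row, the training command can be wrapped in a small driver. This is a sketch under stated assumptions: it uses only the three flags shown in the training command, the helper names are hypothetical, and the scene names and directories in the usage lines are placeholders to replace with your own.

```python
import shlex
import subprocess


def build_train_command(data_dir, output_dir, scene):
    """Build the scripts/train.py invocation for one scene (hypothetical helper)."""
    return [
        "python", "scripts/train.py",
        "--data_path", f"{data_dir}/{scene}",
        "--output_path", output_dir,
        "--experiment_name", scene,
    ]


def train_scenes(data_dir, output_dir, scenes, dry_run=True):
    """Train each scene in turn; with dry_run=True only print the commands."""
    for scene in scenes:
        cmd = build_train_command(data_dir, output_dir, scene)
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops the loop if a training run fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths and scene names; adjust to your data layout.
    train_scenes("/path/to/data", "/path/to/output", ["scene_a", "scene_b"])
```

The dry run is the default so that the commands can be inspected before any GPU time is spent; pass `dry_run=False` from the repo root to launch the runs.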
Render all input views with checkpoints:
python scripts/eval.py --model_checkpoint_file "${OUTPUT_DIR}/${scene}/checkpoints/model/model_steps_${num_iters}.ckpt" \
--grid_checkpoint_file "${OUTPUT_DIR}/${scene}/checkpoints/grid/grid_steps_${num_iters}.ckpt" \
--data_path "${DATA_DIR}/${scene}/" \
--output_path "${OUTPUT_DIR}/${scene}/output" \
--full True
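The evaluation command needs the iteration count `num_iters` of the checkpoint to load. If it is not known, it can be recovered from the checkpoint file names. The sketch below assumes the `model_steps_<N>.ckpt` and `grid_steps_<N>.ckpt` naming shown in the command; the helper names are hypothetical and not part of the repo.

```python
import re
from pathlib import Path

# Matches the step count in names such as "model_steps_20000.ckpt".
STEP_PATTERN = re.compile(r"_steps_(\d+)\.ckpt$")


def latest_step(checkpoint_dir):
    """Return the largest step among *_steps_<N>.ckpt files, or None if none exist."""
    steps = []
    for path in Path(checkpoint_dir).glob("*_steps_*.ckpt"):
        match = STEP_PATTERN.search(path.name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


def build_eval_command(data_dir, output_dir, scene, num_iters):
    """Build the scripts/eval.py invocation with the flags shown in this README."""
    ckpt = Path(output_dir) / scene / "checkpoints"
    return [
        "python", "scripts/eval.py",
        "--model_checkpoint_file", str(ckpt / "model" / f"model_steps_{num_iters}.ckpt"),
        "--grid_checkpoint_file", str(ckpt / "grid" / f"grid_steps_{num_iters}.ckpt"),
        "--data_path", f"{data_dir}/{scene}/",
        "--output_path", str(Path(output_dir) / scene / "output"),
        "--full", "True",
    ]
```

Calling `latest_step` on the scene's `checkpoints/model` directory gives a value to pass as `num_iters`; the same step is assumed to exist for the grid checkpoint.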
Fuse point clouds with input views' poses and depths:
python scripts/fusion.py --output_path "${OUTPUT_PATH}/${scene}/output" \
--min_views 2 \
--threshold 2.0
The fused point cloud is ${OUTPUT_PATH}/results/fused.ply. We use a loose threshold and a low minimum view count for ETH3D scenes. If the scene has denser views, min_views can be larger and the fusion threshold smaller, e.g., --min_views=5 and --threshold=0.5 for Tanks and Temples scenes. A COLMAP sparse folder can be specified to accelerate fusion for denser views, e.g., --sparse_path ${SPARSE_DIR}/${scene}/sparse.
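The two fusion settings mentioned above can be kept as named presets so that the right pair of values is used for each dataset. This is an illustrative sketch: the preset names and the helper `build_fusion_command` are hypothetical, while the flags and values are the ones given in this README.

```python
# Fusion settings quoted in this README; the preset names are made up here.
FUSION_PRESETS = {
    "eth3d": {"min_views": 2, "threshold": 2.0},
    "tanks_and_temples": {"min_views": 5, "threshold": 0.5},
}


def build_fusion_command(output_path, preset, sparse_path=None):
    """Build the scripts/fusion.py invocation for a preset (hypothetical helper)."""
    params = FUSION_PRESETS[preset]
    cmd = [
        "python", "scripts/fusion.py",
        "--output_path", output_path,
        "--min_views", str(params["min_views"]),
        "--threshold", str(params["threshold"]),
    ]
    # The COLMAP sparse folder is optional and only added when given.
    if sparse_path is not None:
        cmd += ["--sparse_path", sparse_path]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root. For other datasets, a reasonable starting point is to pick the preset whose view density is closest and adjust from there.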
Install the ETH3D point cloud evaluation program, download the ground truth point clouds, set the eth3d_evaluation_bin path in scripts/report.py accordingly, and run the evaluation for the rendered RGB images and fused point clouds:
python scripts/report.py --input_path "${DATA_DIR}/${scene}" \
--output_path "${OUTPUT_PATH}/${scene}/output" \
--gt_path "${GT_DIR}/${scene}/dslr_scan_eval"
The results are in ${OUTPUT_PATH}/${scene}/output/results/restuls.json, containing PSNR, SSIM, and LPIPS for novel view synthesis, and F1, precision, and recall for point cloud evaluation.
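To compare scenes, the metrics in the results file can be loaded and printed as a flat list. The layout of the JSON file is not described here, so the sketch below makes no assumption about key names and simply flattens whatever nested dictionaries it finds; `flatten` and `load_metrics` are hypothetical helpers, not part of the repo.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into {"outer/inner": value} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}/"))
    else:
        # Leaf value (number, string, list, ...): store it under the joined key.
        flat[prefix.rstrip("/")] = obj
    return flat


def load_metrics(path):
    """Read a results JSON file and return its contents as a flat dict."""
    with open(path) as f:
        return flatten(json.load(f))


def print_metrics(path):
    """Print one "name: value" line per metric found in the file."""
    for name, value in sorted(load_metrics(path).items()):
        print(f"{name}: {value}")
```

Call `print_metrics` with the path of the results file produced by scripts/report.py for one scene; running it per scene gives a quick side-by-side view of the reported metrics.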
If you find this project helpful for your research, please consider citing the following BibTeX entry.
@article{wu2024monopatchnerf,
title={MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance},
author={Wu, Yuqun and Lee, Jae Yong and Zou, Chuhang and Wang, Shenlong and Hoiem, Derek},
journal={arXiv preprint arXiv:2404.08252},
year={2024}
}
If you find the QFF representation helpful for your research, please consider citing the following BibTeX entry.
@article{lee2022qff,
title={Qff: Quantized fourier features for neural field representations},
author={Lee, Jae Yong and Wu, Yuqun and Zou, Chuhang and Wang, Shenlong and Hoiem, Derek},
journal={arXiv preprint arXiv:2212.00914},
year={2022}
}