

# Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives

Tom Monnier, Jake Austin, Angjoo Kanazawa, Alexei A. Efros, Mathieu Aubry

![teaser.gif](./media/teaser.gif)

Official PyTorch implementation of **Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives** (NeurIPS 2023). Check out our [webpage](https://www.tmonnier.com/DBW) for video results!

This repository contains:

If you find this code useful, don't forget to star the repo :star: and cite the paper :point_down:

```
@inproceedings{monnier2023dbw,
  title={{Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives}},
  author={Monnier, Tom and Austin, Jake and Kanazawa, Angjoo and Efros, Alexei A. and Aubry, Mathieu},
  booktitle={{NeurIPS}},
  year={2023},
}
```

## Installation :construction_worker:

### 1. Create conda environment :wrench:

```
conda env create -f environment.yml
conda activate dbw
```
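After activating the environment, a quick sanity check can confirm that PyTorch sees your GPU; a minimal sketch, assuming `environment.yml` installs PyTorch with CUDA support (which the training code requires):

```
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```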
**Optional live monitoring** :chart_with_downwards_trend:

Some monitoring routines are implemented; you can use them by specifying your visdom port in the config file. You will need to install visdom from source beforehand:

```
git clone https://github.com/facebookresearch/visdom
cd visdom && pip install -e .
```
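Once installed, start a visdom server on the port you plan to reference in the config before launching a run; a minimal sketch, where the port value is just an example:

```
python -m visdom.server -port 8097
```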
**Optional Nerfstudio dataloading** :tractor:

If you want to load data processed by Nerfstudio (e.g., for a custom scene), you will need to install nerfstudio as described in the [nerfstudio documentation](https://docs.nerf.studio). In general, executing the following lines should do the job:

```
pip install ninja==1.10.2.3 git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
pip install nerfstudio==0.1.15
```
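To sanity-check the install, you can call one of nerfstudio's CLI entry points; this assumes the `ns-train` command that ships with the nerfstudio pip package:

```
ns-train --help
```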

### 2. Download datasets :arrow_down:

```
bash scripts/download_data.sh
```

This command will download one of the sets of scenes presented in the paper (DTU, BlendedMVS or Nerfstudio scenes; see the available configs below).

gdown may occasionally hang; if so, download the file manually and move it to the `datasets` folder.
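Alternatively, gdown can be pointed at the file directly; a hedged sketch, where `FILE_ID` and the output name are placeholders (copy the real id from `scripts/download_data.sh`):

```
# FILE_ID and scenes.zip are placeholders, take the real id from scripts/download_data.sh
gdown "https://drive.google.com/uc?id=FILE_ID" -O datasets/scenes.zip
```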

## How to use :rocket:

### 1. Run models from scratch :runner:


To train a model from scratch, run:

```
cuda=gpu_id config=filename.yml tag=run_tag ./scripts/pipeline.sh
```

where `gpu_id` is a GPU device id, `filename.yml` is a config file in the `configs` folder, and `run_tag` is a tag for the experiment.
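For example, a run on a DTU scene could look like the following (the scene id is illustrative; use any config that actually exists in the `configs` folder):

```
cuda=0 config=dtu/scan40.yml tag=test_scan40 ./scripts/pipeline.sh
```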

Results are saved at `runs/${DATASET}/${DATE}_${run_tag}`, where `DATASET` is the dataset name specified in `filename.yml` and `DATE` is the current date in `mmdd` format.

**Available configs** :high_brightness:

- `dtu/*.yml` for each DTU scene
- `bmvs/*.yml` for each BlendedMVS scene
- `nerfstudio/*.yml` for each Nerfstudio scene

*NB:* to run on Nerfstudio scenes, you need to install the [nerfstudio](https://github.com/nerfstudio-project) library (see the installation section).
**Computational cost** :moneybag: Optimization takes roughly 4 hours on a single GPU.

### 2. Reproduce quantitative results on DTU :bar_chart:


Our model is evaluated at the end of each run: scores are written to `dtu_scores.tsv` for the official Chamfer evaluation and to `final_scores.tsv` for training losses, transparencies and image rendering metrics. To reproduce our results on a single DTU scene, run the following command, which launches 5 sequential runs with different seeds (the auto-selected score is the one from the run with minimal training loss):

```
cuda=gpu_id config=dtu/scanXX.yml tag=default_scanXX ./scripts/multi_pipeline.sh
```
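Once the runs have finished, the TSV score files can be inspected directly from the shell; a small sketch, where the glob below is illustrative and depends on the dataset name, date and tag of your runs:

```
# Pretty-print the Chamfer scores of finished runs (path pattern is illustrative)
column -t -s $'\t' runs/*/*_default_scanXX/dtu_scores.tsv
```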
**Get numbers for EMS and MBF baselines** :clipboard:

For completeness, we provide scripts for processing data and evaluating the following baselines:

- [EMS](https://github.com/bmlklwx/EMS-superquadric_fitting): run `scripts/ems_pproc.sh`, then apply EMS using the official repo, then run `scripts/ems_eval.sh` to evaluate the 3D decomposition
- [MBF](https://github.com/MichaelRamamonjisoa/MonteBoxFinder): run `scripts/mbf_pproc.sh`, then apply MBF using the official repo, then run `scripts/mbf_eval.sh` to evaluate the 3D decomposition

Do not forget to update the paths of the baseline repos in `src/utils/path.py`. Results will also be computed using the preprocessing step that removes the ground from the 3D input.

### 3. Train on a custom scene :crystal_ball:

If you want to run our model on a custom scene, we recommend using the Nerfstudio framework and guidelines to process your multi-view images, obtain the cameras, and check their quality by optimizing their default 3D model. The resulting data and output model should be moved to the `datasets/nerfstudio` folder in the same format as the other Nerfstudio scenes (you can also use symlinks).
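With nerfstudio installed, this processing workflow could look roughly like the following; a sketch assuming the `ns-process-data` and `ns-train` CLI entry points, with placeholder paths (flags may differ slightly across nerfstudio versions):

```
# Estimate cameras from raw images with COLMAP via nerfstudio (paths are placeholders)
ns-process-data images --data my_scene/images --output-dir datasets/nerfstudio/my_scene
# Check camera quality by fitting nerfstudio's default model
ns-train nerfacto --data datasets/nerfstudio/my_scene
```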

Then, you can add the model path in the custom Nerfstudio dataloader (`src/datasets/nerfstudio.py`), create a new config from one of our nerfstudio configs, and run the model. One thing that is specific to each scene is the initialization of `R_world` and `T_world`, which can be roughly estimated by visual comparison in plotly or Blender using the pseudo ground-truth point cloud.

## Further information :books:

If you like this project, check out related works from our group: