This repository contains the code and data for the WACV23 paper "Automatically Annotating Indoor Images with CAD Models via RGB-D Scans".
CAD model and pose annotations for the ScanNet dataset are available here. Annotations are automatically generated using scannotate and HOC-Search. The quality of these annotations was checked in several verification passes, with manual re-annotation of outliers to ensure that the final annotations are of high quality.
Note: We tested the code with PyTorch v1.7.1, PyTorch3D v0.6.2, and CUDA 10.1. The following installation guide is tailored to these specific versions; you may have to install different versions depending on your system. For general information about how to install PyTorch3D, see the official installation guide.
The runtime dependencies can be installed by running:
conda create -n scannotate python=3.9
conda activate scannotate
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch -c nvidia -c conda-forge
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
For the CUB build-time dependency, which is only needed if your CUDA version is older than 11.7, run:
conda install -c bottler nvidiacub
After installing the above dependencies, run the following commands:
pip install scikit-image matplotlib imageio plotly opencv-python open3d trimesh==3.10.2
conda install pytorch3d==0.6.2 -c pytorch3d
The corresponding environment file can be found at environment.yml.
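To verify the installation, a minimal check (not part of the repository) can confirm the expected versions and CUDA availability:

```python
import torch
import pytorch3d

print("PyTorch:", torch.__version__)        # expected: 1.7.1
print("PyTorch3D:", pytorch3d.__version__)  # expected: 0.6.2
print("CUDA available:", torch.cuda.is_available())
```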
Download the ScanNet example here. Extract the folders extracted, preprocessed, and scans, and copy them to /data/ScanNet. Note that by downloading the example you agree to the ScanNet Terms of Use.
This data example additionally contains the preprocessed input scan, i.e. the 3D bounding boxes and instance segmentations for the target objects, as well as the 3D scan transformed into the PyTorch3D coordinate system.
Download the ShapeNetV2 dataset by signing up on the website. Extract ShapeNetCore.v2.zip to /data/ShapeNet.
To center and scale-normalize the downloaded ShapeNet CAD models, run:
bash run_shapenet_prepro.sh gpu=0
The gpu argument specifies which GPU should be used for processing. By default, the code is executed on the CPU.
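For intuition, the normalization roughly corresponds to the following sketch using trimesh (already installed above); the path is illustrative and the script's exact normalization may differ:

```python
import numpy as np
import trimesh

# Illustrative path; actual models live under
# data/ShapeNet/ShapeNetCore.v2/<synset>/<model_id>/.
mesh = trimesh.load("model_normalized.obj", force="mesh")

# Center: move the bounding-box center to the origin.
center = (mesh.bounds[0] + mesh.bounds[1]) / 2.0
mesh.apply_translation(-center)

# Scale-normalize: fit the longest bounding-box side to unit length.
mesh.apply_scale(1.0 / np.max(mesh.extents))
```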
After the above steps, the /data folder should contain the following directories:
- data
- ScanNet
- extracted
- preprocessed
- scans
- ShapeNet
- ShapeNet_preprocessed
- ShapeNetCore.v2
Our pipeline for automatic CAD model retrieval consists of three steps. The results of each step are saved to /results.
Note that we use PyTorch3D as the rendering pipeline; hence, all 3D data are transformed into the PyTorch3D coordinate system. Information about this coordinate system can be found here.
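For intuition: PyTorch3D's world coordinates are +X left, +Y up, +Z pointing into the screen, whereas ScanNet scans are Z-up, so the conversion amounts to a fixed rotation of the vertices. The matrix below is only a sketch; the actual transform used by the preprocessing code may differ:

```python
import numpy as np
import trimesh

# Hypothetical Z-up -> Y-up rotation: maps (x, y, z) to (x, z, -y).
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, -1.0, 0.0]])

scan = trimesh.load("scene0000_00_vh_clean_2.ply")  # illustrative scan path
scan.vertices = scan.vertices @ R.T
```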
The configuration file is a simple text file in .ini format. Default values for the configuration parameters are available in /config.
Note that these defaults are just an indication of what a "reasonable" value for each parameter could be, and are not meant as a way to reproduce any of the results from our paper.
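As a quick sanity check, the parameters can be inspected with Python's standard configparser (a sketch; the actual section and key names are defined by the files in /config):

```python
import configparser

config = configparser.ConfigParser()
config.read("config/ScanNet.ini")

# Print every parameter with its default value.
for section in config.sections():
    for key, value in config[section].items():
        print(f"[{section}] {key} = {value}")
```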
Run CAD model retrieval with:
bash run_cad_retrieval.sh config=ScanNet.ini gpu=0
The results will be written to /results/ScanNet/$scene_name/retrieval. They contain the top-5 retrieved CAD models for each target object, as well as the combined top-1 results for all target objects. Additionally, the scene mesh without the target objects is written to /results/ScanNet/$scene_name, which can be useful for visualization.
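To inspect the results, the retrieved CAD models can be rendered together with the scene mesh; the snippet below is a sketch with illustrative file names, since the exact output layout is defined by the retrieval script:

```python
import glob
import trimesh

# Illustrative paths; check results/ScanNet/<scene_name>/ after running the step.
scene = trimesh.Scene()
scene.add_geometry(trimesh.load("results/ScanNet/scene0000_00/mesh_without_objects.ply"))
for path in glob.glob("results/ScanNet/scene0000_00/retrieval/*.obj"):
    scene.add_geometry(trimesh.load(path))
scene.show()
```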
Run CAD model clustering and cloning with:
bash run_cad_similarity.sh config=ScanNet.ini gpu=0
Results after CAD model clustering and cloning will be written to /results/ScanNet/$scene_name/similarity.
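Conceptually, this step groups geometrically similar retrieved CAD models so that one representative model can be cloned across similar objects in the scene. As a rough illustration (not the repository's actual implementation), similarity between two candidate models could be measured with a Chamfer distance on sampled surface points:

```python
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.loss import chamfer_distance
from pytorch3d.ops import sample_points_from_meshes

# Illustrative file names for two retrieved CAD models.
meshes = load_objs_as_meshes(["cad_a.obj", "cad_b.obj"])
points = sample_points_from_meshes(meshes, num_samples=5000)  # (2, 5000, 3)

dist, _ = chamfer_distance(points[0:1], points[1:2])
print("Chamfer distance:", dist.item())
```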
Run 9DOF differentiable pose refinement with:
bash run_cad_pose_refine.sh config=ScanNet.ini gpu=0
Final results after 9DOF pose refinement will be written to /results/ScanNet/$scene_name/refinement.
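For intuition, a 9DOF pose consists of 3 translation, 3 rotation, and 3 (anisotropic) scale parameters, which are optimized by gradient descent. The sketch below uses a simple point-to-point loss as a stand-in for the render-and-compare objective of the actual pipeline; all tensors are placeholders:

```python
import torch
from pytorch3d.transforms import euler_angles_to_matrix

# 9DOF parameters: translation (3), Euler angles (3), per-axis log-scale (3).
translation = torch.zeros(3, requires_grad=True)
angles = torch.zeros(3, requires_grad=True)
log_scale = torch.zeros(3, requires_grad=True)
optimizer = torch.optim.Adam([translation, angles, log_scale], lr=0.01)

cad_points = torch.rand(1000, 3)   # placeholder: points sampled from the CAD model
scan_points = torch.rand(1000, 3)  # placeholder: corresponding points from the scan

for step in range(100):
    optimizer.zero_grad()
    R = euler_angles_to_matrix(angles, convention="XYZ")
    posed = (cad_points * log_scale.exp()) @ R.T + translation
    # Placeholder objective; the paper optimizes a differentiable rendering loss.
    loss = ((posed - scan_points) ** 2).mean()
    loss.backward()
    optimizer.step()
```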
If you find this work useful for your research, please consider citing us:
@inproceedings{ainetter2023automatically,
title={Automatically Annotating Indoor Images with CAD Models via RGB-D Scans},
author={Ainetter, Stefan and Stekovic, Sinisa and Fraundorfer, Friedrich and Lepetit, Vincent},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
pages={3156--3164},
year={2023}
}