This repository contains an implementation of UniDet3D, a multi-dataset indoor 3D object detection method introduced in our paper:
UniDet3D: Multi-dataset Indoor 3D Object Detection
Maksim Kolodiazhnyi, Anna Vorontsova, Matvey Skripkin, Danila Rukhovich, Anton Konushin
Artificial Intelligence Research Institute
https://arxiv.org/abs/2409.04234
For convenience, we provide a Dockerfile.
This implementation is based on the mmdetection3d framework v1.1.0. If not using Docker, please follow getting_started.md for the installation instructions.
Please see test_train.md for some basic usage examples.
UniDet3D is trained and tested using 6 datasets: ScanNet, ARKitScenes, S3DIS, MultiScan, 3RScan, and ScanNet++. Preprocessed data can be found at our Hugging Face. Download each archive, unpack, and move into the corresponding directory in data. Please comply with the license agreement before downloading the data.
Alternatively, you can preprocess the data yourself. Training data for 3D object detection methods that do not require superpoints, e.g. TR3D or FCAF3D, can be prepared according to the instructions.
Superpoints for ScanNet and MultiScan are provided as a part of the original annotation. For the remaining datasets, you can either download pre-computed superpoints from our Hugging Face, or compute them using superpoint_transformer.
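If you use the preprocessed archives, the unpacking step can be scripted. The following is a minimal sketch, not part of this repository: the helper name `unpack_archives` is hypothetical, and it assumes that each archive's file name (without extension) matches the target directory name under `data`. Check the extracted layout against the configs before training, since the archives may already contain a top-level directory.

```python
import shutil
from pathlib import Path

# Extensions that shutil can unpack on this system (.zip, .tar, .tar.gz, ...).
SUPPORTED = tuple(
    ext for _, extensions, _ in shutil.get_unpack_formats() for ext in extensions
)


def unpack_archives(download_dir, data_root="data"):
    """Unpack every supported archive in download_dir into data_root/<name>.

    Assumption: the archive file name up to the first dot is the name of the
    dataset directory expected under data_root.
    """
    download_dir = Path(download_dir)
    data_root = Path(data_root)
    unpacked = []
    for archive in sorted(download_dir.iterdir()):
        # Skip directories and files that are not recognized archives.
        if not archive.is_file() or not archive.name.endswith(SUPPORTED):
            continue
        target = data_root / archive.name.split(".")[0]
        target.mkdir(parents=True, exist_ok=True)
        shutil.unpack_archive(str(archive), str(target))
        unpacked.append(target)
    return unpacked
```

For example, `unpack_archives("downloads")` would extract a hypothetical `downloads/scannet.zip` into `data/scannet`.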
Before training, please download the backbone checkpoint and save it under work_dirs/tmp.
To train UniDet3D on 6 datasets jointly, simply run the training script:
python tools/train.py configs/unidet3d_1xb8_scannet_s3dis_multiscan_3rscan_scannetpp_arkitscenes.py
UniDet3D can also be trained on individual datasets; for example, we provide a config for training on ScanNet only.
To test a trained model, you can run the testing script:
python tools/test.py configs/unidet3d_1xb8_scannet_s3dis_multiscan_3rscan_scannetpp_arkitscenes.py \
work_dirs/unidet3d_1xb8_scannet_s3dis_multiscan_3rscan_scannetpp_arkitscenes/epoch_1024.pth
UniDet3D can also be tested on individual datasets. To this end, simply remove the unwanted datasets from val_dataloader.dataset.datasets in the config file.
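As an illustration of that edit, the sketch below filters a list of dataset entries in a config-like dictionary. It uses plain Python dicts so that it runs without mmdetection3d. The helper `keep_datasets`, the assumption that each entry carries a `type` key, and all dataset type names are hypothetical placeholders; use the entries actually present in the config file.

```python
import copy


def keep_datasets(val_dataloader, wanted_types):
    """Return a copy of val_dataloader that keeps only the wanted datasets.

    Assumption: val_dataloader["dataset"]["datasets"] is a list of dicts and
    each dict identifies its dataset through a "type" key.
    """
    cfg = copy.deepcopy(val_dataloader)  # leave the original config untouched
    cfg["dataset"]["datasets"] = [
        d for d in cfg["dataset"]["datasets"] if d["type"] in wanted_types
    ]
    return cfg


# Hypothetical stand-in for the val_dataloader section of the config.
val_dataloader = dict(
    batch_size=1,
    dataset=dict(
        type="HypotheticalConcatDataset",
        datasets=[
            dict(type="HypotheticalScanNetDataset"),
            dict(type="HypotheticalS3DISDataset"),
            dict(type="HypotheticalMultiScanDataset"),
        ],
    ),
)

scannet_only = keep_datasets(val_dataloader, {"HypotheticalScanNetDataset"})
print([d["type"] for d in scannet_only["dataset"]["datasets"]])
```

In practice it is usually simpler to delete or comment out the unwanted entries directly in the config file, as described above.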
To visualize ground truth and predicted boxes, run the testing script with additional arguments:
python tools/test.py configs/unidet3d_1xb8_scannet_s3dis_multiscan_3rscan_scannetpp_arkitscenes.py \
work_dirs/unidet3d_1xb8_scannet_s3dis_multiscan_3rscan_scannetpp_arkitscenes/latest.pth --show \
--show-dir work_dirs/unidet3d_1xb8_scannet_s3dis_multiscan_3rscan_scannetpp_arkitscenes
You can also set score_thr in configs to 0.3 for better visualizations.
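To show what a score threshold does to the visualized output, here is a small sketch that drops low-confidence predictions. It is an illustration only: the function `filter_by_score` is hypothetical, boxes are treated as opaque values, and whether the actual implementation compares with `>` or `>=` is an assumption.

```python
def filter_by_score(boxes, scores, score_thr=0.3):
    """Keep only the predictions whose confidence reaches score_thr.

    Assumption: boxes and scores are parallel sequences, one score per box.
    """
    kept = [(box, score) for box, score in zip(boxes, scores) if score >= score_thr]
    kept_boxes = [box for box, _ in kept]
    kept_scores = [score for _, score in kept]
    return kept_boxes, kept_scores


# Three hypothetical predictions; only the confident ones survive.
boxes = ["chair", "table", "sofa"]
scores = [0.92, 0.12, 0.45]
print(filter_by_score(boxes, scores, score_thr=0.3))
```

A higher threshold yields fewer, more reliable boxes in the rendered scenes, at the cost of hiding some correct but low-scoring detections.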
Please refer to the UniDet3D checkpoint and log file. The corresponding metrics are given below (they might slightly deviate from the values reported in the paper due to the randomized training/testing procedure).
Dataset | mAP25 | mAP50 |
---|---|---|
ScanNet | 77.0 | 65.9 |
ARKitScenes | 60.1 | 47.2 |
S3DIS | 76.7 | 65.3 |
MultiScan | 62.6 | 52.3 |
3RScan | 63.6 | 44.9 |
ScanNet++ | 24.0 | 16.8 |
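mAP25 and mAP50 denote mean average precision at 3D IoU thresholds of 0.25 and 0.5, respectively. The sketch below shows how the IoU of two boxes relates to these thresholds. It assumes axis-aligned boxes given as `(xmin, ymin, zmin, xmax, ymax, zmax)`, which is a simplification chosen for illustration; the evaluation code in this repository may use a different box parameterization.

```python
def iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    # Overlap along each axis, clamped at zero when the boxes do not intersect.
    overlap = [
        max(0.0, min(box_a[i + 3], box_b[i + 3]) - max(box_a[i], box_b[i]))
        for i in range(3)
    ]
    intersection = overlap[0] * overlap[1] * overlap[2]

    def volume(box):
        return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])

    union = volume(box_a) + volume(box_b) - intersection
    return intersection / union if union > 0 else 0.0


# Two unit cubes shifted by half a unit along x.
ground_truth = (0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
prediction = (0.5, 0.0, 0.0, 1.5, 1.0, 1.0)
iou = iou_3d(ground_truth, prediction)
print(iou >= 0.25, iou >= 0.5)
```

In this example the intersection volume is 0.5 and the union is 1.5, so the IoU is 1/3: the prediction counts as a match at the 0.25 threshold but not at 0.5.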
If you find this work useful for your research, please cite our paper:
@article{kolodiazhnyi2024unidet3d,
title={UniDet3D: Multi-dataset Indoor 3D Object Detection},
author={Kolodiazhnyi, Maxim and Vorontsova, Anna and Skripkin, Matvey and Rukhovich, Danila and Konushin, Anton},
journal={arXiv preprint arXiv:2409.04234},
year={2024}
}