Official code for 4D-StOP (ECCV 2022 AVVision Workshop)!
4D-StOP: Panoptic Segmentation of 4D LiDAR using Spatio-temporal Object Proposal Generation and Aggregation
Lars Kreuzberg, Idil Esen Zulfikar, Sabarinath Mahadevan, Francis Engelmann and Bastian Leibe
ECCV 2022 AVVision Workshop | Paper
conda create --name <env> --file requirements.txt
cd cpp_wrappers
sh compile_wrappers.sh
cd pointnet2
python setup.py install
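Before compiling, it can help to confirm that PyTorch sees your CUDA installation, since both the cpp_wrappers and pointnet2 extensions are built against it (this check is an optional addition, not part of the original setup steps):

```python
# Optional sanity check before building the C++/CUDA extensions:
# both cpp_wrappers and pointnet2 require a CUDA-enabled PyTorch install.
import torch
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())
```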
Download the SemanticKITTI dataset with labels from here.
Add the semantic-kitti.yaml file to the dataset root folder (see the folder structure below).
Create additional labels using utils/create_center_label.py.
Folder structure:
SemanticKitti/
├── semantic-kitti.yaml
└── sequences/
    └── 00/
        ├── calib.txt
        ├── poses.txt
        ├── times.txt
        ├── labels/
        │   ├── 000000.label
        │   ├── 000000.center.npy
        │   └── ...
        └── velodyne/
            ├── 000000.bin
            └── ...
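For reference, here is a minimal sketch of reading the per-frame files listed above. The .bin and .label layouts are the standard SemanticKITTI formats; the .center.npy file is the additional label written by utils/create_center_label.py (paths and variable names below are illustrative):

```python
import numpy as np

seq_dir = "SemanticKitti/sequences/00"   # adjust to your dataset path
frame = "000000"

# LiDAR scan: N x 4 float32 values (x, y, z, remission)
points = np.fromfile(f"{seq_dir}/velodyne/{frame}.bin", dtype=np.float32).reshape(-1, 4)

# SemanticKITTI label: one uint32 per point,
# lower 16 bits = semantic class, upper 16 bits = instance id
label = np.fromfile(f"{seq_dir}/labels/{frame}.label", dtype=np.uint32)
semantic = label & 0xFFFF
instance = label >> 16

# Additional center label produced by utils/create_center_label.py
center = np.load(f"{seq_dir}/labels/{frame}.center.npy")

print(points.shape, semantic.shape, instance.shape, center.shape)
```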
Use train_SemanticKitti.py for training. Adapt the config parameters as needed; in particular, set the paths for the dataset folder, checkpoint folders, etc. In the experiments in our paper, we first train the model for 800 epochs with config.pre_train = True. Then we train for a further 300 epochs with config.pre_train = False and config.freeze = True. We train our models on a single NVIDIA A40 (48GB) GPU.
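A hedged sketch of this two-stage schedule, expressed through the config object in train_SemanticKitti.py (only pre_train and freeze are named above; the max_epoch attribute and the stand-in namespace are assumptions, so check the actual Config class in the script):

```python
from types import SimpleNamespace

# Stand-in for the config object used in train_SemanticKitti.py;
# attribute names other than pre_train and freeze are assumptions.
config = SimpleNamespace()

# Stage 1: 800 epochs with pre-training enabled
config.pre_train = True
config.freeze = False
config.max_epoch = 800

# Stage 2: a further 300 epochs, resuming from the stage-1 checkpoint,
# with pre-training disabled and the relevant layers frozen
config.pre_train = False
config.freeze = True
config.max_epoch = 300
```

The dataset and checkpoint paths mentioned above also need to be set on the same config object before launching training.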
We provide an example script in jobscript_test.sh; you need to adapt the paths in it. It executes test_models.py to generate the semantic and instance predictions within a 4D volume. In test_models.py, you need to set the config parameters and choose the model you want to test. To track instances across 4D volumes, stitch_tracklets.py is executed. To get the evaluation results, utils/evaluate_4dpanoptic.py is used. We test our models on a single NVIDIA TitanX (12GB) GPU.
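Summarized as a command sequence (a sketch only; the exact arguments, paths, and config settings are handled in jobscript_test.sh and inside the scripts themselves):

```bash
# Sketch of the evaluation pipeline order; adapt paths/config inside each script.
python test_models.py                 # semantic + instance predictions per 4D volume
python stitch_tracklets.py            # associate instances across 4D volumes
python utils/evaluate_4dpanoptic.py   # compute the 4D panoptic evaluation results
```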
You can find a trained model for the 2-scan setup here.
If you find our work useful in your research, please consider citing:
@inproceedings{kreuzberg2022stop,
title={4D-StOP: Panoptic Segmentation of 4D LiDAR using Spatio-temporal Object Proposal Generation and Aggregation},
author={Kreuzberg, Lars and Zulfikar, Idil Esen and Mahadevan, Sabarinath and Engelmann, Francis and Leibe, Bastian},
booktitle={European Conference on Computer Vision Workshop},
year={2022}
}
The code is based on the PyTorch implementations of 4D-PLS, KPConv and VoteNet.