This is the repository for the paper "StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks". [Project Website] [arXiv] [Supplementary Video]
We would like to thank the authors of MobileStereoNet, PSMNet, and GwcNet for generously sharing their codebases. Several components of our code (dataloader, model components, etc.) are based on their code.
The code is tested with the following libraries. Other versions might also work, but this is not guaranteed:
StereoVoxelNet is trained and evaluated using the DrivingStereo dataset. Please download its training and testing data and extract them to any folder you prefer.
We collected a dataset for fine-tuning.
Our dataset is shared via Globus.
Because Globus might currently have some issues, we have migrated the dataset to Google Drive.
Note: You can try any sample from the DrivingStereo dataset. However, evaluating a DrivingStereo-trained model on samples from another dataset (e.g., KITTI) might lead to degraded performance.
Please enter the visualization folder:
cd scripts/net/visualization/
For comparison with the other three approaches, run
python visualize_compare.py
To visualize the hierarchical output, run
python visualize_hie.py
To obtain the same results as in the paper, like this:

| Cost Volume | CD | IoU |
|---|---|---|
| 48 Levels | 3.56 | 0.32 |
| 24 Levels | 2.96 | 0.34 |
| 12 Levels | 3.05 | 0.33 |
| Voxel (Ours) | 2.40 | 0.35 |

Under the ./net/ folder, after changing DATAPATH, DATALIST, and ckpt_path, you can run
python test.py
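
For readers unfamiliar with the CD and IoU columns reported above, the following is a minimal sketch of how such metrics are commonly computed on occupancy voxels, assuming CD denotes the Chamfer distance. It is not the evaluation code used by test.py. The function names, the representation of a grid as a set of occupied (x, y, z) indices, and the particular Chamfer variant (sum of the two mean nearest-neighbour distances, in voxel units) are assumptions made for illustration, and the paper's exact definitions may differ.

```python
import math


def voxel_iou(pred, gt):
    """IoU of two sets of occupied voxel indices, e.g. {(x, y, z), ...}."""
    union = pred | gt
    if not union:
        return 1.0  # assumption: two empty grids count as a perfect match
    return len(pred & gt) / len(union)


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance (voxel units) between two non-empty sets."""

    def one_way(src, dst):
        # Mean distance from each voxel in src to its nearest voxel in dst.
        return sum(min(math.dist(s, d) for d in dst) for s in src) / len(src)

    return one_way(pred, gt) + one_way(gt, pred)


# Toy example: the two grids share one voxel and differ in one voxel each.
gt = {(1, 1, 1), (1, 1, 2)}
pred = {(1, 1, 1), (2, 1, 2)}
print("IoU:", voxel_iou(pred, gt))
print("CD:", chamfer_distance(pred, gt))
```

A lower CD and a higher IoU both indicate a closer match between the predicted and ground-truth occupancy. For full-size grids, the brute-force nearest-neighbour search here would normally be replaced by a vectorized or tree-based implementation.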
To train a model from scratch, run
python train.py --dataset voxelds --datapath PATH_TO_DATASET \
--trainlist PATH_TO_FILENAMES/DS_train.txt \
--testlist PATH_TO_FILENAMES/DS_test.txt \
--epochs 20 --lrepochs "10,16:2" --logdir PATH_TO_LOGS/logs \
--batch_size 4 --test_batch_size 4 --summary_freq 50 --loader_workers 8 \
--cost_vol_type voxel --model Voxel2D
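
The --lrepochs "10,16:2" argument in the command above is easy to misread. The sketch below shows one plausible interpretation, assuming train.py follows the convention of the GwcNet training script this code is based on: the comma-separated numbers before the colon are the epochs at which the learning rate is reduced, and the number after the colon is the factor it is divided by. The helper names and the base learning rate of 0.001 are hypothetical and chosen only for illustration. Whether epochs are counted from 0 or 1 should be checked against train.py.

```python
def parse_lrepochs(spec):
    """Split a spec such as "10,16:2" into ([10, 16], 2.0)."""
    epochs_part, rate_part = spec.split(":")
    return [int(e) for e in epochs_part.split(",")], float(rate_part)


def lr_at_epoch(base_lr, epoch, spec):
    """Learning rate at `epoch`, divided by the rate once per milestone reached."""
    milestones, rate = parse_lrepochs(spec)
    lr = base_lr
    for milestone in milestones:
        if epoch >= milestone:
            lr /= rate
    return lr


# Hypothetical base learning rate; the real default is defined in train.py.
for epoch in (0, 9, 10, 15, 16, 19):
    print(epoch, lr_at_epoch(0.001, epoch, "10,16:2"))
```

Under this reading, a 20-epoch run trains at the base rate until epoch 10, at half of it until epoch 16, and at a quarter of it afterwards.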
If you use this code, please cite this paper:
@inproceedings{li2023stereovoxelnet,
title = {StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks},
author = {Li, Hongyu and Li, Zhengang and Akmandor, Neset Unver and Jiang, Huaizu and Wang, Yanzhi and Padir, Taskin},
booktitle={2023 IEEE International Conference on Robotics and Automation (ICRA)},
year={2023}
}