# Fully Sparse Fusion for 3D Object Detection (TPAMI 2024)

A multi-modal exploration of the paradigm of fully sparse 3D object detection.


## Installation

First, create and activate the conda environment and install PyTorch:

```shell
conda create -n FSF python=3.8 -y
conda activate FSF
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
```

Then, install mmdet3d and the remaining dependencies:

```shell
pip install mmcv-full==1.3.9
pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1

# modified mmdet3d
git clone https://gitee.com/liyingyanUCAS/mmdetection3d.git
cd mmdetection3d
pip install -v -e .

# some other packages
mkdir pkgs && cd pkgs
git clone https://github.com/Abyssaledge/TorchEx.git
cd TorchEx && pip install -v -e .
pip install spconv-cu114
pip install ipdb
pip install torch-scatter==2.0.2
pip install nuscenes-devkit==1.1.3
pip install motmetrics
pip install yapf==0.40.1
```

## Data Preparation

First, make the data directory:

```shell
mkdir data
```

Then, download the nuScenes and Argoverse 2 datasets and organize the data directory as follows:

```
├── data
│   ├── nuscenes
│   │   ├── samples
│   │   │   ├── CAM_BACK
│   │   │   ├── CAM_BACK_LEFT
│   │   │   ├── CAM_BACK_RIGHT
│   │   │   ├── CAM_FRONT
│   │   │   ├── CAM_FRONT_LEFT
│   │   │   ├── CAM_FRONT_RIGHT
│   │   │   ├── LIDAR_TOP
│   │   ├── sweeps
│   │   │   ├── CAM_BACK
│   │   │   ├── CAM_BACK_LEFT
│   │   │   ├── CAM_BACK_RIGHT
│   │   │   ├── CAM_FRONT
│   │   │   ├── CAM_FRONT_LEFT
│   │   │   ├── CAM_FRONT_RIGHT
│   │   │   ├── LIDAR_TOP
│   │   ├── v1.0-train
│   │   ├── v1.0-val
│   │   ├── v1.0-trainval
│   │   ├── nuscenes_infos_train.pkl
│   │   ├── nuscenes_infos_val.pkl
│   │   ├── nuscenes_infos_trainval.pkl
│   ├── argo2
│   │   ├── argo2_format
│   │   │   ├── sensor
│   │   │   │   ├── train
│   │   │   │   │   ├── ...
│   │   │   │   ├── val
│   │   │   │   │   ├── ...
│   │   │   │   ├── test
│   │   │   │   │   ├── 0c6e62d7-bdfa-3061-8d3d-03b13aa21f68
│   │   │   │   │   ├── 0f0cdd79-bc6c-35cd-9d99-7ae2fc7e165c
│   │   │   │   │   ├── ...
│   │   │   │   ├── val_anno.feather
│   │   ├── kitti_format
│   │   │   ├── argo2_infos_train.pkl
│   │   │   ├── argo2_infos_val.pkl
│   │   │   ├── argo2_infos_test.pkl
│   │   │   ├── argo2_infos_trainval.pkl
│   │   │   ├── training
│   │   │   ├── testing
│   │   │   ├── argo2_gt_database
```

For the Argoverse 2 info pickles, you can either use the pickles we provide or generate them yourself by running:

```shell
python tools/AV2/argo2_pickle_mmdet_fusion.py
```

Please download the pretrained models and other files from [Google Drive](https://drive.google.com/drive/folders/1pylLxV6SEw2yf-ZZroKtCPAOKLg6nrUY?usp=drive_link), then organize the ckpt directory as follows:

```
├── ckpt
│   ├── fsd_argo_pretrain.pth
│   ├── fsd_nusc_pretrain.pth
│   ├── htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim_20201008_211222-0b16ac4b.pth
```

Then use our scripts to pre-infer and save the 2D masks:

```shell
./tools/mask_tools/save_mask_nusc.sh
./tools/mask_tools/save_mask_argo2.sh
```
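Before moving on to training, it may help to confirm that the key files from the layouts above are actually in place. The loop below is an optional sanity check, not a script shipped with this repository; extend the file list with anything else you rely on.

```shell
# Optional sanity check (not part of this repo): verify that the expected
# info pickles and pretrained checkpoints exist before launching training.
for f in \
  data/nuscenes/nuscenes_infos_train.pkl \
  data/nuscenes/nuscenes_infos_val.pkl \
  data/argo2/kitti_format/argo2_infos_train.pkl \
  data/argo2/kitti_format/argo2_infos_val.pkl \
  ckpt/fsd_nusc_pretrain.pth \
  ckpt/fsd_argo_pretrain.pth \
  ckpt/htc_x101_64x4d_fpn_dconv_c3-c5_coco-20e_16x1_20e_nuim_20201008_211222-0b16ac4b.pth; do
  if [ -e "$f" ]; then echo "[ok]      $f"; else echo "[missing] $f"; fi
done
```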
## Train and Test

### nuScenes

After the preparation, you can train our model with 8 GPUs on nuScenes using:

```shell
./tools/nusc_train.sh nuScenes/FSF_nuScenes_config 8
```

For testing, please run:

```shell
./tools/dist_test.sh projects/configs/nuScenes/FSF_nuScenes_config.py $CKPT_PATH$ 8
```

### Argoverse 2

For training on Argoverse 2 with 8 GPUs, please run:

```shell
./tools/argo_train.sh Argoverse2/FSF_AV2_config 8
```

For testing, please run:

```shell
./tools/dist_test.sh projects/configs/Argoverse2/FSF_AV2_config.py $CKPT_PATH$ 8
```

## Results

NDS is the nuScenes detection score and CDS is the Argoverse 2 composite detection score, so each of these metrics applies only to its own dataset.

| Dataset  | mAP  | NDS  | CDS  |
|----------|------|------|------|
| nuScenes | 70.8 | 73.2 | -    |
| AV2      | 33.2 | -    | 25.5 |

## Citation

Please consider citing our work as follows if it is helpful.

```
@article{li2024fully,
  title={Fully sparse fusion for 3d object detection},
  author={Li, Yingyan and Fan, Lue and Liu, Yang and Huang, Zehao and Chen, Yuntao and Wang, Naiyan and Zhang, Zhaoxiang},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2024},
  publisher={IEEE}
}
```

## Acknowledgement

This project is based on the following codebases.

* [MMDetection3D](https://github.com/open-mmlab/mmdetection3d)
* [FSD](https://github.com/tusen-ai/SST) (Ours!)