PJLab-ADG / 3DTrans

An open-source codebase for exploring autonomous driving pre-training
Apache License 2.0
3d-da 3d-pretraining


3DTrans: An Open-source Codebase for Continuous Learning towards Autonomous Driving Task

3DTrans includes Transfer Learning Techniques and Scalable Pre-training Techniques for tackling the continuous learning issue on autonomous driving as follows. 1) We implement the Transfer Learning Techniques consisting of four functions:

2) We implement the Scalable Pre-training which can continuously enhance the model performance for the downstream tasks, as more pre-training data are fed into our pre-training network:


News :fire:

We expect this repository will inspire the research of 3D model generalization since it will push the limits of perceptual performance. :tokyo_tower:

Installation for 3DTrans

You may refer to INSTALL.md for the installation of 3DTrans.

Getting Started

Getting Started for ALL Settings

* Please refer to [Readme for Datasets](docs/GETTING_STARTED_DB.md) to prepare the dataset and convert the data into the 3DTrans format. 3DTrans also supports reading and writing data from **Ceph Petrel-OSS**; see the same readme for details.
* Please refer to [Readme for UDA](docs/GETTING_STARTED_UDA.md) for the problem definition of UDA and for performing the UDA adaptation process.
* Please refer to [Readme for ADA](docs/GETTING_STARTED_ADA.md) for the problem definition of ADA and for performing the ADA adaptation process.
* Please refer to [Readme for SSDA](docs/GETTING_STARTED_SSDA.md) for the problem definition of SSDA and for performing the SSDA adaptation process.
* Please refer to [Readme for MDF](docs/GETTING_STARTED_MDF.md) for the problem definition of MDF and for performing the MDF joint-training process.
* Please refer to [Readme for ReSimAD](docs/GETTING_STARTED_ReSim.md) for the [ReSimAD implementation](https://arxiv.org/abs/2309.05527).
* Please refer to [Readme for AD-PT Pre-training](docs/GETTING_STARTED_PRETRAIN.md) to start 3D perception pre-training using AD-PT.
* Please refer to [Readme for PointContrast Pre-training](docs/GETTING_STARTED_PRETRAIN_PC.md) for 3D perception pre-training using PointContrast.

Model Zoo

We cannot provide the Waymo-related pretrained models because of the Waymo Dataset License Agreement, but you can reach similar performance by training with the corresponding configs.

Domain Transfer Results

UDA Results

Here, we report the cross-dataset (Waymo-to-KITTI) adaptation results using the BEV/3D AP performance as the evaluation metric. Please refer to [Readme for UDA](docs/GETTING_STARTED_UDA.md) for experimental results of more cross-domain settings.

* All LiDAR-based models are trained with 4 NVIDIA A100 GPUs and are available for download.
* For Waymo dataset training, we train the model using 20% data.
* The domain adaptation time is measured with 4 NVIDIA A100 GPUs and PyTorch 1.8.1.
* Pre-SN represents that we perform the [SN (statistical normalization)](https://arxiv.org/abs/2005.08139) operation during the pre-training source-only model stage.
* Post-SN represents that we perform the [SN (statistical normalization)](https://arxiv.org/abs/2005.08139) operation during the adaptation stage.

| | training time | Adaptation | Car@R40 | download |
|---|---:|:---:|:---:|:---:|
| [PointPillar](tools/cfgs/DA/waymo_kitti/source_only/pointpillar_1x_feat_3_vehi.yaml) | ~7.1 hours | Source-only with SN | 74.98 / 49.31 | - |
| [PointPillar](tools/cfgs/DA/waymo_kitti/pointpillar_1x_pre_SN_feat_3.yaml) | ~0.6 hours | Pre-SN | 81.71 / 57.11 | [model-57M](https://drive.google.com/file/d/1tPx8N75sm_zWsZv3FrwtHXeBlorhf9nP/view?usp=share_link) |
| [PV-RCNN](tools/cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor_sn_kitti.yaml) | ~23 hours | Source-only with SN | 69.92 / 60.17 | - |
| [PV-RCNN](tools/cfgs/DA/waymo_kitti/source_only/pvrcnn_feat_3_vehi.yaml) | ~23 hours | Source-only | 74.42 / 40.35 | - |
| [PV-RCNN](tools/cfgs/DA/waymo_kitti/pvrcnn_pre_SN_feat_3.yaml) | ~3.5 hours | Pre-SN | 84.00 / 74.57 | [model-156M](https://drive.google.com/file/d/1yt1JtBWyBtZjgE22HJUz6L7K6qeWiqM5/view?usp=share_link) |
| [PV-RCNN](tools/cfgs/DA/waymo_kitti/pvrcnn_post_SN_feat_3.yaml) | ~1 hour | Post-SN | 84.94 / 75.20 | [model-156M](https://drive.google.com/file/d/1hd49JZ5amwP2gkblA8IHnH79ITapX2hF/view?usp=share_link) |
| [Voxel R-CNN](tools/cfgs/DA/waymo_kitti/source_only/voxel_rcnn_sn_kitti.yaml) | ~16 hours | Source-only with SN | 75.83 / 55.50 | - |
| [Voxel R-CNN](tools/cfgs/DA/waymo_kitti/source_only/voxel_rcnn_feat_3_vehi.yaml) | ~16 hours | Source-only | 64.88 / 19.90 | - |
| [Voxel R-CNN](tools/cfgs/DA/waymo_kitti/voxel_rcnn_pre_SN_feat_3.yaml) | ~2.5 hours | Pre-SN | 82.56 / 67.32 | [model-201M](https://drive.google.com/file/d/1_D7bnECL7bHL_4WOPhxAprHHmxwC8M7U/view?usp=share_link) |
| [Voxel R-CNN](tools/cfgs/DA/waymo_kitti/voxel_rcnn_post_SN_feat_3.yaml) | ~2.2 hours | Post-SN | 85.44 / 76.78 | [model-201M](https://drive.google.com/file/d/1v0U3Y9K6pe4JaOC5PIECq_wnR77Il-tl/view?usp=share_link) |
| [PV-RCNN++](tools/cfgs/DA/waymo_kitti/source_only/pv_rcnn_plus_sn_kitti.yaml) | ~20 hours | Source-only with SN | 67.22 / 56.50 | - |
| [PV-RCNN++](tools/cfgs/DA/waymo_kitti/source_only/pv_rcnn_plus_feat_3_vehi_full_train.yaml) | ~20 hours | Source-only | 67.68 / 20.82 | - |
| [PV-RCNN++](tools/cfgs/DA/waymo_kitti/pv_rcnn_plus_post_SN_feat_3.yaml) | ~2.2 hours | Post-SN | 86.86 / 79.86 | [model-193M](https://drive.google.com/file/d/1wDNC5kyg8BihV4zEgY2VntA2V_3jeL-5/view?usp=share_link) |
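
Each Car@R40 cell in the UDA table packs two numbers, BEV AP and 3D AP, separated by a slash. As a reading aid, the following minimal sketch parses such cells and computes how much an adapted model gains over its source-only baseline. The helper names `parse_bev_3d` and `adaptation_gain` are hypothetical and are not part of 3DTrans; the input values are copied from the PV-RCNN rows of the table.

```python
from typing import Tuple


def parse_bev_3d(cell: str) -> Tuple[float, float]:
    """Split a 'BEV / 3D' table cell such as '84.94 / 75.20' into two floats."""
    bev, ap3d = (float(part) for part in cell.split("/"))
    return bev, ap3d


def adaptation_gain(source_only: str, adapted: str) -> Tuple[float, float]:
    """Return the (BEV, 3D) AP gain of an adapted model over its baseline."""
    src_bev, src_3d = parse_bev_3d(source_only)
    ada_bev, ada_3d = parse_bev_3d(adapted)
    # Round to two decimals, matching the precision reported in the table.
    return round(ada_bev - src_bev, 2), round(ada_3d - src_3d, 2)


# PV-RCNN, Waymo-to-KITTI: Source-only row vs. Post-SN row.
pvrcnn_gain = adaptation_gain("74.42 / 40.35", "84.94 / 75.20")
print(pvrcnn_gain)
```

For these two rows the gap is much larger in 3D AP than in BEV AP, which is why the two numbers are worth reading separately.
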
ADA Results

Here, we report the Waymo-to-KITTI adaptation results using the BEV/3D AP performance. Please refer to [Readme for ADA](docs/GETTING_STARTED_ADA.md) for experimental results of more cross-domain settings.

* All LiDAR-based models are trained with 4 NVIDIA A100 GPUs and are available for download.
* For Waymo dataset training, we train the model using 20% data.
* The domain adaptation time is measured with 4 NVIDIA A100 GPUs and PyTorch 1.8.1.

| | training time | Adaptation | Car@R40 | download |
|---|---|---|---|---|
| [PV-RCNN](tools/cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor.yaml) | ~23h@4 A100 | Source Only | 67.95 / 27.65 | - |
| [PV-RCNN](tools/cfgs/ADA/waymo-kitti/pvrcnn/active_dual_target_01.yaml) | ~1.5h@2 A100 | Bi3D (1% annotation budget) | 87.12 / 78.03 | [Model-58M](https://drive.google.com/file/d/1zCpZRXQx3j_64HafplLpose4a6gDR6nS/view?usp=sharing) |
| [PV-RCNN](tools/cfgs/ADA/waymo-kitti/pvrcnn/active_dual_target_05.yaml) | ~10h@2 A100 | Bi3D (5% annotation budget) | 89.53 / 81.32 | [Model-58M](https://drive.google.com/file/d/1hbso78eIXyYse8Hv1bvz5FLXkCzva7vb/view?usp=sharing) |
| [PV-RCNN](tools/cfgs/ADA/waymo-kitti/pvrcnn/active_TQS.yaml) | ~1.5h@2 A100 | TQS | 82.00 / 72.04 | [Model-58M](https://drive.google.com/file/d/12rkTyCTtmQniZSuEcMC8w68f2bx3WjLK/view?usp=sharing) |
| [PV-RCNN](tools/cfgs/ADA/waymo-kitti/pvrcnn/active_CLUE.yaml) | ~1.5h@2 A100 | CLUE | 82.13 / 73.14 | [Model-50M](https://drive.google.com/file/d/1kEiaskXkUMryBi7oSynr9PoCVZmzjdry/view?usp=sharing) |
| [PV-RCNN](tools/cfgs/ADA/waymo-kitti/pvrcnn/active_st3d.yaml) | ~10h@2 A100 | Bi3D+ST3D | 87.83 / 81.23 | [Model-58M](https://drive.google.com/file/d/1MPL9l1iVCchuhv2wGW6mLqU8tOLJUb-e/view?usp=sharing) |
| [Voxel R-CNN](tools/cfgs/DA/waymo_kitti/source_only/voxel_rcnn_feat_3_vehi.yaml) | ~16h@4 A100 | Source Only | 64.87 / 19.90 | - |
| [Voxel R-CNN](tools/cfgs/DA/waymo_kitti/source_only/pvrcnn_old_anchor_sn_kitti.yaml) | ~1.5h@2 A100 | Bi3D (1% annotation budget) | 88.09 / 79.14 | [Model-72M](https://drive.google.com/file/d/1F9RlK8z-WtOEHN9RIZt9uuzk4p5PGXBw/view?usp=sharing) |
| [Voxel R-CNN](tools/cfgs/ADA/waymo-kitti/voxelrcnn/active_dual_target_05.yaml) | ~6h@2 A100 | Bi3D (5% annotation budget) | 90.18 / 81.34 | [Model-72M](https://drive.google.com/file/d/1coUt-R9AatKxE_DrWfYw0Y-nBdlDmoBU/view?usp=sharing) |
| [Voxel R-CNN](tools/cfgs/ADA/waymo-kitti/voxelrcnn/active_TQS.yaml) | ~1.5h@2 A100 | TQS | 78.26 / 67.11 | [Model-72M](https://drive.google.com/file/d/1ByIEVQ9rn8mSXoyE8yY4441LduNNkqB-/view?usp=sharing) |
| [Voxel R-CNN](tools/cfgs/ADA/waymo-kitti/voxelrcnn/active_CLUE.yaml) | ~1.5h@2 A100 | CLUE | 81.93 / 70.89 | [Model-72M](https://drive.google.com/file/d/1wDlmR9rqHna7zQSOb5ktf3bB0S1xVO_e/view?usp=sharing) |
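
The 1% and 5% annotation budgets in the ADA table cap how many target-domain frames are sent for labeling. The sketch below illustrates only that budgeting step, assuming every unlabeled frame already carries a scalar informativeness score. How frames are scored is method-specific and is not reproduced here; `select_for_annotation`, the frame ids, and the score values are all hypothetical.

```python
import math
from typing import Dict, List


def select_for_annotation(scores: Dict[str, float], budget: float) -> List[str]:
    """Pick the top-scoring frames, limited to `budget` (a fraction) of the pool."""
    if not 0.0 < budget <= 1.0:
        raise ValueError("budget must be a fraction in (0, 1]")
    # Always label at least one frame, even for very small pools.
    num_selected = max(1, math.floor(len(scores) * budget))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:num_selected]


# Hypothetical pool of 200 target frames with made-up, distinct scores.
pool = {f"frame_{i:03d}": (i * 37 % 200) / 200 for i in range(200)}
picked_1pct = select_for_annotation(pool, budget=0.01)
picked_5pct = select_for_annotation(pool, budget=0.05)
print(len(picked_1pct), len(picked_5pct))
```
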
SSDA Results

We report the target domain results on Waymo-to-nuScenes adaptation using the BEV/3D AP performance as the evaluation metric, and Waymo-to-ONCE adaptation using the ONCE evaluation metric. Please refer to [Readme for SSDA](docs/GETTING_STARTED_SSDA.md) for experimental results of more cross-domain settings.

* The domain adaptation time is measured with 4 NVIDIA A100 GPUs and PyTorch 1.8.1.
* For Waymo dataset training, we train the model using 20% data.
* second_5%_FT denotes that we use 5% nuScenes training data to fine-tune the Second model.
* second_5%_SESS denotes that we utilize the [SESS: Self-Ensembling Semi-Supervised](https://arxiv.org/abs/1912.11803) method to adapt our baseline model.
* second_5%_PS denotes that we fine-tune the source-only model to nuScenes datasets using 5% labeled data, and perform the pseudo-labeling process on the remaining 95% unlabeled nuScenes data.

| | training time | Adaptation | Car@R40 | download |
|---|---:|:---:|:---:|:---:|
| [Second](tools/cfgs/SSDA/waymo_nusc/source_only/second_feat_3_vehi.yaml) | ~11 hours | source-only(Waymo) | 27.85 / 16.43 | - |
| [Second](tools/cfgs/SSDA/waymo_nusc/second/second_feat_3_vehi_05_finetune.yaml) | ~0.4 hours | second_5%_FT | 45.95 / 26.98 | [model-61M](https://drive.google.com/file/d/1JIVqpw2cAL8z6wZwoBeJny9-jhFsee_i/view?usp=share_link) |
| [Second](tools/cfgs/SSDA/waymo_nusc/second/second_feat_3_vehi_05_sess.yaml) | ~1.8 hours | second_5%_SESS | 47.77 / 28.74 | [model-61M](https://drive.google.com/file/d/15kRtg2Cq-cLtMzvm2urENBYw11knjQzA/view?usp=share_link) |
| [Second](tools/cfgs/SSDA/waymo_nusc/second/second_feat_3_vehi_05_ps.yaml) | ~1.7 hours | second_5%_PS | 47.72 / 29.37 | [model-61M](https://drive.google.com/file/d/1MMOEuKyRhymHQwEk8-ow78sXE_n9-iRv/view?usp=share_link) |
| [PV-RCNN](tools/cfgs/SSDA/waymo_nusc/source_only/pvrcnn_feat_3_vehi.yaml) | ~24 hours | source-only(Waymo) | 40.31 / 23.32 | - |
| [PV-RCNN](tools/cfgs/SSDA/waymo_nusc/pvrcnn/pvrcnn_feat_3_vehi_05_finetune.yaml) | ~1.0 hours | pvrcnn_5%_FT | 49.58 / 34.86 | [model-150M](https://drive.google.com/file/d/19k8_DGDmwy93Rw9W1nJlGYehUm-nyB1D/view?usp=share_link) |
| [PV-RCNN](tools/cfgs/SSDA/waymo_nusc/pvrcnn/pvrcnn_feat_3_vehi_05_sess.yaml) | ~5.5 hours | pvrcnn_5%_SESS | 49.92 / 35.28 | [model-150M](https://drive.google.com/file/d/1K8qZkLhAPjUTBzVbcHeh0Hb7To17ojN1/view?usp=share_link) |
| [PV-RCNN](tools/cfgs/SSDA/waymo_nusc/pvrcnn/pvrcnn_feat_3_vehi_05_ps.yaml) | ~5.4 hours | pvrcnn_5%_PS | 49.84 / 35.07 | [model-150M](https://drive.google.com/file/d/1Hh7OQY2thhrxMCRxpr6Si8Utnf-yOvUy/view?usp=share_link) |
| [PV-RCNN++](tools/cfgs/SSDA/waymo_nusc/source_only/pvplus_feat_3_vehi.yaml) | ~16 hours | source-only(Waymo) | 31.96 / 19.81 | - |
| [PV-RCNN++](tools/cfgs/SSDA/waymo_nusc/pvplus/pvplus_feat_3_vehi_05_finetune.yaml) | ~1.2 hours | pvplus_5%_FT | 49.94 / 34.28 | [model-185M](https://drive.google.com/file/d/1VTSic0I2T_k_Y-Tz64biMXDsj4N5vUF4/view?usp=share_link) |
| [PV-RCNN++](tools/cfgs/SSDA/waymo_nusc/pvplus/pvplus_feat_3_vehi_05_sess.yaml) | ~4.2 hours | pvplus_5%_SESS | 51.14 / 35.25 | [model-185M](https://drive.google.com/file/d/1lONnkK73dTj5CGNzIyssmkHKhsCZaNxS/view?usp=share_link) |
| [PV-RCNN++](tools/cfgs/SSDA/waymo_nusc/pvplus/pvplus_feat_3_vehi_05_ps.yaml) | ~3.6 hours | pvplus_5%_PS | 50.84 / 35.39 | [model-185M](https://drive.google.com/file/d/1wtV3OjkFXMPNHez9X4EPSFQAyBhYKei3/view?usp=share_link) |

* For Waymo-to-ONCE adaptation, we employ 8 NVIDIA A100 GPUs for model training.
* PS denotes that we pseudo-label the unlabeled ONCE and re-train the model on pseudo-labeled data.
* SESS denotes that we utilize the [SESS](https://arxiv.org/abs/1912.11803) method to adapt the baseline.
* For ONCE, the IoU thresholds for evaluation are 0.7, 0.3, 0.5 for Vehicle, Pedestrian, Cyclist.

| | Training ONCE Data | Methods | Vehicle@AP | Pedestrian@AP | Cyclist@AP | download |
|---|---:|:---:|:---:|:---:|:---:|:---:|
| [Centerpoint](tools/cfgs/once_models/sup_models/centerpoint.yaml) | Labeled (4K) | Train from scratch | 74.93 | 46.21 | 67.36 | [model-96M](https://drive.google.com/file/d/1KxgDaUpph72a18t0i9ceyXrkfvNWRWBE/view?usp=share_link) |
| [Centerpoint_Pede](tools/cfgs/once_models/sup_models/centerpoint_pede_0075.yaml) | Labeled (4K) | PS | - | 49.14 | - | [model-96M](https://drive.google.com/file/d/19-LN7PkkpIMoBIqV8gydghrpkKJo9LS7/view?usp=share_link) |
| [PV-RCNN++](tools/cfgs/once_models/sup_models/pv_rcnn_plus_anchor_3CLS.yaml) | Labeled (4K) | Train from scratch | 79.78 | 35.91 | 63.18 | [model-188M](https://drive.google.com/file/d/187AomgxaRBTFpm3YqJ_UXp2Lg13t9OVs/view?usp=share_link) |
| [PV-RCNN++](tools/cfgs/once_models/semi_learning_models/mt_pv_rcnn_plus_anchor_3CLS_small.yaml) | Small Dataset (100K) | SESS | 80.02 | 46.24 | 66.41 | [model-188M](https://drive.google.com/file/d/1hEPwnwZVKSmPDE-7XO45dFMoOTanD-n1/view?usp=share_link) |
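
The PS settings above rely on pseudo-labeling: a model's own predictions on unlabeled frames are reused as training labels. A minimal sketch of the filtering step follows, assuming predictions arrive as dicts with `name` and `score` keys and that a single fixed confidence threshold is applied. The data layout, the function names, and the 0.6 threshold are illustrative assumptions, not the settings used in the 3DTrans configs.

```python
from typing import Dict, List

Prediction = Dict[str, object]


def filter_pseudo_labels(predictions: List[Prediction], min_score: float = 0.6) -> List[Prediction]:
    """Keep only detections confident enough to be reused as training labels."""
    return [box for box in predictions if box["score"] >= min_score]


def pseudo_label_frames(
    frame_predictions: Dict[str, List[Prediction]], min_score: float = 0.6
) -> Dict[str, List[Prediction]]:
    """Map frame id -> retained pseudo-labels, dropping frames with none left."""
    labeled = {}
    for frame_id, predictions in frame_predictions.items():
        kept = filter_pseudo_labels(predictions, min_score)
        if kept:
            labeled[frame_id] = kept
    return labeled


# Made-up predictions on two unlabeled frames.
frame_predictions = {
    "frame_a": [{"name": "Car", "score": 0.91}, {"name": "Car", "score": 0.35}],
    "frame_b": [{"name": "Car", "score": 0.20}],
}
pseudo_labels = pseudo_label_frames(frame_predictions)
print(pseudo_labels)
```

A higher threshold yields fewer but cleaner pseudo-labels; the retained frames would then be mixed with the labeled split for re-training.
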
MDF Results

Here, we report the Waymo-and-nuScenes consolidation results. The models are jointly trained on Waymo and nuScenes datasets, and evaluated on Waymo using the mAP/mAPH LEVEL_2 and nuScenes using the BEV/3D AP. Please refer to [Readme for MDF](docs/GETTING_STARTED_MDF.md) for more results.

* All LiDAR-based models are trained with 8 NVIDIA A100 GPUs and are available for download.
* The multi-domain dataset fusion (MDF) training time is measured with 8 NVIDIA A100 GPUs and PyTorch 1.8.1.
* For Waymo dataset training, we train the model using 20% training data to save training time.
* PV-RCNN-nuScenes represents that we train the PV-RCNN model only using the nuScenes dataset, and PV-RCNN-DM indicates that we merge the Waymo and nuScenes datasets and train on the merged dataset. PV-RCNN-DT denotes the domain attention-aware multi-dataset training.

| Baseline | MDF Methods | Waymo@Vehicle | Waymo@Pedestrian | Waymo@Cyclist | nuScenes@Car | nuScenes@Pedestrian | nuScenes@Cyclist |
|---|---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [PV-RCNN-nuScenes](./tools/cfgs/MDF/waymo_nusc/only_nusc/pvrcnn_feat_3_SWEEP_10_gt.yaml) | only nuScenes | 35.59 / 35.21 | 3.95 / 2.55 | 0.94 / 0.92 | 57.78 / 41.10 | 24.52 / 18.56 | 10.24 / 8.25 |
| [PV-RCNN-Waymo](./tools/cfgs/MDF/waymo_nusc/only_waymo/pvrcnn_feat_3_3CLS_gt.yaml) | only Waymo | 66.49 / 66.01 | 64.09 / 58.06 | 62.09 / 61.02 | 32.99 / 17.55 | 3.34 / 1.94 | 0.02 / 0.01 |
| [PV-RCNN-DM](./tools/cfgs/MDF/waymo_nusc/multi_db_pvrcnn_feat_3_merged.yaml) | Direct Merging | 57.82 / 57.40 | 48.24 / 42.81 | 54.63 / 53.64 | 48.67 / 30.43 | 12.66 / 8.12 | 1.67 / 1.04 |
| [PV-RCNN-Uni3D](./tools/cfgs/MDF/waymo_nusc/waymo_nusc_pvrcnn_feat_3_uni3d.yaml) | Uni3D | 66.98 / 66.50 | 65.70 / 59.14 | 61.49 / 60.43 | 60.77 / 42.66 | 27.44 / 21.85 | 13.50 / 11.87 |
| [PV-RCNN-DT](./tools/cfgs/MDF/waymo_nusc/waymo_nusc_pvrcnn_feat_3_domain_attention.yaml) | Domain Attention | 67.27 / 66.77 | 65.86 / 59.38 | 61.38 / 60.34 | 60.83 / 43.03 | 27.46 / 22.06 | 13.82 / 11.52 |

| Baseline | MDF Methods | Waymo@Vehicle | Waymo@Pedestrian | Waymo@Cyclist | nuScenes@Car | nuScenes@Pedestrian | nuScenes@Cyclist |
|---|---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [Voxel-RCNN-nuScenes](./tools/cfgs/MDF/waymo_nusc/only_nusc/voxel_rcnn_feat_3_SWEEP_10_gt.yaml) | only nuScenes | 31.89 / 31.65 | 3.74 / 2.57 | 2.41 / 2.37 | 53.63 / 39.05 | 22.48 / 17.85 | 10.86 / 9.70 |
| [Voxel-RCNN-Waymo](./tools/cfgs/MDF/waymo_nusc/only_waymo/voxel_rcnn_feat_3_3CLS_gt.yaml) | only Waymo | 67.05 / 66.41 | 66.75 / 60.83 | 63.13 / 62.15 | 34.10 / 17.31 | 2.99 / 1.69 | 0.05 / 0.01 |
| [Voxel-RCNN-DM](./tools/cfgs/MDF/waymo_nusc/multi_db_voxel_rcnn_feat_3_merged.yaml) | Direct Merging | 58.26 / 57.87 | 52.72 / 47.11 | 50.26 / 49.50 | 51.40 / 31.68 | 15.04 / 9.99 | 5.40 / 3.87 |
| [Voxel-RCNN-Uni3D](./tools/cfgs/MDF/waymo_nusc/waymo_nusc_voxel_rcnn_feat_3_uni3d.yaml) | Uni3D | 66.76 / 66.29 | 66.62 / 60.51 | 63.36 / 62.42 | 60.18 / 42.23 | 30.08 / 24.37 | 14.60 / 12.32 |
| [Voxel-RCNN-DT](./tools/cfgs/MDF/waymo_nusc/waymo_nusc_voxel_rcnn_feat_3_domain_attention.yaml) | Domain Attention | 66.96 / 66.50 | 68.23 / 62.00 | 62.57 / 61.64 | 60.42 / 42.81 | 30.49 / 24.92 | 15.91 / 13.35 |

| Baseline | MDF Methods | Waymo@Vehicle | Waymo@Pedestrian | Waymo@Cyclist | nuScenes@Car | nuScenes@Pedestrian | nuScenes@Cyclist |
|---|---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [PV-RCNN++ DM](./tools/cfgs/MDF/waymo_nusc/multi_db_pvplus_feat_3_merged.yaml) | Direct Merging | 63.79 / 63.38 | 55.03 / 49.75 | 59.88 / 58.99 | 50.91 / 31.46 | 17.07 / 12.15 | 3.10 / 2.20 |
| [PV-RCNN++-Uni3D](./tools/cfgs/MDF/waymo_nusc/waymo_nusc_pvplus_feat_3_uni3d.yaml) | Uni3D | 68.55 / 68.08 | 69.83 / 63.60 | 64.90 / 63.91 | 62.51 / 44.16 | 33.82 / 27.18 | 22.48 / 19.30 |
| [PV-RCNN++-DT](./tools/cfgs/MDF/waymo_nusc/waymo_nusc_pvplus_feat_3_domain_attention.yaml) | Domain Attention | 68.51 / 68.05 | 69.81 / 63.58 | 64.39 / 63.43 | 62.33 / 44.16 | 33.44 / 26.94 | 21.64 / 18.52 |
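
The Direct Merging baseline above trains on the plain union of the two datasets. The sketch below shows that idea on toy data, assuming each dataset is a list of sample dicts. It is not the 3DTrans data loader: `merge_datasets` is a hypothetical helper, and the `dataset` tag it attaches to each sample is an illustrative addition that a dataset-aware method could use to tell the sources apart.

```python
import random
from typing import Dict, List


def merge_datasets(datasets: Dict[str, List[dict]], seed: int = 0) -> List[dict]:
    """Concatenate samples from several datasets, tag each with its source, and shuffle."""
    merged = [
        {**sample, "dataset": name}
        for name, samples in datasets.items()
        for sample in samples
    ]
    # A seeded shuffle keeps the mixed ordering reproducible across runs.
    random.Random(seed).shuffle(merged)
    return merged


# Hypothetical frame ids standing in for real point-cloud samples.
merged = merge_datasets({
    "waymo": [{"frame_id": f"w{i}"} for i in range(3)],
    "nuscenes": [{"frame_id": f"n{i}"} for i in range(2)],
})
print([sample["frame_id"] for sample in merged])
```
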

3D Pre-training Results

AD-PT Results on Waymo

AD-PT demonstrates strong generalization learning ability on 3D points. We first pre-train the 3D backbone and 2D backbone using [AD-PT](https://arxiv.org/abs/2306.00612) on the ONCE dataset (from 100K to 1M data), and then fine-tune the model on different datasets. Here, we report the results of fine-tuning on Waymo.

| | Data amount | Overall | Vehicle | Pedestrian | Cyclist |
|---|---|---|---|---|---|
| SECOND (From scratch) | 3% | 52.00 / 37.70 | 58.11 / 57.44 | 51.34 / 27.38 | 46.57 / 28.28 |
| SECOND (AD-PT) | 3% | **55.41** / **51.78** | 60.53 / 59.93 | 54.91 / 45.78 | 50.79 / 49.65 |
| SECOND (From scratch) | 20% | 60.62 / 56.86 | 64.26 / 63.73 | 59.72 / 50.38 | 57.87 / 56.48 |
| SECOND (AD-PT) | 20% | **61.26** / **57.69** | 64.54 / 64.00 | 60.25 / 51.21 | 59.00 / 57.86 |
| CenterPoint (From scratch) | 3% | 59.00 / 56.29 | 57.12 / 56.57 | 58.66 / 52.44 | 61.24 / 59.89 |
| CenterPoint (AD-PT) | 3% | **61.21** / **58.46** | 60.35 / 59.79 | 60.57 / 54.02 | 62.73 / 61.57 |
| CenterPoint (From scratch) | 20% | 66.47 / 64.01 | 64.91 / 64.42 | 66.03 / 60.34 | 68.49 / 67.28 |
| CenterPoint (AD-PT) | 20% | **67.17** / **64.65** | 65.33 / 64.83 | 67.16 / 61.20 | 69.39 / 68.25 |
| PV-RCNN++ (From scratch) | 3% | 63.81 / 61.10 | 64.42 / 63.93 | 64.33 / 57.79 | 62.69 / 61.59 |
| PV-RCNN++ (AD-PT) | 3% | **68.33** / **65.69** | 68.17 / 67.70 | 68.82 / 62.39 | 68.00 / 67.00 |
| PV-RCNN++ (From scratch) | 20% | 69.97 / 67.58 | 69.18 / 68.75 | 70.88 / 65.21 | 69.84 / 68.77 |
| PV-RCNN++ (AD-PT) | 20% | **71.55** / **69.23** | 70.62 / 70.19 | 72.36 / 66.82 | 71.69 / 70.70 |


ReSimAD Implementation

Here, we provide the [Download Link](docs/GETTING_STARTED_ReSim.md) for our reconstruction-simulation dataset produced with [ReSimAD](https://arxiv.org/abs/2309.05527), consisting of nuScenes-like, KITTI-like, ONCE-like, and Waymo-like datasets that generate target-domain-like simulation points.

Specifically, please refer to [ReSimAD reconstruction](https://longtimenohack.com/hosted/neuralsim_23Q1/waymo_meshes_exp1_20x20_sorted_ds%3D8_2160p.mp4) for the point-based reconstruction meshes, and [PCSim](https://github.com/PJLab-ADG/LiDARSimLib-and-Placement-Evaluation) for the technical details of simulating the target-domain-like points based on the reconstructed meshes.

For the perception module, please refer to [PV-RCNN](./tools/cfgs/ReSimAD/nuscenes/pvrcnn_nuScenes_ReSimAD.yaml) and [PV-RCNN++](./tools/cfgs/ReSimAD/nuscenes/pvrcnn_plus_nuScenes_ReSimAD.yaml) for model training and evaluation.

We report the **zero-shot** cross-dataset (Waymo-to-nuScenes) adaptation results using the BEV/3D AP performance as the evaluation metric for a fair comparison. Please refer to [ReSimAD](./tools/cfgs/ReSimAD) for more details.

| Methods | training time | Adaptation | Car@R40 | Ckpt |
|---|---:|:---:|:---:|---:|
| [PV-RCNN](./tools/cfgs/DA/waymo_nusc/source_only/pvrcnn_old_anchor_nusc.yaml) | ~23 hours | Source-only | 31.02 / 17.75 | Not Available (Waymo License) |
| [PV-RCNN](./tools/cfgs/DA/waymo_nusc/pvrcnn_st3d_feat_3.yaml) | ~8 hours | ST3D | 36.42 / 22.99 | - |
| [PV-RCNN](./tools/cfgs/ReSimAD/nuscenes/pvrcnn_nuScenes_ReSimAD.yaml) | ~8 hours | **ReSimAD** | 37.85 / 21.33 | [ReSimAD_ckpt](https://drive.google.com/file/d/18zMP2h11Xxl2fnDW_bWI9-FHb-9F6Nks/view?usp=sharing) |
| [PV-RCNN++](./tools/cfgs/DA/waymo_nusc/source_only/pv_rcnn_plus_feat_3_vehi.yaml) | ~20 hours | Source-only | 29.93 / 18.77 | Not Available (Waymo License) |
| [PV-RCNN++](./tools/cfgs/DA/waymo_nusc/pv_rcnn_plus_st3d_feat_3.yaml) | ~2.2 hours | ST3D | 34.68 / 17.17 | - |
| [PV-RCNN++](./tools/cfgs/ReSimAD/nuscenes/pvrcnn_plus_nuScenes_ReSimAD.yaml) | ~8 hours | **ReSimAD** | 40.73 / 23.72 | [ReSimAD_ckpt](https://drive.google.com/file/d/1_tnp-Byu8a1_o78V1JUxmD_m6vuRfV3p/view?usp=sharing) |

Visualization Tools for 3DTrans

Visualization Demo

- [Waymo Sequence-level Visualization Demo1](docs/seq_demo_waymo_bev.gif)
- [Waymo Sequence-level Visualization Demo2](docs/seq_demo_waymo_fp.gif)
- [nuScenes Sequence-level Visualization Demo](docs/seq_demo_nusc.gif)
- [ONCE Sequence-level Visualization Demo](docs/seq_demo_once.gif)


Technical Papers

* Bo Zhang, Jiakang Yuan, Botian Shi, Tao Chen, Yikang Li, and Yu Qiao. "Uni3D: A Unified Baseline for Multi-dataset 3D Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
* Jiakang Yuan, Bo Zhang, Xiangchao Yan, Tao Chen, Botian Shi, Yikang Li, and Yu Qiao. "Bi3D: Bi-domain Active Learning for Cross-domain 3D Object Detection." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.
* Jiakang Yuan, Bo Zhang, Xiangchao Yan, Tao Chen, Botian Shi, Yikang Li, and Yu Qiao. "AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset." Advances in Neural Information Processing Systems.
* Siyuan Huang, Bo Zhang, Botian Shi, Peng Gao, Yikang Li, and Hongsheng Li. "SUG: Single-dataset Unified Generalization for 3D Point Cloud Classification." Proceedings of the 31st ACM International Conference on Multimedia.
* Bo Zhang, Xinyu Cai, Jiakang Yuan, Donglin Yang, Jianfei Guo, Renqiu Xia, Botian Shi, Min Dou, Tao Chen, Si Liu, et al. "ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation." International Conference on Learning Representations.
* Xiangchao Yan, Runjian Chen, Bo Zhang, Jiakang Yuan, Xinyu Cai, Botian Shi, Wenqi Shao, Junchi Yan, Ping Luo, and Yu Qiao. "SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous Driving." arXiv preprint arXiv:2309.10527.