This is the official implementation of our manuscript Mix-Teaching: a general semi-supervised learning framework for monocular 3D object detection. The KITTI raw data, which consists of 48K temporal images, is used as unlabeled data in all experiments. For more details, please see our paper.
The performance on the KITTI validation set (3D) is as follows:
Models | 10% Easy | 10% Mod | 10% Hard | 30% Easy | 30% Mod | 30% Hard | 100% Easy | 100% Mod | 100% Hard |
MonoFlex | 5.76 | 4.67 | 3.54 | 15.58 | 11.03 | 8.93 | 23.64 | 17.51 | 14.83 |
Ours | 14.43 | 10.65 | 8.41 | 23.81 | 16.94 | 13.80 | 30.82 | 22.18 | 18.61 |
Abs. Imp. | +8.57 | +5.98 | +4.87 | +8.23 | +5.91 | +4.87 | +7.18 | +4.67 | +3.78 |
Please refer to Installation
Then run
pip install mmcv-full==1.2.5 mmdet==2.11.0
git clone https://github.com/open-mmlab/mmdetection3d && cd mmdetection3d && git checkout v0.9.0
cd ../ && pip install mmdetection3d/
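After installation, a quick check that the pinned versions are the ones actually present in the environment can save debugging time later. The following is a minimal sketch, not part of this repository: the helper name `version_mismatches` is hypothetical, and it only covers the two packages pinned in the `pip install` command above.

```python
from importlib import metadata

# Versions pinned by the pip command in the installation steps.
PINNED = {"mmcv-full": "1.2.5", "mmdet": "2.11.0"}


def version_mismatches(pinned=PINNED, lookup=metadata.version):
    """Return {package: installed version or None} for every package that differs from its pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in version_mismatches().items():
        print(f"{name}: expected {PINNED[name]}, found {found}")
```

An empty result means both packages match the pinned versions.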
Please first download the KITTI training set and organize it in the following structure:
datasets
│──kitti
│    ├──ImageSets
│    ├──training        <-- 7481 train data
│    │    ├──calib
│    │    ├──label_2
│    │    └──image_2
│    └──testing         <-- empty directory to save raw data in official format
│         ├──calib
│         ├──image_2
│         └──ImageSets
└──raw_data             <-- raw data in zip format
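The layout above can be checked with a few lines of Python before running any of the conversion scripts. This is a minimal sketch under the assumption that the directory names are exactly those shown in the tree; the helper `missing_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Directories the layout above expects, relative to the datasets/ root.
# raw_data/ may not exist yet if the raw data has not been downloaded.
EXPECTED_DIRS = [
    "kitti/ImageSets",
    "kitti/training/calib",
    "kitti/training/label_2",
    "kitti/training/image_2",
    "kitti/testing/calib",
    "kitti/testing/image_2",
    "kitti/testing/ImageSets",
    "raw_data",
]


def missing_dirs(datasets_root):
    """Return the expected sub-directories that do not exist under datasets_root."""
    root = Path(datasets_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_dirs("datasets"):
        print(f"missing: datasets/{d}")
```

Running the script from the repository root prints one line per missing directory and nothing when the layout is complete.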
Download the KITTI raw data and convert it to the official detection format:
cd datasets && mkdir raw_data
cd ../raw_data_tools && bash download_raw_data.sh ../datasets/raw_data
python convert_det_format.py --raw_data_root ../datasets/raw_data --kitti_root ../datasets/kitti
cd ../pseudo_labeling_tools && python generate_imageset.py --kitti_root ../datasets/kitti
Then run
python create_data.py --kitti_root ../datasets/kitti
Please refer to Training in supervised mode.
Please refer to Inference.
Run inference on the unlabeled data and organize the results in the following structure:
pred_folders
│──model_1_preds
│    ├──000000.txt
│    ├──000001.txt
│    └── ...
│──model_2_preds
│    ├──000000.txt
│    ├──000001.txt
│    └── ...
└── ...
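Since the predictions of several models are combined per frame, it may be worth confirming that every model folder covers the same frames. The sketch below assumes (this is not stated by the repository) that each folder should contain one `.txt` file per frame with identical file names across models; the helper names are hypothetical.

```python
from pathlib import Path


def frame_ids(pred_dir):
    """Return the set of frame ids (file stems) of the .txt files in one prediction folder."""
    return {p.stem for p in Path(pred_dir).glob("*.txt")}


def inconsistent_frames(pred_folders):
    """Map each model folder name to the frame ids it lacks relative to the union over all folders."""
    per_model = {
        d.name: frame_ids(d)
        for d in sorted(Path(pred_folders).iterdir())
        if d.is_dir()
    }
    all_ids = set().union(*per_model.values())
    return {
        name: sorted(all_ids - ids)
        for name, ids in per_model.items()
        if all_ids - ids
    }
```

An empty dictionary means all model folders contain predictions for the same set of frames.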
python uncertainty_estimator.py --kitti_root ../datasets/kitti --pred_folders <path-to-pred_folders>/
python create_data.py --kitti_root ../datasets/kitti --ssl True
python create_background_infos.py --kitti_root ../datasets/kitti
python parse_db_infos.py --old_db_infos ../datasets/kitti/kitti_dbinfos_test.pkl --new_db_infos ../datasets/kitti/kitti_dbinfos_test_filtered.pkl --score_threshold 0.7 --geo_conf_threshold 0.75
or
bash pseudo_labeling.sh
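The last command filters the pseudo-label database with two thresholds, `--score_threshold 0.7` and `--geo_conf_threshold 0.75`. The following sketch illustrates what such a two-threshold filter could look like; it is not the code of `parse_db_infos.py`. It assumes the database is a dictionary mapping a class name to a list of records, that the field names are `score` and `geo_conf` (both hypothetical), and that a record is kept when both values reach their thresholds.

```python
def filter_pseudo_labels(infos, score_threshold=0.7, geo_conf_threshold=0.75):
    """Keep only records whose score and geo_conf both reach their thresholds.

    `infos` maps a class name to a list of dict records. The keys "score" and
    "geo_conf" are hypothetical field names used for illustration only.
    """
    return {
        cls: [
            r for r in records
            if r["score"] >= score_threshold and r["geo_conf"] >= geo_conf_threshold
        ]
        for cls, records in infos.items()
    }


if __name__ == "__main__":
    example = {
        "Car": [
            {"score": 0.90, "geo_conf": 0.80},  # kept: both values pass
            {"score": 0.90, "geo_conf": 0.50},  # dropped: geo_conf too low
            {"score": 0.60, "geo_conf": 0.90},  # dropped: score too low
        ]
    }
    print(filter_pseudo_labels(example))
```

Requiring both conditions keeps only pseudo labels that are confident in classification and in geometry; raising either threshold trades the number of pseudo labels for their quality.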
Please refer to Training in semi-supervised mode.
If you find our work useful in your research, please consider citing:
@article{Yang2022MixTeachingAS,
title={Mix-Teaching: A Simple, Unified and Effective Semi-Supervised Learning Framework for Monocular 3D Object Detection},
author={Lei Yang and Xinyu Zhang and Li Wang and Minghan Zhu and Chuan-Fang Zhang and Jun Li},
journal={ArXiv},
year={2022},
volume={abs/2207.04448}
}
Thanks to the excellent MonoFlex codebase.
Thanks to the excellent KITTI perception dataset.
If you have any problems with this code, please feel free to contact yanglei20@mails.tsinghua.edu.cn.