JCZ404 / Semi-DETR

[CVPR 2023] Official implementation of the paper "Semi-DETR: Semi-Supervised Object Detection with Detection Transformers"
https://arxiv.org/abs/2307.08095
MIT License

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

This repo is the official implementation of the CVPR 2023 paper "Semi-DETR: Semi-Supervised Object Detection with Detection Transformers". Semi-DETR is the first work on semi-supervised object detection designed for detection transformers.

Update

Usage

Our code is based on the awesome codebase provided by Soft-Teacher[1].

Requirements

Installation

This project is built on mmdetection. Please install mmdet in editable mode first:

cd thirdparty/mmdetection && python -m pip install -e .

Following mmdetection, we package our detection transformer module and semi-supervised module in a similar way, so they must also be installed (change the module names 'detr_od' and 'detr_ssod' in 'setup.py' if needed):

cd ../../ && python -m pip install -e .

These steps install 'mmdet', 'detr_od' and 'detr_ssod' in the environment. The CUDA ops for deformable attention also need to be compiled:

cd detr_od/models/utils/ops
python setup.py build install
# optional unit test (every check should print True)
python test.py
# return to the repository root (four levels up from detr_od/models/utils/ops)
cd ../../../..
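
After the installation steps, a quick check can confirm that the three packages are visible to the current Python environment. The sketch below is not part of the repository: the helper name `check_packages` is illustrative, it relies only on the standard library, and it does not verify that the CUDA ops compiled correctly.

```python
import importlib.util


def check_packages(names=("mmdet", "detr_od", "detr_ssod")):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per package so a missing install is easy to spot.
    for name, found in check_packages().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A package reported as missing usually means the corresponding `pip install -e .` step was run in a different environment or directory.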

Data Preparation

For the COCO dataset, see lines [`11-24`](https://github.com/microsoft/SoftTeacher/blob/863d90a3aa98615be3d156e7d305a22c2a5075f5/tools/dataset/prepare_coco_data.sh#L11) of `tools/dataset/prepare_coco_data.sh` for exactly which files to download. You can also download our generated semi-supervised data set splits from [semi-coco-splits](https://pan.baidu.com/s/1-b4D5ObCcg28TAp0iNr_cQ?pwd=wnsb).
- Download and extract the PASCAL VOC dataset with the following commands:
```shell script
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xf VOCtrainval_06-Nov-2007.tar
tar -xf VOCtest_06-Nov-2007.tar
tar -xf VOCtrainval_11-May-2012.tar

# resulting format
# YOUR_DATA/
#   - VOCdevkit
#     - VOC2007
#       - Annotations
#       - JPEGImages
#       - ...
#     - VOC2012
#       - Annotations
#       - JPEGImages
#       - ...
```

Following prior work, we convert the PASCAL VOC dataset into COCO format and evaluate the model with COCO-style mAP. Execute the following command to convert the dataset format:

python scripts/voc_to_coco.py --devkit_path "${VOCDEVKIT_PATH}" --out-dir "${VOCDEVKIT_PATH}"
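
To illustrate what the format conversion involves, here is a minimal sketch of turning the objects of one VOC XML annotation into COCO-style annotation dicts. It is not the code of `scripts/voc_to_coco.py`: the function name `voc_objects_to_coco`, the category id mapping, and the sample XML are assumptions made for illustration, and details such as VOC's 1-based pixel coordinates or the `difficult` flag are ignored.

```python
import xml.etree.ElementTree as ET


def voc_objects_to_coco(xml_text, image_id, category_ids, first_ann_id=1):
    """Convert the objects of one VOC annotation file to COCO-style dicts."""
    root = ET.fromstring(xml_text)
    annotations = []
    for offset, obj in enumerate(root.findall("object")):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": first_ann_id + offset,
            "image_id": image_id,
            "category_id": category_ids[obj.findtext("name")],
            # VOC stores corner coordinates; COCO stores [x, y, width, height].
            "bbox": [xmin, ymin, width, height],
            "area": width * height,
            "iscrowd": 0,
        })
    return annotations


# Illustrative annotation; the category id 12 for "dog" is a hypothetical mapping.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

if __name__ == "__main__":
    print(voc_objects_to_coco(SAMPLE_XML, image_id=1, category_ids={"dog": 12}))
```

The key point is the box representation: the corner pair (xmin, ymin, xmax, ymax) becomes a top-left corner plus width and height.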

Training

We implement DINO with mmdetection, following the original official repo. If you want to train the fully supervised DINO model yourself and check our implementation, you can run:

sh tools/dist_train_detr_od.sh dino_detr 8

This trains DINO with a batch size of 16 for 12 epochs. We also provide the resulting checkpoint dino_sup_12e_ckpt and our training log dino_sup_12e_log for this fully supervised model.

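To train the semi-supervised model on partially labeled COCO data (the three trailing arguments presumably follow the Soft-Teacher convention of fold, labeled-data percentage, and number of GPUs):

sh tools/dist_train_detr_ssod.sh dino_detr_ssod 1 10 8

To train on the full COCO data setting:

sh tools/dist_train_detr_ssod_coco_full.sh <NUM_GPUS>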

For example, to train our R50 model with 8 GPUs:

sh tools/dist_train_detr_ssod_coco_full.sh 8
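
When running several partially labeled settings, it can help to generate the launch commands rather than typing them by hand. The sketch below only builds command strings and does not start any training. It assumes that the three numeric arguments of `sh tools/dist_train_detr_ssod.sh dino_detr_ssod 1 10 8` are the fold, the labeled-data percentage, and the number of GPUs, as in the Soft-Teacher scripts; this README does not confirm that order, and the helper name `ssod_train_commands` is hypothetical.

```python
def ssod_train_commands(folds=(1,), percents=(1, 5, 10), gpus=8,
                        config="dino_detr_ssod"):
    """Build one launch command per (percent, fold) pair without running it.

    Assumed argument order (not confirmed by the README): fold, percent, gpus.
    """
    script = "tools/dist_train_detr_ssod.sh"
    return [
        f"sh {script} {config} {fold} {percent} {gpus}"
        for percent in percents
        for fold in folds
    ]


if __name__ == "__main__":
    # Print the commands for review before launching any of them manually.
    for command in ssod_train_commands():
        print(command)
```

The default percentages mirror the 1%, 5% and 10% settings listed in the COCO results.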

Evaluation

python tools/test.py <CONFIG_FILE_PATH> <CHECKPOINT_PATH> --eval bbox

We also provide some of our trained models below:

COCO:

| Setting | mAP | Weights |
| --- | --- | --- |
| 1% Data | 30.50 $\pm$ 0.30 | ckpt |
| 5% Data | 40.10 $\pm$ 0.15 | ckpt |
| 10% Data | 43.5 $\pm$ 0.10 | ckpt |
| Full Data | 50.5 | ckpt |

VOC:

| Setting | AP50 | mAP | Weights |
| --- | --- | --- | --- |
| Unlabel: VOC12 | 86.1 | 65.2 | ckpt |
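
The VOC results report both AP50 and mAP. In COCO-style evaluation, AP50 counts a prediction as correct when its IoU with a ground-truth box is at least 0.5, while mAP averages AP over the IoU thresholds 0.50, 0.55, ..., 0.95. The sketch below illustrates only this thresholding on a single pair of boxes. It is not the evaluation code behind `tools/test.py`, and the function names are illustrative.

```python
def iou_xywh(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# COCO-style mAP averages AP over these ten IoU thresholds.
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_matched(pred_box, gt_box):
    """Return the COCO IoU thresholds at which pred_box counts as a match."""
    iou = iou_xywh(pred_box, gt_box)
    return [t for t in COCO_IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    gt = [0, 0, 100, 100]
    pred = [10, 0, 100, 100]  # same size, shifted 10 pixels to the right
    print(iou_xywh(pred, gt), thresholds_matched(pred, gt))
```

A slightly shifted box like this one passes the loose thresholds but fails the strictest ones, which is why mAP is typically lower than AP50 for the same model.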

[1] End-to-End Semi-Supervised Object Detection with Soft Teacher

Citation

If you find our repo useful for your research, please cite us:

@inproceedings{zhang2023semi,
  title={Semi-DETR: Semi-Supervised Object Detection With Detection Transformers},
  author={Zhang, Jiacheng and Lin, Xiangru and Zhang, Wei and Wang, Kuo and Tan, Xiao and Han, Junyu and Ding, Errui and Wang, Jingdong and Li, Guanbin},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23809--23818},
  year={2023}
}