
Relation DETR

By Xiuquan Hou, Meiqin Liu, Senlin Zhang, Ping Wei, Badong Chen, Xuguang Lan.

This repo is the official implementation of "Relation DETR: Exploring Explicit Position Relation Prior for Object Detection", accepted to ECCV2024 (score 5444, oral presentation). [arXiv paper link], [paper introduction (in Chinese)], [code walkthrough (in Chinese)]

💖 If our Relation-DETR or the SA-Det-100k dataset is helpful to your research or projects, please star this repository. Thanks! 🤗

TODO

...Want more features? Open a Feature Request.

Update

SA-Det-100k

SA-Det-100k is a large-scale class-agnostic object detection dataset for research purposes only. It is based on a subset of SA-1B (see LICENSE), and all objects are annotated with a single class-agnostic category. Because it covers a large number of scenarios but does not provide class-specific annotations, we believe it is a good choice for pre-training models for a variety of downstream tasks with different categories. The dataset contains about 100k images; each image is resized with opencv-python so that the longer of its height and width is 1333, consistent with the data augmentation commonly used when training on COCO. The dataset can be found at [SA-Det-100k on Hugging Face](https://huggingface.co/datasets/xiuqhou/SA-Det-100k).
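One convenient way to fetch the dataset is through the Hugging Face CLI. This is only a sketch: it assumes `huggingface_hub` is installed, and the target directory `data/sa_det_100k` is chosen to match the layout used in the Get started section below.

```shell
# Sketch: download SA-Det-100k from the Hugging Face Hub into data/sa_det_100k.
# Assumes the Hugging Face CLI is available (pip install -U huggingface_hub);
# adjust --local-dir if you keep datasets elsewhere.
huggingface-cli download xiuqhou/SA-Det-100k --repo-type dataset --local-dir data/sa_det_100k
```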

Model ZOO

COCO

| Model | Backbone | Epoch | Download | mAP | AP50 | AP75 | APS | APM | APL |
|-------|----------|-------|----------|-----|------|------|-----|-----|-----|
| Relation DETR | ResNet50 | 12 | config / checkpoint / log | 51.7 | 69.1 | 56.3 | 36.1 | 55.6 | 66.1 |
| Relation DETR | Swin-L(IN-22K) | 12 | config / checkpoint | 57.8 | 76.1 | 62.9 | 41.2 | 62.1 | 74.4 |
| Relation DETR | ResNet50 | 24 | config / checkpoint / log | 52.1 | 69.7 | 56.6 | 36.1 | 56.0 | 66.5 |
| Relation DETR | Swin-L(IN-22K) | 24 | config / checkpoint / log | 58.1 | 76.4 | 63.5 | 41.8 | 63.0 | 73.5 |
| Relation DETR† | Focal-L(IN-22K) | 4+24 | config / o365_checkpoint / checkpoint | 63.5 | 80.8 | 69.1 | 47.2 | 66.9 | 77.0 |

† denotes the model finetuned on COCO after pretraining on Objects365.

**Other DETR variants:** We integrate our position relation into existing DETR variants to obtain enhanced versions of them. Note that some of these weights were newly trained and may produce slightly different results from those reported in our paper. We mark these variants with `++` in the name to distinguish them from their original versions.

| Model | Backbone | Epoch | Download | mAP | AP50 | AP75 | APS | APM | APL |
|-------|----------|-------|----------|-----|------|------|-----|-----|-----|
| Deformable-DETR++ | ResNet50 | 12 | config | 47.0 | 65.6 | 51.2 | 29.3 | 51.0 | 62.2 |
| DAB-Def-DETR++ | ResNet50 | 12 | config / checkpoint | 48.3 | 66.5 | 52.9 | 32.4 | 52.0 | 62.0 |
| DN-Def-DETR++ | ResNet50 | 12 | config / checkpoint | 47.3 | 65.6 | 51.4 | 29.9 | 50.8 | 62.1 |
| DINO++ | ResNet50 | 12 | config / checkpoint | 50.1 | 67.8 | 54.9 | 33.3 | 53.9 | 63.5 |

SA-Det-100k

| Model | Backbone | Epoch | Download | mAP | AP50 | AP75 | APS | APM | APL |
|-------|----------|-------|----------|-----|------|------|-----|-----|-----|
| DINO with VFL | ResNet50 | 12 | —— | 43.7 | 52.0 | 47.7 | 5.8 | 43.0 | 61.5 |
| Relation DETR | ResNet50 | 12 | config / checkpoint | 45.0 | 53.1 | 48.9 | 6.0 | 44.4 | 62.9 |
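To try one of the released checkpoints locally, you can download it first and then evaluate it as described in the Get started section below. A minimal sketch (the URL is the ResNet-50 1x release asset used in the evaluation example later; `checkpoints/` is an arbitrary local directory):

```shell
# Sketch: fetch the Relation-DETR ResNet-50 (1x) COCO checkpoint from the GitHub release.
# checkpoints/ is just a local directory of your choosing; test.py also accepts the URL directly.
wget -P checkpoints/ https://github.com/xiuqhou/Relation-DETR/releases/download/v1.0.0/relation_detr_resnet50_800_1333_coco_1x.pth
```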

Get started

1. **Installation**

   **We use the same environment as [Salience-DETR](https://arxiv.org/abs/2403.16131). You can skip this step if you have already set up Salience-DETR.**

   1. Clone the repository:

      ```shell
      git clone https://github.com/xiuqhou/Relation-DETR
      cd Relation-DETR
      ```

   2. Install PyTorch and torchvision:

      ```shell
      conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
      ```

   3. Install other requirements:

      ```shell
      pip install -r requirements.txt
      ```
2. **Prepare datasets**

   Download [COCO2017](https://cocodataset.org/) (and optionally [SA-Det-100k](https://huggingface.co/datasets/xiuqhou/SA-Det-100k)), then put them in `data/` following this structure:

   ```shell
   data/
   ├─ coco/
   │  ├── train2017/
   │  ├── val2017/
   │  └── annotations/
   │      ├── instances_train2017.json
   │      └── instances_val2017.json
   └─ sa_det_100k/
      ├── train2017/
      ├── val2017/
      └── annotations/
   ```
3. **Evaluate pretrained models**

   To evaluate a model with one or more GPUs, specify `CUDA_VISIBLE_DEVICES`, `dataset`, `model`, and `checkpoint`:

   ```shell
   CUDA_VISIBLE_DEVICES= accelerate launch test.py --coco-path /path/to/coco --model-config /path/to/model.py --checkpoint /path/to/checkpoint.pth
   ```

   For example, run the following to evaluate Relation-DETR with ResNet-50 (1x) on COCO; you can expect a final AP of about 51.7:

   ```shell
   CUDA_VISIBLE_DEVICES=0 accelerate launch test.py \
       --coco-path data/coco \
       --model-config configs/relation_detr/relation_detr_resnet50_800_1333.py \
       --checkpoint https://github.com/xiuqhou/Relation-DETR/releases/download/v1.0.0/relation_detr_resnet50_800_1333_coco_1x.pth
   ```

   - To export results to a json file, specify `--result` with a file name ending in `.json`.
   - To visualize predictions, specify `--show-dir` with a folder name. You can adjust the visualization style through the `--font-scale`, `--box-thick`, `--fill-alpha`, `--text-box-color`, `--text-font-color`, and `--text-alpha` parameters. A combined example is shown after this list.
4. **Evaluate exported json results**

   To evaluate exported json results, specify `dataset` and `result`. Evaluation only needs the CPU, so you do not need to set `CUDA_VISIBLE_DEVICES`:

   ```shell
   accelerate launch test.py --coco-path /path/to/coco --result /path/to/result.json
   ```

   - To visualize predictions, specify `--show-dir` with a folder name. You can adjust the visualization style through the `--font-scale`, `--box-thick`, `--fill-alpha`, `--text-box-color`, `--text-font-color`, and `--text-alpha` parameters.
5. **Train a model**

   Use `CUDA_VISIBLE_DEVICES` to specify the GPU(s) and run the following script to start training. If not specified, the script will use all available GPUs on the node. Before starting training, modify the parameters in [configs/train_config.py](configs/train_config.py).

   ```shell
   CUDA_VISIBLE_DEVICES=0 accelerate launch main.py    # train with 1 GPU
   CUDA_VISIBLE_DEVICES=0,1 accelerate launch main.py  # train with 2 GPUs
   ```
6. **Benchmark a model**

   To test the inference speed, memory cost, and number of parameters of a model, use [tools/benchmark_model.py](tools/benchmark_model.py):

   ```shell
   python tools/benchmark_model.py --model-config configs/relation_detr/relation_detr_resnet50_800_1333.py
   ```
7. **Export an ONNX model**

   For advanced users who want to deploy our model, we provide a script to export an ONNX file:

   ```shell
   python tools/pytorch2onnx.py \
       --model-config /path/to/model.py \
       --checkpoint /path/to/checkpoint.pth \
       --save-file /path/to/save.onnx \
       --simplify \
       --verify
   ```

   `--simplify` uses onnxsim to simplify the exported onnx file, and `--verify` checks the error between the onnx model and the pytorch model. For inference with the exported ONNX file, see `ONNXDetector` in [tools/pytorch2onnx.py](tools/pytorch2onnx.py).
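As mentioned in steps 3 and 4, `--result` and `--show-dir` can be combined with a normal evaluation run. A sketch of a full command (the flags are those listed above; the output paths `results/relation_detr_r50_1x.json` and `visualizations/` are arbitrary examples):

```shell
# Sketch: evaluate Relation-DETR (ResNet-50, 1x) on COCO, export detections to a json file,
# and save visualized predictions. The output paths below are arbitrary examples.
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py \
    --coco-path data/coco \
    --model-config configs/relation_detr/relation_detr_resnet50_800_1333.py \
    --checkpoint https://github.com/xiuqhou/Relation-DETR/releases/download/v1.0.0/relation_detr_resnet50_800_1333_coco_1x.pth \
    --result results/relation_detr_r50_1x.json \
    --show-dir visualizations/
```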

License

Relation-DETR is released under the Apache 2.0 license. Please see the LICENSE file for more information.

Bibtex

If you find our work helpful for your research, please consider citing:

@inproceedings{hou2024relation,
  title={Relation DETR: Exploring Explicit Position Relation Prior for Object Detection},
  author={Hou, Xiuquan and Liu, Meiqin and Zhang, Senlin and Wei, Ping and Chen, Badong and Lan, Xuguang},
  booktitle={European Conference on Computer Vision},
  year={2024},
  organization={Springer}
}

@InProceedings{Hou_2024_CVPR,
    author    = {Hou, Xiuquan and Liu, Meiqin and Zhang, Senlin and Wei, Ping and Chen, Badong},
    title     = {Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {17574-17583}
}