IDEA-Research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
https://detrex.readthedocs.io/en/latest/
Apache License 2.0

🦖detrex: Benchmarking Detection Transformers


[📚Read detrex Benchmarking Paper](https://arxiv.org/abs/2306.07265) | [🏠Project Page](https://rentainhe.github.io/projects/detrex/) | [🏷️Cite detrex](#citation) | [🚢DeepDataSpace](https://github.com/IDEA-Research/deepdataspace)
[📘Documentation](https://detrex.readthedocs.io/en/latest/index.html) | [🛠️Installation](https://detrex.readthedocs.io/en/latest/tutorials/Installation.html) | [👀Model Zoo](https://detrex.readthedocs.io/en/latest/tutorials/Model_Zoo.html) | [🚀Awesome DETR](https://github.com/IDEA-Research/awesome-detection-transformer) | [🆕News](#whats-new) | [🤔Reporting Issues](https://github.com/IDEA-Research/detrex/issues/new/choose)

Introduction

detrex is an open-source toolbox that provides state-of-the-art Transformer-based detection algorithms. It is built on top of Detectron2, and its module design is partially borrowed from MMDetection and DETR. Many thanks to those projects for their nicely organized code. The main branch works with PyTorch 1.10 or higher (we recommend PyTorch 1.12).
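
As a quick sanity check for the PyTorch requirement above, the following is a minimal sketch of a version test. The helper names `parse_version` and `meets_minimum` are hypothetical and not part of detrex. The only PyTorch attribute used is `torch.__version__`, and the import is optional so the snippet still runs where PyTorch is absent.

```python
import re


def parse_version(version: str) -> tuple:
    """Return the leading (major, minor) numbers of a version string.

    Local build suffixes such as "+cu113" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple = (1, 10)) -> bool:
    """Compare numerically; a plain string comparison would rank "1.9" above "1.10"."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import torch  # optional: only needed to check the local install
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "OK" if meets_minimum(torch.__version__) else "too old for the main branch"
        print(f"PyTorch {torch.__version__}: {status}")
```

The comparison is done on integer tuples because version strings do not sort correctly as text.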

Major Features

- **Modular Design.** detrex decomposes the Transformer-based detection framework into various components, which helps users easily build their own customized models.
- **Strong Baselines.** detrex provides a series of strong baselines for Transformer-based detection models. By optimizing hyper-parameters, we have further improved model performance by between **0.2 AP** and **1.1 AP** for most of the supported algorithms.
- **Easy to Use.** detrex is designed to be **light-weight** and easy to use:
  - [LazyConfig System](https://detectron2.readthedocs.io/en/latest/tutorials/lazyconfigs.html) for more flexible syntax and cleaner config files.
  - Light-weight [training engine](./tools/train_net.py) modified from detectron2 [lazyconfig_train_net.py](https://github.com/facebookresearch/detectron2/blob/main/tools/lazyconfig_train_net.py)

Apart from detrex, we also released a repo [Awesome Detection Transformer](https://github.com/IDEA-Research/awesome-detection-transformer) to present papers about Transformer for detection and segmentation.
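
To illustrate the idea behind a lazy config system and modular components, here is a simplified, self-contained sketch in plain Python. It is not detectron2's implementation (detectron2 exposes this pattern through `LazyCall` and `instantiate` in `detectron2.config`). The names `LazySpec`, `build`, `Backbone`, and `Detector` are hypothetical stand-ins chosen for this example. A config only records which callable to use and with which arguments, so any field can be overridden before the object is actually constructed.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class LazySpec:
    """Hypothetical stand-in: stores a callable and its kwargs without calling it."""
    target: Callable[..., Any]
    kwargs: Dict[str, Any] = field(default_factory=dict)


def build(spec: Any) -> Any:
    """Recursively construct objects from (possibly nested) LazySpec descriptions."""
    if isinstance(spec, LazySpec):
        built_kwargs = {name: build(value) for name, value in spec.kwargs.items()}
        return spec.target(**built_kwargs)
    return spec  # plain values are passed through unchanged


class Backbone:
    """Toy component standing in for a feature extractor."""
    def __init__(self, depth: int):
        self.depth = depth


class Detector:
    """Toy model composed from independently configured components."""
    def __init__(self, backbone: Backbone, num_queries: int):
        self.backbone = backbone
        self.num_queries = num_queries


# Describe the model; nothing is constructed yet.
model_cfg = LazySpec(
    Detector,
    {"backbone": LazySpec(Backbone, {"depth": 50}), "num_queries": 300},
)

# Override a field in plain Python before construction.
model_cfg.kwargs["num_queries"] = 900

model = build(model_cfg)
print(type(model).__name__, model.backbone.depth, model.num_queries)
```

Because the config is ordinary Python data, swapping a component means replacing one nested spec rather than editing the model class.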

Fun Facts

The repo name detrex has several interpretations:

What's New

v0.5.0 was released on 16/07/2023:

Please see changelog.md for details and release history.

Installation

Please refer to Installation Instructions for the details of installation.

Getting Started

Please refer to Getting Started with detrex for the basic usage of detrex. We also provide other tutorials for:

Some of the tutorials are still fairly brief; we will keep improving the documentation to give users a better experience.

Documentation

Please see documentation for full API documentation and tutorials.

Model Zoo

Results and models are available in model zoo.

Supported methods

- [x] [DETR (ECCV'2020)](./projects/detr/)
- [x] [Deformable-DETR (ICLR'2021 Oral)](./projects/deformable_detr/)
- [x] [PnP-DETR (ICCV'2021)](./projects/pnp_detr/)
- [x] [Conditional-DETR (ICCV'2021)](./projects/conditional_detr/)
- [x] [Anchor-DETR (AAAI 2022)](./projects/anchor_detr/)
- [x] [DAB-DETR (ICLR'2022)](./projects/dab_detr/)
- [x] [DAB-Deformable-DETR (ICLR'2022)](./projects/dab_deformable_detr/)
- [x] [DN-DETR (CVPR'2022 Oral)](./projects/dn_detr/)
- [x] [DN-Deformable-DETR (CVPR'2022 Oral)](./projects/dn_deformable_detr/)
- [x] [Group-DETR (ICCV'2023)](./projects/group_detr/)
- [x] [DETA (ArXiv'2022)](./projects/deta/)
- [x] [DINO (ICLR'2023)](./projects/dino/)
- [x] [H-Deformable-DETR (CVPR'2023)](./projects/h_deformable_detr/)
- [x] [MaskDINO (CVPR'2023)](./projects/maskdino/)
- [x] [CO-MOT (ArXiv'2023)](./projects/co_mot/)
- [x] [SQR-DETR (CVPR'2023)](./projects/sqr_detr/)
- [x] [Align-DETR (ArXiv'2023)](./projects/align_detr/)
- [x] [EVA-01 (CVPR'2023 Highlight)](./projects/dino_eva/)
- [x] [EVA-02 (ArXiv'2023)](./projects/dino_eva/)
- [x] [Focus-DETR (ICCV'2023)](./projects/focus_detr/)

Please see [projects](./projects/) for details about the projects built on detrex.

License

This project is released under the Apache 2.0 license.

Acknowledgement

Citation

If you use this toolbox in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

@misc{ren2023detrex,
      title={detrex: Benchmarking Detection Transformers}, 
      author={Tianhe Ren and Shilong Liu and Feng Li and Hao Zhang and Ailing Zeng and Jie Yang and Xingyu Liao and Ding Jia and Hongyang Li and He Cao and Jianan Wang and Zhaoyang Zeng and Xianbiao Qi and Yuhui Yuan and Jianwei Yang and Lei Zhang},
      year={2023},
      eprint={2306.07265},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Citing Supported Algorithms

```BibTex
@inproceedings{carion2020end,
  title={End-to-end object detection with transformers},
  author={Carion, Nicolas and Massa, Francisco and Synnaeve, Gabriel and Usunier, Nicolas and Kirillov, Alexander and Zagoruyko, Sergey},
  booktitle={European conference on computer vision},
  pages={213--229},
  year={2020},
  organization={Springer}
}

@inproceedings{zhu2021deformable,
  title={Deformable {\{}DETR{\}}: Deformable Transformers for End-to-End Object Detection},
  author={Xizhou Zhu and Weijie Su and Lewei Lu and Bin Li and Xiaogang Wang and Jifeng Dai},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=gZ9hCDWe6ke}
}

@inproceedings{meng2021-CondDETR,
  title = {Conditional DETR for Fast Training Convergence},
  author = {Meng, Depu and Chen, Xiaokang and Fan, Zejia and Zeng, Gang and Li, Houqiang and Yuan, Yuhui and Sun, Lei and Wang, Jingdong},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
  year = {2021}
}

@inproceedings{liu2022dabdetr,
  title={{DAB}-{DETR}: Dynamic Anchor Boxes are Better Queries for {DETR}},
  author={Shilong Liu and Feng Li and Hao Zhang and Xiao Yang and Xianbiao Qi and Hang Su and Jun Zhu and Lei Zhang},
  booktitle={International Conference on Learning Representations},
  year={2022},
  url={https://openreview.net/forum?id=oMI9PjOb9Jl}
}

@inproceedings{li2022dn,
  title={Dn-detr: Accelerate detr training by introducing query denoising},
  author={Li, Feng and Zhang, Hao and Liu, Shilong and Guo, Jian and Ni, Lionel M and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={13619--13627},
  year={2022}
}

@inproceedings{zhang2023dino,
  title={{DINO}: {DETR} with Improved DeNoising Anchor Boxes for End-to-End Object Detection},
  author={Hao Zhang and Feng Li and Shilong Liu and Lei Zhang and Hang Su and Jun Zhu and Lionel Ni and Heung-Yeung Shum},
  booktitle={The Eleventh International Conference on Learning Representations},
  year={2023},
  url={https://openreview.net/forum?id=3mRwyG5one}
}

@InProceedings{Chen_2023_ICCV,
  author = {Chen, Qiang and Chen, Xiaokang and Wang, Jian and Zhang, Shan and Yao, Kun and Feng, Haocheng and Han, Junyu and Ding, Errui and Zeng, Gang and Wang, Jingdong},
  title = {Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  month = {October},
  year = {2023},
  pages = {6633-6642}
}

@InProceedings{Jia_2023_CVPR,
  author = {Jia, Ding and Yuan, Yuhui and He, Haodi and Wu, Xiaopei and Yu, Haojun and Lin, Weihong and Sun, Lei and Zhang, Chao and Hu, Han},
  title = {DETRs With Hybrid Matching},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2023},
  pages = {19702-19712}
}

@InProceedings{Li_2023_CVPR,
  author = {Li, Feng and Zhang, Hao and Xu, Huaizhe and Liu, Shilong and Zhang, Lei and Ni, Lionel M. and Shum, Heung-Yeung},
  title = {Mask DINO: Towards a Unified Transformer-Based Framework for Object Detection and Segmentation},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2023},
  pages = {3041-3050}
}

@article{yan2023bridging,
  title={Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking},
  author={Yan, Feng and Luo, Weixin and Zhong, Yujie and Gan, Yiyang and Ma, Lin},
  journal={arXiv preprint arXiv:2305.12724},
  year={2023}
}

@InProceedings{Chen_2023_CVPR,
  author = {Chen, Fangyi and Zhang, Han and Hu, Kai and Huang, Yu-Kai and Zhu, Chenchen and Savvides, Marios},
  title = {Enhanced Training of Query-Based Object Detection via Selective Query Recollection},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2023},
  pages = {23756-23765}
}
```