This is the official implementation of the paper "Less is More: Focus Attention for Efficient DETR".
Authors: Dehua Zheng, Wenhui Dong, Hailin Hu, Xinghao Chen, Yunhe Wang.
Focus-DETR is a model that focuses attention on the more informative tokens for a better trade-off between computational efficiency and model accuracy. Compared with the state-of-the-art sparse transformer-based detectors under the same setting, our Focus-DETR has comparable complexity while achieving 50.4 AP (+2.2 AP) on COCO.
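The core idea is to score encoder tokens, keep only the most informative ones, and restrict attention to that subset so compute scales with the kept tokens rather than the full feature map. Below is a minimal PyTorch sketch of such a token-selection step, assuming a single-layer scoring head and a fixed `keep_ratio`; both are illustrative stand-ins, not the repo's actual modules or the paper's exact cascade of scoring heads.

```python
import torch
import torch.nn as nn

class FocusAttentionSketch(nn.Module):
    """Illustrative sketch: score every token, keep the top-k most
    informative ones, run self-attention over that subset only, and
    scatter the updated tokens back into the full sequence.
    The scoring head and keep ratio are assumptions for illustration."""

    def __init__(self, dim: int = 256, num_heads: int = 8, keep_ratio: float = 0.3):
        super().__init__()
        self.score_head = nn.Linear(dim, 1)  # per-token foreground score (hypothetical head)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.keep_ratio = keep_ratio

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_tokens, dim)
        b, n, d = tokens.shape
        k = max(1, int(n * self.keep_ratio))

        scores = self.score_head(tokens).squeeze(-1)           # (b, n)
        topk_idx = scores.topk(k, dim=1).indices               # (b, k)
        gather_idx = topk_idx.unsqueeze(-1).expand(-1, -1, d)  # (b, k, d)
        focused = tokens.gather(1, gather_idx)                 # (b, k, d)

        # Attention cost is now O(k^2) instead of O(n^2), with k << n.
        updated, _ = self.attn(focused, focused, focused)

        # Unselected tokens pass through unchanged.
        out = tokens.clone()
        out.scatter_(1, gather_idx, updated)
        return out

if __name__ == "__main__":
    x = torch.randn(2, 1000, 256)
    print(FocusAttentionSketch()(x).shape)  # torch.Size([2, 1000, 256])
```

In the actual model the selection is supervised so that kept tokens concentrate on foreground objects; the sketch above only shows the sparsification mechanism, not that supervision.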
Name | Backbone | Pretrain | Epochs | Denoising Queries | box AP | download |
---|---|---|---|---|---|---|
Focus-DETR-R50-4scale | R-50 | IN1k | 12 | 100 | 48.8 | model |
Focus-DETR-R50-4scale | R-50 | IN1k | 24 | 100 | 50.3 | model |
Focus-DETR-R50-4scale | R-50 | IN1k | 36 | 100 | 50.4 | model |
Focus-DETR-R101-4scale | R-101 | IN1k | 12 | 100 | 50.8 | model |
Focus-DETR-R101-4scale | R-101 | IN1k | 24 | 100 | 51.2 | model |
Focus-DETR-R101-4scale | R-101 | IN1k | 36 | 100 | 51.4 | model |
Name | Backbone | Pretrain | Epochs | Denoising Queries | box AP | download |
---|---|---|---|---|---|---|
Focus-DETR-Swin-T-224-4scale | Swin-Tiny-224 | IN1k | 12 | 100 | 50.0 | model |
Focus-DETR-Swin-T-224-4scale | Swin-Tiny-224 | IN1k | 24 | 100 | 51.2 | model |
Focus-DETR-Swin-T-224-4scale | Swin-Tiny-224 | IN1k | 36 | 100 | 52.5 | model |
Focus-DETR-Swin-T-224-4scale | Swin-Tiny-224 | IN22k to IN1k | 36 | 100 | 53.2 | model |
Focus-DETR-Swin-B-384-4scale | Swin-Base-384 | IN22k to IN1k | 36 | 100 | 56.2 | model |
Focus-DETR-Swin-L-384-4scale | Swin-Large-384 | IN22k to IN1k | 36 | 100 | 56.3 | model |