GAIA-det
^^^^^^^^

More models and demos coming soon! Stay tuned.

Introduction

GAIA-det is an open-source object detection toolbox that helps you build customized AI solutions. It is built on top of gaiavision_ and mmdet_. This repo includes an official re-implementation of our CVPR 2021 paper:

`GAIA: A Transfer Learning System of Object Detection that Fits Your Needs <https://arxiv.org/abs/2106.11346>`__.

.. _gaiavision: https://github.com/GAIA-vision/GAIA-cv
.. _mmdet: https://github.com/open-mmlab/mmdetection

It provides functionality that helps you customize AI solutions:

- Design customized search spaces of any type with little effort.
- Manage models in the search space according to your rules.
- Integrate datasets from various sources.

Requirements

- Python 3.6+
- CUDA 10.0+
- 1.2.7 <= mmcv < 1.3.0
- 2.8.0 <= mmdet < 2.9.0
- Others (see ``requirements.txt``)
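
The version ranges above can be checked before installation with a short script. The following is a minimal sketch, not part of GAIA-det: the helper names (``parse_version``, ``in_range``, ``check``, ``installed_versions``) are hypothetical, and it assumes that ``mmcv`` and ``mmdet`` expose a ``__version__`` attribute.

.. code-block:: python

   import re

   # Ranges copied from the requirements list: lower bound inclusive,
   # upper bound exclusive.
   REQUIRED = {
       "mmcv": ("1.2.7", "1.3.0"),
       "mmdet": ("2.8.0", "2.9.0"),
   }

   def parse_version(text):
       """Turn '1.2.7' or '1.2.7+cu101' into a 3-tuple of ints, e.g. (1, 2, 7)."""
       numbers = [int(n) for n in re.findall(r"\d+", text.split("+")[0])]
       # Pad short versions such as '1.3' so that they compare as (1, 3, 0).
       return tuple((numbers + [0, 0, 0])[:3])

   def in_range(version, lower, upper):
       """Return True when lower <= version < upper."""
       return parse_version(lower) <= parse_version(version) < parse_version(upper)

   def check(installed):
       """Map each required package to True/False for a dict of installed versions."""
       return {name: in_range(installed[name], low, high)
               for name, (low, high) in REQUIRED.items()}

   def installed_versions():
       """Read __version__ from the installed packages (needs mmcv and mmdet)."""
       import importlib
       return {name: importlib.import_module(name).__version__ for name in REQUIRED}

In an environment where both packages are installed, ``check(installed_versions())`` returns a dict that flags any package outside its supported range.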

Installation

1. Install gaiavision_.
2. Install mmdet_.
3. Install gaiadet:

.. code-block:: bash

   git clone https://github.com/GAIA-vision/GAIA-det.git && cd GAIA-det
   pip install -r requirements.txt
   pip install -e .

Prepare Supernet

Benchmark

Finetuning(Upstream-COCO)


+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| Backbone   | Pretrain   | Model      | Depth         | Width                | Input       | Lr        | FLOPS      |  box AP          |  box AP              |
|            |            |            |               |                      | Scale       | schd      |            |  (paper)         |  (repo)              |
+============+============+============+===============+======================+=============+===========+============+==================+======================+
| ResNet50   | ImageNet   | Faster     | 3, 4, 6, 3    |64, 64, 128, 256, 512 | 800         | 1x        | 139G       |   37.1           |   37.6               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| ResNet50   | ImageNet   | Faster     | 3, 4, 6, 3    |64, 64, 128, 256, 512 | 800         | 4x        | 139G       |   None           |   40.3               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 45-50GF    | GAIADET    | Faster     | 2, 4, 5, 3    |64, 64, 96, 192, 384  | 480         | 1x        | 49G        |   40.4           |   40.7               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 70-75GF    | GAIADET    | Faster     | 4, 6, 27, 4   |48, 64, 128, 192, 512 | 480         | 1x        | 71G        |   42.6           |   43.1               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 85-90GF    | GAIADET    | Faster     | 3, 4, 21, 4   |48, 64, 160, 192, 640 | 560         | 1x        | 90G        |   43.6           |   44.4               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 110-115GF  | GAIADET    | Faster     | 2, 4, 25, 4   |64, 64, 160, 192, 640 | 640         | 1x        | 115G       |   44.5           |   44.8               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 135-140GF  | GAIADET    | Faster     | 4, 4, 15, 4   |48, 48, 128, 192, 512 | 800         | 1x        | 139G       |   45.3           |   45.6               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+

For fairness, GAIA results should be compared with the ResNet50 baseline trained with the 4x schedule on COCO, because COCO data was already used for 3x during upstream training.
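
To make that comparison concrete, the sketch below computes the box AP gain and the FLOPs ratio of each GAIA model relative to the ResNet50 4x baseline. The numbers are copied from the "repo" column of the table above; the function name ``compare`` and the output layout are illustrative assumptions, not part of GAIA-det.

.. code-block:: python

   # (name, GFLOPs, box AP) taken from the "repo" column of the table above.
   BASELINE = ("ResNet50, 4x", 139, 40.3)
   GAIA_MODELS = [
       ("45-50GF", 49, 40.7),
       ("70-75GF", 71, 43.1),
       ("85-90GF", 90, 44.4),
       ("110-115GF", 115, 44.8),
       ("135-140GF", 139, 45.6),
   ]

   def compare(models, baseline):
       """Return the AP gain and FLOPs ratio of each model versus the baseline."""
       _, base_flops, base_ap = baseline
       rows = []
       for name, flops, ap in models:
           rows.append({
               "model": name,
               "ap_gain": round(ap - base_ap, 1),            # box AP points gained
               "flops_ratio": round(flops / base_flops, 2),  # fraction of baseline FLOPs
           })
       return rows

   for row in compare(GAIA_MODELS, BASELINE):
       print(row)

Each printed row shows how many AP points a model gains over the baseline and what fraction of the baseline's FLOPs it uses.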

Compatibility with other methods

+------------+------------+------------+---------------+----------------------+-------------+-----------+-------------+------------------+----------------------+
| Backbone   | Pretrain   | Model      | Depth         | Width                | Input       | Lr        | Methods     |  box AP          |  box AP              |
|            |            |            |               |                      | Scale       | schd      |             |  (paper)         |  (repo)              |
+============+============+============+===============+======================+=============+===========+=============+==================+======================+
| ResNet50   | ImageNet   | Faster     | 3, 4, 6, 3    |64, 64, 128, 256, 512 | 800         | 1x        | N           |   37.1           |   37.6               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+-------------+------------------+----------------------+
| ResNet50   | ImageNet   | Faster     | 3, 4, 6, 3    |64, 64, 128, 256, 512 | 800         | 1x        | Y           |   45.8           |   44.5               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+-------------+------------------+----------------------+
| 135-140GF  | GAIADET    | Faster     | 4, 4, 15, 4   |48, 48, 128, 192, 512 | 800         | 1x        | N           |   45.3           |   45.6               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+-------------+------------------+----------------------+
| 135-140GF  | GAIADET    | Faster     | 4, 4, 15, 4   |48, 48, 128, 192, 512 | 800         | 1x        | Y           |   49.1           |   48.5               |
+------------+------------+------------+---------------+----------------------+-------------+-----------+-------------+------------------+----------------------+

The Methods column indicates whether Deformable Convolution and Cascaded Head are used (Y) or not (N).

Finetuning(Downstream-BDD100k)

+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| Backbone   | Model      | Depth         | Width                | Input       | Lr        | FLOPS      |  box AP          |  box AP              |
|            |            |               |                      | Scale       | schd      |            |  (paper)         |  (repo)              |
+============+============+===============+======================+=============+===========+============+==================+======================+
| ResNet50   | Faster     | 3, 4, 6, 3    |64, 64, 128, 256, 512 | 800         | 1x        | 139G       |   None           |   30.1               |
+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 45-50GF    | Faster     | 3, 4, 5, 2    |48, 64, 96, 192, 384  | 480         | 1x        | 49G        |   None           |   27.4               |
+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 70-75GF    | Faster     | 4, 2, 15, 2   |48, 48, 128, 192, 512 | 560         | 1x        | 71G        |   None           |   29.5               |
+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 85-90GF    | Faster     | 2, 2, 15, 3   |64, 64, 128, 192, 384 | 640         | 1x        | 87G        |   None           |   32.1               |
+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+
| 135-140GF  | Faster     | 4, 6, 23, 3   |48, 80, 128, 192, 512 | 720         | 1x        | 139G       |   None           |   32.9               |
+------------+------------+---------------+----------------------+-------------+-----------+------------+------------------+----------------------+

Finetuning(Downstream-UODB)

+------------------+-------+------+-----------+------+---------+------+------------+-------+---------+------------+------+
| Dataset          | KITTI | VOC  | WiderFace | LISA | Kitchen | DOTA | DeepLesion | Comic | Clipart | Watercolor | Avg. |
+==================+=======+======+===========+======+=========+======+============+=======+=========+============+======+
| ResNet50(paper)  | 67.1  | 81.5 | 62.1      | 90.0 | 89.5    | 68.3 | 57.4       | 45.5  | 31.2    | 53.4       | 64.6 |
+------------------+-------+------+-----------+------+---------+------+------------+-------+---------+------------+------+
| GAIA(paper)      | 75.6  | 87.4 | 62.7      | 92.1 | 90.1    | 70.8 | 62.1       | 61.1  | 72.2    | 69.7       | 74.4 |
+------------------+-------+------+-----------+------+---------+------+------------+-------+---------+------------+------+
| ResNet50(repo)   |       |      |           |      |         |      |            |       |         |            |      |
+------------------+-------+------+-----------+------+---------+------+------------+-------+---------+------------+------+
| GAIA(repo)       |       |      |           |      |         |      |            |       |         |            |      |
+------------------+-------+------+-----------+------+---------+------+------------+-------+---------+------------+------+

All models are around 139 GFLOPs, and the metric reported above is AP50.

Data Preparation

Please refer to DATA_PREPARATION_.

.. _DATA_PREPARATION: https://github.com/GAIA-vision/GAIA-det/blob/master/docs/DATA_PREPARATION.rst

Usage

Please refer to USAGE_ for generic use.

.. _USAGE: https://github.com/GAIA-vision/GAIA-det/blob/master/docs/USAGE.rst

Citation

If you like our work and use the code or models for your research or project, please star our repo and cite our work as follows.

@InProceedings{Bu_2021_CVPR,
    author    = {Bu, Xingyuan* and Peng, Junran* and Yan, Junjie and Tan, Tieniu and Zhang, Zhaoxiang},
    title     = {GAIA: A Transfer Learning System of Object Detection That Fits Your Needs},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {274-283}
}