muditchaudhary / RepPoints-x-Libra-R-CNN-x-Transformer-self-attention

Improving RepPoints object detection model using transformer attention and training balancing methods
MIT License
object-detection

RepPoints x Libra-RCNN x Transformer Attention

This repository builds upon the RepPoints model by Yang et al. We balance the training of the model using Libra R-CNN and then further improve its performance using Transformer attention.
We also implement our KQRAttention mechanism to improve the inference speed of the model.
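
For readers unfamiliar with the term, Transformer attention is commonly built on scaled dot-product attention. The following is a minimal, generic sketch in plain Python. It is not this repository's KQRAttention mechanism or its actual attention module, and the function names are illustrative assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Generic attention: softmax(Q K^T / sqrt(d_k)) V.

    queries: list of vectors with d_k entries each
    keys:    list of vectors with d_k entries each
    values:  list of vectors, one per key
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

When every key matches a query equally well, the weights are uniform and the output is simply the mean of the value vectors.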

RepPoints Object Detection model

Ze Yang, Shaohui Liu, and Han Hu.

We provide code support and configuration files to reproduce the results in the paper for "RepPoints: Point Set Representation for Object Detection" on COCO object detection. Our code is based on mmdetection, which is a clean open-sourced project for benchmarking object detection methods.

Introduction

RepPoints, initially described in arXiv, is a new representation method for visual objects, on which visual understanding tasks are typically centered. Visual object representation, which aims at both geometric description and appearance feature extraction, is conventionally achieved by bounding box + RoIPool (RoIAlign). The bounding box representation is convenient to use; however, it provides only a rectangular localization of objects, which lacks geometric precision and may consequently degrade feature quality. Our new representation, RepPoints, models an object as a set of points instead of a bounding box. The points learn to adaptively position themselves over the object in a manner that circumscribes its spatial extent and enables semantically aligned feature extraction. This richer and more flexible representation maintains the convenience of bounding boxes while facilitating various visual understanding applications. This repo demonstrates the effectiveness of RepPoints for COCO object detection.
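
To make the point-set idea concrete, the sketch below converts a set of 2D points into a pseudo bounding box in two ways that correspond to the "MinMax" and "moment" entries of the "convert func" column in the results table. It is a simplified illustration in plain Python, not the code used in this repository: the function names are assumptions, and the fixed `scale` argument stands in for the learnable multipliers that the RepPoints paper describes for the moment-based conversion.

```python
import math


def minmax_to_bbox(points):
    """Tightest axis-aligned box (x1, y1, x2, y2) around all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def moment_to_bbox(points, scale=1.0):
    """Box centred on the point mean, with half-extent = scale * std.

    `scale` is a hypothetical fixed multiplier used here for simplicity;
    it is an assumption of this sketch, not a parameter of the repo.
    """
    n = len(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population standard deviation along each axis.
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    half_w = scale * std_x
    half_h = scale * std_y
    return (mean_x - half_w, mean_y - half_h, mean_x + half_w, mean_y + half_h)
```

For points placed exactly at the four corners of a rectangle, both conversions with `scale=1.0` return that rectangle. They differ once points lie inside the object: the min-max box is driven by the extreme points, while the moment box depends on how all the points are spread.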

Another feature of this repo is the demonstration of an anchor-free detector, which can be as effective as state-of-the-art anchor-based detection methods. The anchor-free detector can utilize either bounding box or RepPoints as the basic object representation.

Learning RepPoints in Object Detection.

Usage

a. Clone the repo:

git clone --recursive https://github.com/microsoft/RepPoints

b. Download the COCO detection dataset, copy RepPoints src into mmdetection and install mmdetection.

sh ./init.sh

c. Run experiments with a specific configuration file:

./mmdetection/tools/dist_train.py ${path-to-cfg-file} ${num_gpu} --validate

We give one example here:

./mmdetection/tools/dist_train.py ./configs/reppoints_moment_r101_fpn_2x_mt.py 8 --validate
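
If you launch several configurations from a script, the command pattern above can be assembled programmatically. The helper below is a small sketch that only builds the argument list. `build_train_command` is a hypothetical name and not part of this repository or mmdetection, and the default script path is copied from the commands above.

```python
def build_train_command(cfg_path, num_gpu, validate=True,
                        script="./mmdetection/tools/dist_train.py"):
    """Return the training command as a list of arguments.

    Mirrors the pattern: <script> <path-to-cfg-file> <num_gpu> [--validate].
    This helper is illustrative only and does not execute anything.
    """
    if num_gpu < 1:
        raise ValueError("num_gpu must be at least 1")
    command = [script, cfg_path, str(num_gpu)]
    if validate:
        command.append("--validate")
    return command


if __name__ == "__main__":
    cmd = build_train_command(
        "./configs/reppoints_moment_r101_fpn_2x_mt.py", 8
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run` without shell quoting.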

Citing RepPoints

@inproceedings{yang2019reppoints,
  title={RepPoints: Point Set Representation for Object Detection},
  author={Yang, Ze and Liu, Shaohui and Hu, Han and Wang, Liwei and Lin, Stephen},
  booktitle={The IEEE International Conference on Computer Vision (ICCV)},
  month={Oct},
  year={2019}
}

Results and models

The results on COCO 2017val are shown in the table below.

| Method | Backbone | Anchor | convert func | Lr schd | box AP | Download |
| --- | --- | --- | --- | --- | --- | --- |
| BBox | R-50-FPN | single | - | 1x | 36.3 | model |
| BBox | R-50-FPN | none | - | 1x | 37.3 | model |
| RepPoints | R-50-FPN | none | partial MinMax | 1x | 38.1 | model |
| RepPoints | R-50-FPN | none | MinMax | 1x | 38.2 | model |
| RepPoints | R-50-FPN | none | moment | 1x | 38.2 | model |
| RepPoints | R-50-FPN | none | moment | 2x | 38.6 | model |
| RepPoints | R-50-FPN | none | moment | 2x (ms train) | 40.8 | model |
| RepPoints | R-50-FPN | none | moment | 2x (ms train&ms test) | 42.2 | |
| RepPoints | R-101-FPN | none | moment | 2x | 40.3 | model |
| RepPoints | R-101-FPN | none | moment | 2x (ms train) | 42.3 | model |
| RepPoints | R-101-FPN | none | moment | 2x (ms train&ms test) | 44.1 | |
| RepPoints | R-101-FPN-DCN | none | moment | 2x | 43.0 | model |
| RepPoints | R-101-FPN-DCN | none | moment | 2x (ms train) | 44.8 | model |
| RepPoints | R-101-FPN-DCN | none | moment | 2x (ms train&ms test) | 46.4 | |
| RepPoints | X-101-FPN-DCN | none | moment | 2x | 44.5 | model |
| RepPoints | X-101-FPN-DCN | none | moment | 2x (ms train) | 45.6 | model |
| RepPoints | X-101-FPN-DCN | none | moment | 2x (ms train&ms test) | 46.8 | |

Notes:

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.