dmlc / gluon-cv

Gluon CV Toolkit
http://gluon-cv.mxnet.io
Apache License 2.0
5.82k stars 1.21k forks

Libra R-CNN Detection Model #1733

Open 315386775 opened 2 years ago

315386775 commented 2 years ago

Paper: Libra R-CNN: Towards Balanced Learning for Object Detection.
Link: https://arxiv.org/abs/1904.02701
Introduction: It integrates two novel components, a balanced feature pyramid and a balanced L1 loss, which reduce imbalance at the feature level and the objective level respectively. Benefiting from the overall balanced design, Libra R-CNN significantly improves detection performance. Without bells and whistles, it achieves 2.5 points and 2.0 points higher Average Precision (AP) than FPN Faster R-CNN and RetinaNet respectively on MSCOCO.
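
For readers unfamiliar with the balanced L1 loss mentioned above, here is a minimal NumPy sketch of the formula as given in the paper. It is not the code from this PR. The function name `balanced_l1_loss` is made up for illustration, and the defaults `alpha=0.5`, `gamma=1.5` are the values the paper reports using.

```python
import numpy as np

def balanced_l1_loss(pred, target, alpha=0.5, gamma=1.5):
    """Element-wise balanced L1 loss (illustrative sketch, not the PR's code)."""
    diff = np.abs(np.asarray(pred, dtype=np.float64) - np.asarray(target, dtype=np.float64))
    # b is fixed by the constraint alpha * ln(b + 1) = gamma, which keeps the
    # gradient continuous at |x| = 1.
    b = np.exp(gamma / alpha) - 1.0
    # Inlier branch (|x| < 1): larger gradient than smooth L1 for small errors.
    inlier = alpha / b * (b * diff + 1.0) * np.log(b * diff + 1.0) - alpha * diff
    # Outlier branch (|x| >= 1): linear with slope gamma; the constant
    # gamma / b - alpha makes the two branches meet at |x| = 1.
    outlier = gamma * diff + gamma / b - alpha
    return np.where(diff < 1.0, inlier, outlier)
```

The loss is zero for a perfect prediction, and for large errors it grows linearly with slope `gamma`, so outliers cannot dominate the regression gradient.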

315386775 commented 2 years ago

Models for the VOC dataset are evaluated at native resolution, with the shorter side >= 600 and the longer side <= 1000, without changing aspect ratios. This PR is the balanced-loss version.

| Detection model | mAP |
| --- | --- |
| faster_rcnn_resnet50_v1b_voc | 78.3 |
| libra_rcnn_balance | 79.8 |
| libra_rcnn_voc | 80.9 |

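
The resolution constraint above can be sketched as a small helper. This assumes the usual Faster R-CNN convention: scale the shorter side to 600, then shrink further if the longer side would exceed 1000. The name `eval_size` is hypothetical, and this is not the evaluation code used for the numbers above.

```python
def eval_size(height, width, short=600, max_size=1000):
    """Return (new_height, new_width), keeping the aspect ratio (illustrative)."""
    # First try to bring the shorter side to `short`.
    scale = short / min(height, width)
    # If that pushes the longer side past `max_size`, cap the longer side instead.
    if round(scale * max(height, width)) > max_size:
        scale = max_size / max(height, width)
    return int(round(height * scale)), int(round(width * scale))

print(eval_size(375, 500))  # (600, 800)
print(eval_size(300, 900))  # (333, 1000)
```

In the second call the cap on the longer side wins, so the shorter side ends up below 600.
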
github-actions[bot] commented 2 years ago

Job PR-1733-adc4358 is done. Docs are uploaded to http://gluon-vision-staging.s3-website-us-west-2.amazonaws.com/PR-1733/adc4358/index.html

bryanyzhu commented 2 years ago

This PR looks good. May I ask what the difference is between this PR and PR #1727? It seems this PR doesn't have the corresponding model definitions. Should I close #1727 or merge both PRs? Thank you.

315386775 commented 2 years ago

@bryanyzhu Thanks for your reply. I hit a bug after pulling the code from PR #1727. I will close #1727 and re-upload the balanced FPN module. I will upload this PR's pretrained model and training log to Google Drive soon.