ZFTurbo/Keras-RetinaNet-for-Open-Images-Challenge-2018

Keras-RetinaNet for Open Images Challenge 2018

This code was used to take 15th place in the Kaggle Google AI Open Images - Object Detection Track competition: https://www.kaggle.com/c/google-ai-open-images-object-detection-track/leaderboard

Repository contains the following:

http://nn-box.com/box/ - upload an image, wait several seconds, and it will show the detected boxes. ResNet152 is used as the backbone.

Python 3.5, Keras 2.3.1, Keras-RetinaNet 0.5.1

There are 3 RetinaNet models based on ResNet50, ResNet101 and ResNet152 for 443 classes (only Level 1).

Backbone	Image Size (px)	Model (training)	Model (inference)	Small validation mAP	Full validation mAP
ResNet50	768 - 1024	533 MB	178 MB	0.4621	0.3520
ResNet101	768 - 1024	739 MB	247 MB	0.5031	0.3870
ResNet152	600 - 800	918 MB	308 MB	0.5194	0.3959

Model (training) - can be used to resume training or as pretrained weights for your own classifier
Model (inference) - can be used to get prediction boxes for arbitrary images
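
The "Image Size (px)" column is most naturally read as a pair of minimum and maximum side lengths, in the style of the Keras-RetinaNet resize step. Under that assumption (the tables themselves do not confirm it), a minimal sketch of how such a scale factor could be computed is shown here. The function name and the (height, width) argument order are illustrative choices, not names taken from this repository.

```python
def compute_resize_scale(height, width, min_side=768, max_side=1024):
    """Return a scale factor for an image of the given size.

    Assumption: the shorter side is scaled to `min_side`, unless that
    would push the longer side past `max_side`, in which case the
    longer side is scaled to `max_side` instead.
    """
    smallest = min(height, width)
    largest = max(height, width)
    scale = min_side / smallest
    if largest * scale > max_side:
        scale = max_side / largest
    return scale


# A 500 x 1000 image is limited by its longer side (1000 -> 1024).
scale = compute_resize_scale(500, 1000)
```

With the defaults above, an image that is already 768 x 1024 keeps a scale of 1.0, and a small 384 x 400 image is upscaled by 2.0.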

There are 3 RetinaNet models based on ResNet50, ResNet101 and ResNet152 for all 500 classes.

Backbone	Image Size (px)	Model (training)	Model (inference)	Small validation mAP	LB (Public)
ResNet50	768 - 1024	534 MB	178 MB	0.4594	0.4223
ResNet101	768 - 1024	752 MB	251 MB	0.4986	0.4520
ResNet152	600 - 800	932 MB	312 MB	0.4991	0.4651

You need to change files_to_process = glob.glob(DATASET_PATH + 'validation_big/*.jpg') to point to your own set of files. The output is a "predictions_*.csv" file with boxes.
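
As a rough illustration of pointing files_to_process at your own images, the sketch below collects image paths from a folder. The helper collect_images, the folder name "my_images/", and the list of extensions are hypothetical and only serve as an example; the original line uses a single glob pattern for *.jpg.

```python
import glob
import os


def collect_images(folder, extensions=("jpg", "jpeg", "png")):
    """Return a sorted list of image paths found directly in `folder`."""
    files = []
    for ext in extensions:
        files.extend(glob.glob(os.path.join(folder, "*." + ext)))
    return sorted(files)


# "my_images/" is a placeholder; replace it with your own directory.
files_to_process = collect_images("my_images/")
```

If the folder does not exist or holds no matching files, the result is simply an empty list, so it is worth checking its length before running inference.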

Once you have Level 1 predictions, you can expand them to all 500 classes using the code in create_higher_level_predictions_from_level_1_predictions_csv.py
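
One plausible way to perform such an expansion, assuming the remaining classes are ancestors of Level 1 classes in the label hierarchy, is to copy each predicted box to every ancestor class with the same score. The sketch below illustrates that idea only; it is not the code from the script above, and the function name, the tuple layout, and the `parents` mapping are hypothetical.

```python
def expand_to_parents(predictions, parents):
    """Copy each box to all ancestor classes of its label.

    predictions: list of (label, score, xmin, ymin, xmax, ymax)
    parents: dict mapping a label to a list of its direct parent labels
    """
    expanded = []
    for label, score, *box in predictions:
        expanded.append((label, score, *box))
        seen = set()
        stack = list(parents.get(label, []))
        while stack:
            parent = stack.pop()
            if parent in seen:
                continue  # avoid emitting the same ancestor twice
            seen.add(parent)
            expanded.append((parent, score, *box))
            stack.extend(parents.get(parent, []))
    return expanded


# Toy hierarchy for illustration only.
toy_parents = {"Cat": ["Carnivore"], "Carnivore": ["Animal"]}
toy_predictions = [("Cat", 0.9, 0.1, 0.2, 0.3, 0.4)]
expanded = expand_to_parents(toy_predictions, toy_parents)
```

In this toy case a single "Cat" box yields three rows, one each for "Cat", "Carnivore" and "Animal", all with the same coordinates and score.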

For training you need to download the OID dataset (~500 GB of images): https://storage.googleapis.com/openimages/web/challenge.html

Then, to train on the OID dataset, run the Python files in the following order:

then

If you have predictions from several models, for example from the ResNet101 and ResNet152 backbones, you can ensemble the boxes with the script:
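
The general idea of box ensembling can be illustrated with a simplified sketch. This is not the repository's script: it assumes a greedy scheme in which same-label boxes that overlap above an IoU threshold are merged into one box, with coordinates averaged by confidence and the score set to the summed confidence divided by the number of models. All names and the tuple layout are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ensemble_boxes(model_predictions, iou_thr=0.5):
    """Merge per-model predictions for one image.

    model_predictions: one list per model of (label, score, x1, y1, x2, y2).
    Returns a list of fused (label, score, x1, y1, x2, y2) tuples.
    """
    n_models = len(model_predictions)
    pooled = sorted((p for preds in model_predictions for p in preds),
                    key=lambda p: -p[1])
    clusters = []
    for label, score, *box in pooled:
        for c in clusters:
            if c["label"] == label and iou(c["box"], box) > iou_thr:
                c["members"].append((score, box))
                break
        else:
            # No matching cluster: start a new one with this box.
            c = {"label": label, "members": [(score, box)]}
            clusters.append(c)
        # Recompute the fused box as a confidence-weighted average.
        scores = [s for s, _ in c["members"]]
        weights = scores if sum(scores) > 0 else [1.0] * len(scores)
        total = sum(weights)
        c["box"] = [sum(w * b[i] for w, (_, b) in zip(weights, c["members"])) / total
                    for i in range(4)]
    return [(c["label"],
             min(1.0, sum(s for s, _ in c["members"]) / n_models),
             *c["box"]) for c in clusters]


# Toy predictions from two hypothetical models for the same image.
model_a = [("Cat", 0.8, 0.0, 0.0, 1.0, 1.0), ("Dog", 0.5, 2.0, 2.0, 3.0, 3.0)]
model_b = [("Cat", 0.4, 0.1, 0.0, 1.1, 1.0)]
fused = ensemble_boxes([model_a, model_b])
```

Dividing by the number of models means a box found by only one model keeps a reduced score, which favors detections the models agree on.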

The proposed method increases overall performance: