CJHMPower / Simultaneous-Traffic-Sign-Detection-and-Classification-with-RetinaNet

Simultaneous Traffic Sign Detection and Classification with RetinaNet

Overview

In this project, I implement a one-stage detection and classification model, based on the paper Focal Loss for Dense Object Detection, to detect and classify traffic signs. The model was trained on the Tsinghua_Tecent_100K dataset. After careful tuning, RetinaNet achieved 90 MAP on 42 traffic sign classes on the test set, which is better than previous benchmarks.
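For readers unfamiliar with the focal loss that RetinaNet is built around, here is a minimal sketch of the binary form from the paper, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function names are illustrative and not taken from this repository, and the defaults alpha=0.25 and gamma=2.0 are the values reported in the paper; the values used for the traffic sign models are not stated here.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss for a single prediction (illustrative helper).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 for positive, 0 for negative
    alpha, gamma: paper defaults, assumed here rather than read from this repo
    """
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified examples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(p, y, eps=1e-7):
    """Plain binary cross entropy, for comparison."""
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


easy = focal_loss(0.95, 1)   # confident, correct prediction
hard = focal_loss(0.10, 1)   # confident, wrong prediction
print(f"easy example: {easy:.6f}, hard example: {hard:.6f}")
```

The easy example contributes far less than the hard one, which is how a one-stage detector avoids being swamped by the many easy background anchors.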

Dependencies

How to run

To run predictions on pre-trained models

To run a training experiment from scratch

Differences from the original RetinaNet implementation

Training process

All models were trained on two GTX-1080Ti GPUs for 18 epochs with a batch size of 8. I used the Adam optimizer with a learning rate of 1e-4, decayed to 1e-5 for the last 6 epochs. A RetinaNet model generally takes about 20 hours to converge.
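The learning rate schedule described above can be sketched as a plain function of the epoch index. This is an assumption-laden illustration, not the repository's training code: the constant and function names are hypothetical, epochs are assumed to be 0-indexed, and the decay is assumed to be a single step at the start of the last 6 epochs.

```python
TOTAL_EPOCHS = 18   # from the text
DECAY_EPOCHS = 6    # the last 6 epochs use the lower rate
BASE_LR = 1e-4      # initial Adam learning rate
FINAL_LR = 1e-5     # learning rate after the decay


def learning_rate(epoch):
    """Return the learning rate for a 0-indexed epoch (hypothetical helper)."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    # Assumed step decay: epochs 0..11 at BASE_LR, epochs 12..17 at FINAL_LR
    return BASE_LR if epoch < TOTAL_EPOCHS - DECAY_EPOCHS else FINAL_LR


schedule = [learning_rate(e) for e in range(TOTAL_EPOCHS)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch:2d}: lr = {lr:g}")
```

A function of this shape, mapping an epoch index to a rate, is what most frameworks' scheduler hooks expect (for example, Keras's `LearningRateScheduler` callback, if training is done in Keras).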

Performance

As shown below, the RetinaNet-101 and RetinaNet-152 models achieve 92.03 and 92.80 MAP respectively on the Tsinghua_Tecent_100K dataset, both outperforming the previous benchmark for simultaneous traffic sign detection and classification. See evaluate/eval_check.ipynb for more details.

Model               MAP
CVPR-2016 TT100K    0.8979
RetinaNet-50        0.7939
RetinaNet-101       0.9203
RetinaNet-152       0.9280

Reference