This repository contains a reimplementation of SSD: Single Shot MultiBox Detector in TensorFlow. If your goal is to reproduce the results of the original paper, please use the official code.
Several TensorFlow-based SSD reimplementations already exist on GitHub. The main features of this repo include:
New update (77.9% mAP): uses absolute bbox coordinates instead of normalized coordinates; check it out here.
VOCROOT/
|->VOC2007/
| |->Annotations/
| |->ImageSets/
| |->...
|->VOC2012/
| |->Annotations/
| |->ImageSets/
| |->...
|->VOC2007TEST/
| |->Annotations/
| |->...
VOCROOT is the path to your Pascal VOC dataset.
python dataset/convert_tfrecords.py --dataset_directory=VOCROOT --output_directory=./dataset/tfrecords
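Before running the conversion, it can help to confirm that VOCROOT matches the layout above. The following is a minimal sketch and not part of this repo: the helper name `missing_voc_dirs` is hypothetical, and it checks only the subdirectories that the tree above lists explicitly.

```python
import os

# Subdirectories shown in the layout above, relative to VOCROOT.
EXPECTED_SUBDIRS = [
    os.path.join("VOC2007", "Annotations"),
    os.path.join("VOC2007", "ImageSets"),
    os.path.join("VOC2012", "Annotations"),
    os.path.join("VOC2012", "ImageSets"),
    os.path.join("VOC2007TEST", "Annotations"),
]


def missing_voc_dirs(voc_root):
    """Return the expected subdirectories that do not exist under voc_root."""
    return [
        sub for sub in EXPECTED_SUBDIRS
        if not os.path.isdir(os.path.join(voc_root, sub))
    ]
```

Calling `missing_voc_dirs("VOCROOT")` with your own path returns an empty list when every listed directory is present, and otherwise the relative paths that still need to be created or extracted.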
Run the following script to start training:
python train_ssd.py
Run the following scripts to evaluate the model and compute mAP:
python eval_ssd.py
python voc_eval.py
Note: you first need to modify some directory paths in voc_eval.py.
python simple_ssd_demo.py
All the code was tested under TensorFlow 1.6, Python 3.5, and Ubuntu 16.04 with CUDA 8.0. If you want to run training yourself, a decent GPU is highly recommended. The whole training process on the VOC07+12 dataset took ~120k steps in total, and each step (32 samples per batch) took ~1s on my small workstation with a single GTX 1080 Ti GPU. If you need to train without enough GPU memory, you can try halving the current batch size (e.g. 16), lowering the learning rate, and running more steps, watching TensorBoard until convergence. The code has also been tested under TensorFlow 1.4 with CUDA 8.0, but some modifications are needed to enable replicated model training; take the following steps if you need this:
This repo was created only recently; any contribution is welcome.
This implementation (SSD300-VGG16) yields 77.8% mAP on the PASCAL VOC 2007 test set (the original performance reported in the paper is 77.2% mAP). The details are as follows:
sofa | bird | pottedplant | bus | diningtable | cow | bottle | horse | aeroplane | motorbike |
---|---|---|---|---|---|---|---|---|---|
78.9 | 76.2 | 53.5 | 85.2 | 75.5 | 85.0 | 48.6 | 86.7 | 82.2 | 83.4 |

sheep | train | boat | bicycle | chair | cat | tvmonitor | person | car | dog |
---|---|---|---|---|---|---|---|---|---|
82.4 | 87.6 | 72.7 | 83.0 | 61.3 | 88.2 | 74.5 | 79.6 | 85.3 | 86.4 |
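The overall mAP is the unweighted mean of the 20 per-class APs. As a quick sanity check, the sketch below copies the numbers from the tables above and averages them; the variable names are illustrative only.

```python
# Per-class AP (%) copied from the tables above.
per_class_ap = {
    "sofa": 78.9, "bird": 76.2, "pottedplant": 53.5, "bus": 85.2,
    "diningtable": 75.5, "cow": 85.0, "bottle": 48.6, "horse": 86.7,
    "aeroplane": 82.2, "motorbike": 83.4, "sheep": 82.4, "train": 87.6,
    "boat": 72.7, "bicycle": 83.0, "chair": 61.3, "cat": 88.2,
    "tvmonitor": 74.5, "person": 79.6, "car": 85.3, "dog": 86.4,
}

# mAP is the plain average over classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)
print("mAP over {} classes: {:.1f}%".format(len(per_class_ap), mean_ap))
```

The per-class values sum to 1556.2, so the mean is 77.81, which rounds to the 77.8% reported above.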
You can download the trained model (VOC07+12 Train) from Google Drive for further research.
Users in China can also download both the trained model and the pre-trained VGG16 weights from BaiduYun Drive (access code: tg64).
Here are the training logs and some detection results:
NaN loss during training
tf.app.flags.DEFINE_string(
'decay_boundaries', '2000, 80000, 100000',
'Learning rate decay boundaries by global_step (comma-separated list).')
tf.app.flags.DEFINE_string(
'lr_decay_factors', '0.1, 1, 0.1, 0.01',
'The values of learning_rate decay factor for each segment between boundaries (comma-separated list).')
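To illustrate how these two flags could combine, here is a minimal pure-Python sketch of a piecewise-constant schedule. It assumes that each factor multiplies a base learning rate within its segment and that a step equal to a boundary still belongs to the earlier segment (the convention used by `tf.train.piecewise_constant`). The base rate of 1e-3 and the helper names are assumptions for illustration, not values read from `train_ssd.py`.

```python
import bisect


def parse_float_list(text):
    """Parse a comma-separated string such as '0.1, 1, 0.1' into floats."""
    return [float(item) for item in text.split(",")]


def learning_rate_at(step, base_lr, boundaries, factors):
    """Return base_lr scaled by the factor of the segment containing step."""
    if len(factors) != len(boundaries) + 1:
        raise ValueError("need exactly one more factor than boundaries")
    # bisect_left keeps a step equal to a boundary in the earlier segment.
    segment = bisect.bisect_left(boundaries, step)
    return base_lr * factors[segment]


# Default flag values shown above.
boundaries = [int(b) for b in parse_float_list("2000, 80000, 100000")]
factors = parse_float_list("0.1, 1, 0.1, 0.01")
base_lr = 1e-3  # assumed value, for illustration only

for step in (0, 2000, 2001, 90000, 110000):
    print(step, learning_rate_at(step, base_lr, boundaries, factors))
```

Under these assumptions, the first 2000 steps run at a tenth of the base rate, the full rate applies up to step 80000, and the rate then drops to 0.1x and 0.01x of the base. If the loss becomes NaN early in training, a smaller first factor or a later first boundary may be worth trying.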
Use this bibtex to cite this repository:
@misc{kapok_ssd_2018,
title={Single Shot MultiBox Detector in TensorFlow},
author={Changan Wang},
year={2018},
publisher={Github},
journal={GitHub repository},
howpublished={\url{https://github.com/HiKapok/SSD.TensorFlow}},
}
You are welcome to join the QQ Group (758790869) for more discussion.
Apache License, Version 2.0