This repo re-implements Faster R-CNN entirely with the MXNet Gluon API. It supports batch sizes larger than one and multi-GPU training. You can use the code to train, validate, and test object detection models.
More features are under development.
Note: This repo requires MXNet 1.2.1 or later, because the MXNet Symbol and Gluon Proposal APIs are inconsistent in earlier versions.
This repo requires Python 3 with the following packages:
mxnet
tqdm
EasyDict
matplotlib
opencv-python
You may also need a GPU with at least 8 GB of memory for training.
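Because the MXNet version requirement is strict, it can help to check the installed version before training. The snippet below is a minimal sketch and not part of this repo; the helper names `parse_version` and `meets_minimum` are hypothetical, and the comparison only looks at the leading numeric part of each version component.

```python
import re

# Minimum MXNet version stated in the note above.
MIN_MXNET = (1, 2, 1)


def parse_version(version):
    """Return the leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first component with no leading digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(version, minimum=MIN_MXNET):
    """Return True if `version` is at least `minimum` (hypothetical helper)."""
    found = parse_version(version)
    minimum = tuple(minimum)
    # Pad the shorter tuple with zeros so "1.2" compares as (1, 2, 0).
    width = max(len(found), len(minimum))
    found += (0,) * (width - len(found))
    minimum += (0,) * (width - len(minimum))
    return found >= minimum
```

With MXNet installed, this could be called as `meets_minimum(mx.__version__)` after `import mxnet as mx`.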
git clone https://github.com/WalterMa/gluon-faster-rcnn
cd gluon-faster-rcnn
python ./demo_faster_rcnn.py
Currently, this repo only supports the VOC2007 and VOC2012 datasets. You can modify or create your own dataset by referring to the Gluon-CV dataset code, or generate and use a record dataset.
Note: The record dataset only works with num_workers=0, due to an MXNet issue.
We need the following three files from Pascal VOC:
Filename | Size | SHA-1 |
---|---|---|
VOCtrainval_06-Nov-2007.tar | 439 MB | 34ed68851bce2a36e2a223fa52c661d592c66b3c |
VOCtest_06-Nov-2007.tar | 430 MB | 41a8d6e12baa5ab18ee7f8f8029b9e11805b4ef1 |
VOCtrainval_11-May-2012.tar | 1.9 GB | 4e443f8a2eca6b1dac8a6c57641b67dd40621a49 |
Download and extract the VOC dataset to ./data/VOCdevkit/, or specify the dataset path in ./utils/config.py or the related Python scripts.
Start end-to-end training and validation:
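Before extracting, the downloaded archives can be checked against the SHA-1 values in the table above. The following is a minimal sketch using only the Python standard library; the helper names `sha1_of_file` and `verify_archive` are hypothetical and not part of this repo, and the archives are assumed to be in the directory you pass in.

```python
import hashlib
import os

# SHA-1 digests copied from the table above.
EXPECTED_SHA1 = {
    "VOCtrainval_06-Nov-2007.tar": "34ed68851bce2a36e2a223fa52c661d592c66b3c",
    "VOCtest_06-Nov-2007.tar": "41a8d6e12baa5ab18ee7f8f8029b9e11805b4ef1",
    "VOCtrainval_11-May-2012.tar": "4e443f8a2eca6b1dac8a6c57641b67dd40621a49",
}


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected):
    """Return True if the file at `path` has the expected SHA-1 digest."""
    return sha1_of_file(path) == expected.lower()


def verify_all(directory):
    """Map each expected archive name to True/False (False if the file is missing)."""
    results = {}
    for name, expected in EXPECTED_SHA1.items():
        path = os.path.join(directory, name)
        results[name] = os.path.isfile(path) and verify_archive(path, expected)
    return results
```

Calling `verify_all(".")` in the download directory returns a dictionary showing which archives are present and intact.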
python ./train_faster_rcnn.py
Method | Network | Training Data | Testing Data | Reference mAP (%) | Result mAP (%) |
---|---|---|---|---|---|
Faster R-CNN end-to-end | VGG16 | VOC07+12 | VOC07test | 73.2 | - |
This is a re-implementation of the original Faster R-CNN, which is based on Caffe. The paper is available on arXiv.
This repository uses code from MXNet, Faster R-CNN, MX R-CNN, MXNet SSD, and Gluon CV.