By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun at Microsoft Research
Faster R-CNN is an object detection framework based on deep convolutional networks. It consists of a Region Proposal Network (RPN) and an Object Detection Network, which are trained to share convolutional layers so that testing is fast.
Faster R-CNN was initially described in an arXiv tech report.
This repo contains a MATLAB re-implementation of Fast R-CNN. Details about Fast R-CNN are in: rbgirshick/fast-rcnn.
This code has been tested on Windows 7/8 64-bit, Windows Server 2012 R2, and Linux, and on MATLAB 2014a.
A Python version is available at py-faster-rcnn.
Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).
If you find Faster R-CNN useful in your research, please consider citing:
@article{ren15fasterrcnn,
Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
Title = {{Faster R-CNN}: Towards Real-Time Object Detection with Region Proposal Networks},
Journal = {arXiv preprint arXiv:1506.01497},
Year = {2015}
}
model | training data | test data | mAP | time/img |
---|---|---|---|---|
Faster RCNN, VGG-16 | VOC 2007 trainval | VOC 2007 test | 69.9% | 198ms |
Faster RCNN, VGG-16 | VOC 2007 trainval + 2012 trainval | VOC 2007 test | 73.2% | 198ms |
Faster RCNN, VGG-16 | VOC 2012 trainval | VOC 2012 test | 67.0% | 198ms |
Faster RCNN, VGG-16 | VOC 2007 trainval&test + 2012 trainval | VOC 2012 test | 70.4% | 198ms |
Note: The mAP results are subject to random variation. Over 5 independent runs with ZF net, the mAPs are 59.9 (as in the paper), 60.4, 59.5, 60.1, and 59.5, with a mean of 59.88 and a standard deviation of 0.39.
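The mean and standard deviation quoted in the note can be checked from the five listed mAPs. The Python sketch below is an illustration only and is not part of this repository. It assumes the 0.39 figure is the sample standard deviation (n - 1 denominator), which is the variant that reproduces the stated value.

```python
import statistics

# mAPs (%) of the five independent ZF-net runs listed in the note above.
zf_maps = [59.9, 60.4, 59.5, 60.1, 59.5]

mean_map = statistics.mean(zf_maps)
# statistics.stdev uses the n - 1 (sample) denominator.
sample_std = statistics.stdev(zf_maps)
# statistics.pstdev uses the n (population) denominator, shown for comparison.
population_std = statistics.pstdev(zf_maps)

print(f"mean = {mean_map:.2f}, "
      f"sample std = {sample_std:.2f}, "
      f"population std = {population_std:.2f}")
```

With these inputs the sample standard deviation rounds to 0.39 and the population standard deviation rounds to 0.35, so the note's figure is consistent with the sample form.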
Caffe build for Faster R-CNN (included in this repository, see `external/caffe`). On Windows, a compiled Caffe mex can be downloaded by running `fetch_data/fetch_caffe_mex_windows_vs2013_cuda65.m`.
GPU: Titan, Titan Black, Titan X, K20, K40, K80.
1. Run `fetch_data/fetch_caffe_mex_windows_vs2013_cuda65.m` to download a compiled Caffe mex (for Windows only).
2. Run `faster_rcnn_build.m`.
3. Run `startup.m`.
4. Run `fetch_data/fetch_faster_rcnn_final_model.m` to download our trained models.
5. Run `experiments/script_faster_rcnn_demo.m` to test a single demo image.

The demo prints timing information such as the following, here with VGG-16:
001763.jpg (500x375): time 0.201s (resize+conv+proposal: 0.150s, nms+regionwise: 0.052s)
004545.jpg (500x375): time 0.201s (resize+conv+proposal: 0.151s, nms+regionwise: 0.050s)
000542.jpg (500x375): time 0.192s (resize+conv+proposal: 0.151s, nms+regionwise: 0.041s)
000456.jpg (500x375): time 0.202s (resize+conv+proposal: 0.152s, nms+regionwise: 0.050s)
001150.jpg (500x375): time 0.194s (resize+conv+proposal: 0.151s, nms+regionwise: 0.043s)
mean time: 0.198s
and with ZF net:
001763.jpg (500x375): time 0.061s (resize+conv+proposal: 0.032s, nms+regionwise: 0.029s)
004545.jpg (500x375): time 0.063s (resize+conv+proposal: 0.034s, nms+regionwise: 0.029s)
000542.jpg (500x375): time 0.052s (resize+conv+proposal: 0.034s, nms+regionwise: 0.018s)
000456.jpg (500x375): time 0.062s (resize+conv+proposal: 0.034s, nms+regionwise: 0.028s)
001150.jpg (500x375): time 0.058s (resize+conv+proposal: 0.034s, nms+regionwise: 0.023s)
mean time: 0.059s
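The `mean time` lines can be reproduced from the per-image lines. The sketch below is a hypothetical Python helper, not part of this repository. It assumes the log format shown above, extracts each per-image total with a regular expression, and averages the values.

```python
import re

# Matches the total per-image time in lines such as
# "001763.jpg (500x375): time 0.201s (resize+conv+proposal: ...)".
# The "mean time: ..." summary line does not match this pattern.
TIME_RE = re.compile(r"\): time (\d+\.\d+)s")


def mean_time(log_text):
    """Return the mean of the per-image total times (seconds) in log_text."""
    times = [float(m.group(1)) for m in TIME_RE.finditer(log_text)]
    if not times:
        raise ValueError("no timing lines found")
    return sum(times) / len(times)


vgg16_log = """\
001763.jpg (500x375): time 0.201s (resize+conv+proposal: 0.150s, nms+regionwise: 0.052s)
004545.jpg (500x375): time 0.201s (resize+conv+proposal: 0.151s, nms+regionwise: 0.050s)
000542.jpg (500x375): time 0.192s (resize+conv+proposal: 0.151s, nms+regionwise: 0.041s)
000456.jpg (500x375): time 0.202s (resize+conv+proposal: 0.152s, nms+regionwise: 0.050s)
001150.jpg (500x375): time 0.194s (resize+conv+proposal: 0.151s, nms+regionwise: 0.043s)
mean time: 0.198s
"""

zf_log = """\
001763.jpg (500x375): time 0.061s (resize+conv+proposal: 0.032s, nms+regionwise: 0.029s)
004545.jpg (500x375): time 0.063s (resize+conv+proposal: 0.034s, nms+regionwise: 0.029s)
000542.jpg (500x375): time 0.052s (resize+conv+proposal: 0.034s, nms+regionwise: 0.018s)
000456.jpg (500x375): time 0.062s (resize+conv+proposal: 0.034s, nms+regionwise: 0.028s)
001150.jpg (500x375): time 0.058s (resize+conv+proposal: 0.034s, nms+regionwise: 0.023s)
mean time: 0.059s
"""

print(f"VGG-16 mean time: {mean_time(vgg16_log):.3f}s")
print(f"ZF mean time: {mean_time(zf_log):.3f}s")
```

The computed means, rounded to three decimals, agree with the `mean time` lines in the two logs (0.198s and 0.059s).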
GPU / mean time | VGG-16 | ZF |
---|---|---|
K40 | 198ms | 59ms |
Titan Black | 174ms | 56ms |
Titan X | 151ms | 59ms |
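To relate these latencies to throughput, the per-image times can be converted to images per second. The Python sketch below is illustrative only. The function name `frames_per_second` is hypothetical, and the conversion assumes images are processed one at a time with no batching or pipelining.

```python
# Mean per-image test time in milliseconds, copied from the table above.
mean_time_ms = {
    "K40": {"VGG-16": 198, "ZF": 59},
    "Titan Black": {"VGG-16": 174, "ZF": 56},
    "Titan X": {"VGG-16": 151, "ZF": 59},
}


def frames_per_second(ms_per_image):
    """Convert a per-image latency in milliseconds to images per second."""
    return 1000.0 / ms_per_image


for gpu, nets in mean_time_ms.items():
    rates = ", ".join(
        f"{net}: {frames_per_second(ms):.1f} fps" for net, ms in nets.items()
    )
    print(f"{gpu}: {rates}")
```

For example, 198 ms per image on a K40 with VGG-16 corresponds to roughly 5 images per second.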
1. Run `fetch_data/fetch_model_ZF.m` to download an ImageNet-pre-trained ZF net.
2. Run `fetch_data/fetch_model_VGG16.m` to download an ImageNet-pre-trained VGG-16 net.
3. Run `experiments/script_faster_rcnn_VOC2007_ZF.m` to train a model with ZF net. It runs in four steps.
4. Run `experiments/script_faster_rcnn_VOC2007_VGG16.m` to train a model with VGG-16 net.

See `./experiments` for more settings.

Note: This documentation may contain links to third party websites, which are provided for your convenience only. Such third party websites are not under Microsoft’s control. Microsoft does not endorse or make any representation, guarantee or assurance regarding any third party website, content, service or product. Third party websites may be subject to the third party’s terms, conditions, and privacy statements.
If the automatic "fetch_data" fails, you may manually download resources from: