Open leo-p opened 7 years ago
Region-based Convolutional Neural Network (R-CNN), basically takes as input and image and several possibles objects (corresponding to Region of Interest) and score each of them.
The feature map is computed for the whole image and then for each region of interest a new fixed-length feature vector is computed using max-pooling. From it two predictions are made for classification and bounding-box offsets.
By sharing computation for RoIs of the same image and allowing simple SGD training it really improves performance training although at testing it's still not as fast as YOLO9000.
https://arxiv.org/pdf/1504.08083v2.pdf
This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection. Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks. Compared to previous work, Fast R-CNN employs several innovations to improve training and testing speed while also increasing detection accuracy. Fast R-CNN trains the very deep VGG16 network 9x faster than R-CNN, is 213x faster at test-time, and achieves a higher mAP on PASCAL VOC 2012. Compared to SPPnet, Fast R-CNN trains VGG16 3x faster, tests 10x faster, and is more accurate. Fast R-CNN is implemented in Python and C++ (using Caffe) and is available under the open-source MIT License at this https URL