ijkguo / mx-rcnn

Parallel Faster R-CNN implementation with MXNet.

Faster R-CNN in MXNet

example detections

Set up environment

Out-of-box inference models

Download any of the following models to the current directory and run python3 demo.py --dataset $Dataset$ --network $Network$ --params $MODEL_FILE$ --image $YOUR_IMAGE$ to run inference on a single image. For example: python3 demo.py --dataset voc --network vgg16 --params vgg16_voc0712.params --image myimage.jpg. Optionally add --gpu 0 to run on a GPU. Each network has its own configuration and each dataset has its own object class names, so both --network and --dataset must be passed explicitly on the command line.
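When the demo needs to run from a script rather than by hand, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: build_demo_command is a hypothetical helper, and it uses only the flags shown in the example above.

```python
import shlex


def build_demo_command(dataset, network, params, image, gpu=None):
    """Return the argv list for a single-image demo.py run.

    Hypothetical helper: it only mirrors the flags shown in the README.
    """
    argv = [
        "python3", "demo.py",
        "--dataset", dataset,
        "--network", network,
        "--params", params,
        "--image", image,
    ]
    # --gpu is optional; leave it out to use the script's default device.
    if gpu is not None:
        argv += ["--gpu", str(gpu)]
    return argv


cmd = build_demo_command("voc", "vgg16", "vgg16_voc0712.params", "myimage.jpg", gpu=0)
print(shlex.join(cmd))  # shell-quoted form of the command (Python 3.8+)
```

Passing the list to subprocess.run(cmd, check=True) from the repository root would launch the demo, assuming the model file and image exist at those paths.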

| Network | Dataset | Imageset | Reference | Result | Link |
| --- | --- | --- | --- | --- | --- |
| vgg16 | voc | 07/07 | 69.9 | 70.23 | Dropbox |
| vgg16 | voc | 07++12/07 | 73.2 | 75.97 | Dropbox |
| resnet101 | voc | 07++12/07 | 76.4 | 79.35 | Dropbox |
| vgg16 | coco | train2017/val2017 | 21.2 | 22.8 | Dropbox |
| resnet101 | coco | train2017/val2017 | 27.2 | 26.1 | Dropbox |
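To see at a glance how each released model compares with its reference number, the two score columns can be subtracted. This is a small sketch that copies the rows from the table above; it assumes the Reference and Result values in a row report the same metric, which the table does not state explicitly.

```python
# Rows copied from the table: (network, dataset, imageset, reference, result).
rows = [
    ("vgg16", "voc", "07/07", 69.9, 70.23),
    ("vgg16", "voc", "07++12/07", 73.2, 75.97),
    ("resnet101", "voc", "07++12/07", 76.4, 79.35),
    ("vgg16", "coco", "train2017/val2017", 21.2, 22.8),
    ("resnet101", "coco", "train2017/val2017", 27.2, 26.1),
]

# Positive delta: the released model scores above the reference value.
deltas = [round(result - reference, 2) for *_, reference, result in rows]

for (network, dataset, imageset, _, _), delta in zip(rows, deltas):
    print(f"{network:<10} {dataset:<5} {imageset:<18} {delta:+.2f}")
```

Four of the five rows come out positive; the resnet101 coco model is the only one below its reference.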

Download data and label

Create a directory named data and follow the data preparation instructions from py-faster-rcnn.

Download pretrained ImageNet models

Training and evaluation

To train, run python3 train.py --dataset $Dataset$ --network $Network$ --pretrained $IMAGENET_MODEL_FILE$ --gpus $GPUS$. For example: python3 train.py --dataset voc --network vgg16 --pretrained model/vgg16-0000.params --gpus 0,1.

To evaluate, run python3 test.py --dataset $Dataset$ --network $Network$ --params $MODEL_FILE$ --gpu $GPU$. For example: python3 test.py --dataset voc --network vgg16 --params model/vgg16-0010.params --gpu 0.
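The train and evaluate steps can be chained from a script. The sketch below is illustrative only: the three helper functions are hypothetical, they use only the flags shown above, and the prefix-NNNN.params file naming is inferred from the example file names (model/vgg16-0000.params, model/vgg16-0010.params). Whether train.py writes its checkpoints under the model/vgg16 prefix by default is an assumption to check against your own run.

```python
import shlex


def checkpoint_path(prefix, epoch):
    """Build a parameter file name such as model/vgg16-0010.params.

    Assumption: four-digit, zero-padded epoch, as in the README examples.
    """
    return f"{prefix}-{epoch:04d}.params"


def build_train_command(dataset, network, pretrained, gpus):
    """Return the argv list for train.py; gpus is a list of device ids."""
    return [
        "python3", "train.py",
        "--dataset", dataset,
        "--network", network,
        "--pretrained", pretrained,
        "--gpus", ",".join(str(g) for g in gpus),  # e.g. [0, 1] -> "0,1"
    ]


def build_test_command(dataset, network, params, gpu):
    """Return the argv list for test.py; evaluation takes a single GPU id."""
    return [
        "python3", "test.py",
        "--dataset", dataset,
        "--network", network,
        "--params", params,
        "--gpu", str(gpu),
    ]


train_cmd = build_train_command("voc", "vgg16", checkpoint_path("model/vgg16", 0), [0, 1])
test_cmd = build_test_command("voc", "vgg16", checkpoint_path("model/vgg16", 10), 0)
print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```

Note the asymmetry in the flags: training takes --gpus with a comma-separated list, while evaluation takes --gpu with a single id.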

History

Disclaimer

This repository uses code from MXNet, Fast R-CNN, Faster R-CNN, Caffe, tornadomeet/mx-rcnn, and the MS COCO API.
Thanks to tornadomeet for end-to-end experiments and to the MXNet contributors for helpful discussions.

References

  1. Tianqi Chen, Mu Li, Yutian Li, Min Lin, Naiyan Wang, Minjie Wang, Tianjun Xiao, Bing Xu, Chiyuan Zhang, and Zheng Zhang. "MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems." In Neural Information Processing Systems, Workshop on Machine Learning Systems, 2015.
  2. Ross Girshick. "Fast R-CNN." In Proceedings of the IEEE International Conference on Computer Vision, 2015.
  3. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. "Faster R-CNN: Towards real-time object detection with region proposal networks." In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016.
  4. Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. "Caffe: Convolutional architecture for fast feature embedding." In Proceedings of the ACM International Conference on Multimedia, 2014.
  5. Mark Everingham, Luc Van Gool, Christopher KI Williams, John Winn, and Andrew Zisserman. "The pascal visual object classes (voc) challenge." International journal of computer vision 88, no. 2 (2010): 303-338.
  6. Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. "ImageNet: A large-scale hierarchical image database." In Computer Vision and Pattern Recognition, IEEE Conference on, 2009.
  7. Karen Simonyan and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
  8. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep Residual Learning for Image Recognition." In Computer Vision and Pattern Recognition, IEEE Conference on, 2016.
  9. Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. "Microsoft COCO: Common Objects in Context." In European Conference on Computer Vision, pp. 740-755. Springer International Publishing, 2014.