Official Tensorflow implementation of drl-RPN by Aleksis Pirinen (email: aleksis.pirinen@ri.se) and Cristian Sminchisescu. The associated CVPR 2018 paper can be accessed here. A video demonstrating this work can be seen here.
The drl-RPN model is implemented on top of the publicly available TensorFlow VGG-16-based Faster R-CNN implementation by Xinlei Chen available here. See also the associated technical report An Implementation of Faster RCNN with Study for Region Sampling, as well as the original Faster R-CNN paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.
```
git clone https://github.com/aleksispi/drl-rpn-tf.git
```
The current code supports VGG16 models. Exactly as for the Faster R-CNN implementation by Xinlei Chen, we report numbers using a single model on a single convolution layer, so no multi-scale testing, no multi-stage bounding-box regression, no skip connections, and no extra inputs are used. The only data augmentation technique is left-right flipping during training, following the original Faster R-CNN.
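The left-right flipping augmentation can be sketched in a few lines. This is an illustrative NumPy version of the idea, not the repo's actual augmentation code (the function and variable names here are made up); it mirrors the image and remaps the box x-coordinates as in the original Faster R-CNN convention:

```python
import numpy as np

def flip_horizontal(image, boxes):
    """Mirror an image left-right and remap box x-coordinates.

    image: H x W x C array; boxes: N x 4 array of [x1, y1, x2, y2]
    in 0-indexed pixel coordinates.
    """
    width = image.shape[1]
    flipped = image[:, ::-1, :]
    new_boxes = boxes.copy()
    # After mirroring, the old right edge becomes the new left edge and vice versa.
    new_boxes[:, 0] = width - boxes[:, 2] - 1
    new_boxes[:, 2] = width - boxes[:, 0] - 1
    return flipped, new_boxes

img = np.arange(2 * 6 * 1).reshape(2, 6, 1)
boxes = np.array([[1, 0, 3, 1]])
f_img, f_boxes = flip_horizontal(img, boxes)
print(f_boxes)  # [[2 0 4 1]]
```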
We first re-ran some of the experiments reported here for Faster R-CNN, but trained the models longer to obtain further performance gains for our baseline models. The table below shows the resulting baseline (RPN) numbers, together with the corresponding results when using our drl-RPN detector with exploration penalty 0.05 during inference (models trained over different exploration penalties, as described in Section 5.1.2 in the paper) and posterior class-probability adjustments (Section 4.2 in our paper):
| Model | mAP (VOC 2007) | mAP (VOC 2012) |
|---|---|---|
| RPN | 76.5 | 74.2 |
| drl-RPN | 77.5 | 74.9 |
| drl-RPN (np) | 77.2 | 74.6 |
| drl-RPN (12-fix) | 77.6 | 75.0 |
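To build some intuition for the exploration penalty mentioned above, here is a deliberately simplified sketch (not the actual drl-RPN reward, which is defined in `reward_functions.py`): suppose each fixation adds a diminishing detection-reward increment but incurs a fixed per-fixation penalty; then a larger penalty makes shorter search trajectories optimal, which is exactly the speed-accuracy trade-off knob:

```python
def best_num_fixations(gains, penalty):
    """Return the number of fixations maximizing sum(gains[:n]) - penalty * n.

    gains: illustrative per-fixation detection-reward increments with
    diminishing returns -- not the real drl-RPN reward.
    """
    best_n, best_total, total = 0, 0.0, 0.0
    for n, g in enumerate(gains, start=1):
        total += g - penalty
        if total > best_total:
            best_n, best_total = n, total
    return best_n

# Diminishing per-fixation gains: later fixations help less.
gains = [0.50, 0.20, 0.10, 0.04, 0.02]
print(best_num_fixations(gains, penalty=0.05))  # small penalty -> longer search
print(best_num_fixations(gains, penalty=0.15))  # larger penalty -> stops earlier
```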
Note: see `reward_functions.py` for details.

All pretrained models (both the Faster R-CNN baselines and our drl-RPN models) for the numbers reported above in Detection Performance are available:
See "Setup data" on this page. Essentially, download the dataset you are interested in (e.g. PASCAL VOC) and add soft links in the `data` folder in the appropriate way (see https://askubuntu.com/questions/56339/how-to-create-a-soft-or-symbolic-link for a generic how-to on setting up soft links).
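As an illustration, the soft link can also be created programmatically. The folder names below (`VOCdevkit`, `data/VOCdevkit2007`) follow the usual convention of the underlying tf-faster-rcnn implementation and are assumptions here; check the linked "Setup data" instructions for the exact layout expected by this repo:

```python
import os
import tempfile

# Illustrative only: link a downloaded VOCdevkit into the repo's data folder.
# Replace these temp dirs with wherever you extracted PASCAL VOC and cloned the repo.
download_root = tempfile.mkdtemp()   # stands in for e.g. ~/datasets
repo_root = tempfile.mkdtemp()       # stands in for the drl-rpn-tf checkout

voc = os.path.join(download_root, "VOCdevkit")
os.makedirs(os.path.join(voc, "VOC2007"))
os.makedirs(os.path.join(repo_root, "data"))

# Equivalent of: ln -s /path/to/VOCdevkit data/VOCdevkit2007
link = os.path.join(repo_root, "data", "VOCdevkit2007")
os.symlink(voc, link)

print(os.path.islink(link))  # True
```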
Open `experiments/scripts/train_drl_rpn.sh`. Set `SAVE_PATH` and `WEIGHT_PATH` appropriately, and run the command

```
./experiments/scripts/train_drl_rpn.sh 0 pascal_voc_0712 1 20000 0 110000
```

to start training on VOC 2007+2012 trainval on GPU id 0 for a total of 110k iterations (see the code for more details). This will yield a drl-RPN model trained over two exploration penalties, enabling you to set the speed-accuracy trade-off at test time. See also `experiments/cfgs/drl-rpn-vgg16.yml` for some settings.

Next, make sure the `WEIGHTS_PATH` variable in `train_drl_rpn.sh` points to your drl-RPN model weights obtained in step 3 above. Then run

```
./experiments/scripts/train_drl_rpn.sh 0 pascal_voc_0712 1 0 1 110000
```

to train the posterior class-probability adjustment module for 110k iterations.

Testing is done via `experiments/scripts/test_drl_rpn.sh`. To test your model on the PASCAL VOC 2007 test set on GPU id 0, run

```
./experiments/scripts/test_drl_rpn.sh 0 pascal_voc_0712 1 1 0
```

(see the code for more details). If you want to change the exploration-accuracy trade-off parameter, see `experiments/cfgs/drl-rpn-vgg16.yml`. You may also specify there whether you want to visualize drl-RPN search trajectories (visualizations are saved in the top folder).

Here are solutions to some potential issues:
If you get an error like

```
import pycocotools._mask as _mask
ImportError: No module named _mask
Command exited with non-zero status 1
```

navigate to `data/coco/PythonAPI/` and run `make` in your Ubuntu terminal. Now it should be fine!

If you find this implementation or our CVPR 2018 paper interesting or helpful, please consider citing:
```
@inproceedings{pirinen2018deep,
  title={Deep reinforcement learning of region proposal networks for object detection},
  author={Pirinen, Aleksis and Sminchisescu, Cristian},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={6945--6954},
  year={2018}
}
```