Faster-RCNN-Torch

A Torch implementation of the Faster-RCNN model with ROI pooling and bilinear ROI pooling of region proposals. Essential modules have been adapted from the Densecap repository. This work was carried out at the department Informatik6 of RWTH Aachen University under the supervision of Mr. Harald Hanselmann, M.Sc.

Dependencies

Required:

1) Torch

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps;
./install.sh

2) After installing torch, you can install / update these dependencies by running the following:

luarocks install cutorch
luarocks install cunn
luarocks install cudnn
luarocks install lua-cjson
luarocks install hdf5
luarocks install cv #Requires OpenCV 3.1 TODO: Remove this and use the torch image module
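To see which of these rocks still need to be installed, a small check like the following can help. This is only a sketch: `missing_rocks` is a hypothetical helper, and it assumes that the `luarocks` on your PATH is the one that came with torch and that `luarocks show` fails for a rock that is not installed.

```shell
# Sketch: print the names of the rocks listed above that are not yet installed.
# missing_rocks is a hypothetical helper, not part of this repository.
missing_rocks() {
    for rock in "$@"; do
        # `luarocks show` is assumed to exit non-zero for a rock that is not installed
        luarocks show "$rock" >/dev/null 2>&1 || echo "$rock"
    done
}

if command -v luarocks >/dev/null 2>&1; then
    missing_rocks cutorch cunn cudnn lua-cjson hdf5 cv
else
    echo "luarocks not found; install torch first"
fi
```

Any name printed by the check can then be passed to `luarocks install`.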

Optional:

To use bilinear ROI pooling:

luarocks install stnbhwd

Pre-trained models for initialization

Only VGG16 and VGG1024 models are currently supported. To convert pre-trained caffe versions of imagenet or py-faster-rcnn models to lua tables, see here. Alternatively, you can download the following torch-compatible versions:

Imagenet (VGG16 + FCN)
FasterRCNN (VGG16 + RPN + FCN)

After creating or downloading the torch version of the pre-trained model, set the corresponding model path in 'config.lua':

init_model.vgg16 = "" -- Imagenet VGG16
init_model.frcnn_vgg16 = "init_models/frcnn_vgg16.t7" -- Faster RCNN VGG16
init_model.vgg1024 = "" -- Imagenet VGG1024
init_model.frcnn_vgg1024 = "" -- Faster RCNN VGG1024
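Before starting a run that depends on one of these paths, it can be worth confirming that the file is actually there. A minimal shell sketch: `check_model` is a hypothetical helper, and the path is simply the one from the example above.

```shell
# Hypothetical helper: report whether a model file referenced in config.lua exists.
check_model() {
    if [ -f "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Path taken from the config example; adjust it to match your own config.lua.
check_model "init_models/frcnn_vgg16.t7" || echo "create or download the model first"
```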

Running the script

The script 'run.lua' is the entry point for the object detection task. To see the available options, run

th run.lua -h

Examples

1) To train a Faster-RCNN VGG16 model

initialized with the imagenet VGG16 model,
with the usual ROI pooling of region proposals used in py-faster-rcnn,
for 100K iterations, multiplying the learning rate by 0.1 every 50K iterations,
writing a checkpoint every 10K iterations to 'checkpoint.t7':

th run.lua -max_iters 100000 -step 50000 -gamma 0.1 -save_checkpoint_every 10000 -checkpoint_path checkpoint.t7 -seed 1432

2) To fine-tune the model by initializing it with the caffe-trained Faster-RCNN VGG16 model, use the option -init_rpn. Before using this option, make sure that the torch Faster-RCNN VGG16 model has been created and that its path has been set in config.lua.

th run.lua -init_rpn

3) To use bilinear ROI pooling on the imagenet-initialized VGG16 model,

th run.lua -bilinear

4) To continue from a checkpoint,

th run.lua -checkpoint_start_from checkpoint.t7 -bilinear

5) To use the caffe-trained Faster-RCNN VGG1024 model,

th run.lua -init_rpn -vgg1024

6) To run only the evaluation of a saved checkpoint,

th run.lua -checkpoint_start_from checkpoint.t7 -eval
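The learning-rate options used in example 1 (-step 50000 -gamma 0.1) can be illustrated with a short calculation. This is a sketch under assumptions, not the code in run.lua: it assumes the step policy familiar from caffe, lr = base_lr * gamma^floor(iter / step). The helper name `lr_at` and the base learning rate of 0.001 are hypothetical and chosen only for illustration.

```shell
# Sketch of a step learning-rate schedule (assumed, not taken from run.lua):
#   lr(iter) = base_lr * gamma ^ floor(iter / step)
# lr_at is a hypothetical helper; base_lr = 0.001 is an illustrative value only.
lr_at() {
    # $1 = iteration, $2 = base_lr, $3 = gamma, $4 = step
    awk -v it="$1" -v base="$2" -v gamma="$3" -v step="$4" \
        'BEGIN { printf "%g\n", base * gamma ^ int(it / step) }'
}

# Print the rate at a few iterations for -step 50000 -gamma 0.1
for it in 0 49999 50000 99999; do
    echo "iter $it: lr $(lr_at "$it" 0.001 0.1 50000)"
done
```

Under these assumptions the rate stays at its initial value for the first 50K iterations and is ten times smaller for the remaining 50K.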

Performance

The pre-trained Faster-RCNN VGG16 model (i.e., trained for 100K iterations with py-faster-rcnn, where it achieves 69.1% mAP) reaches 68.2% mAP on the Pascal VOC 2007 test set when evaluated in torch.

To run only the evaluation of the caffe-trained Faster-RCNN VGG16 model (make sure that the torch-compatible Faster-RCNN VGG16 model is available):

th run.lua -init_rpn -eval

The performance metrics used are described here. Training or further fine-tuning caffe models in torch currently does not improve this performance.

