This is an experimental TensorFlow implementation of Faster R-CNN (TFFRCNN), mainly based on the work of smallcorgi and rbgirshick. I have reorganized the libraries under the lib
path so that each Python module is independent of the others, which makes the code easier to understand and rewrite.
For details about Faster R-CNN, please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks by Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun.
Requirements for TensorFlow (see: TensorFlow)
Python packages you might not have: cython, python-opencv, easydict (installing Anaconda is recommended)
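Before building, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of this repository; it assumes Python 3 and the usual import names (python-opencv is imported as cv2, cython as Cython). The helper name missing_modules is made up for illustration.

```python
import importlib.util

# Import names for the packages listed above (assumed, see the note in the text):
# python-opencv is imported as "cv2" and cython as "Cython".
REQUIRED_MODULES = ["tensorflow", "Cython", "cv2", "easydict"]


def missing_modules(names):
    """Return the module names from `names` that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

Any name it reports still needs to be installed before running make.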
Clone the Faster R-CNN repository
git clone https://github.com/CharlesShang/TFFRCNN.git
Build the Cython modules
cd TFFRCNN/lib
make # compile cython and roi_pooling_op, you may need to modify make.sh for your platform
After successfully completing basic installation, you'll be ready to run the demo.
To run the demo
cd $TFFRCNN
python ./faster_rcnn/demo.py --model model_path
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.
Download the training, validation, test data and VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
It should have this basic structure
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
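A quick way to confirm the extraction produced the structure above is to check for the two named sub-directories. This is only a sketch; the function name missing_voc_dirs is hypothetical and the check covers nothing beyond the directories listed here.

```python
import os

# Sub-directories named in the listing above, relative to the VOCdevkit root.
EXPECTED_SUBDIRS = ["VOCcode", "VOC2007"]


def missing_voc_dirs(devkit_root):
    """Return the expected sub-directories that are absent under devkit_root."""
    return [
        name
        for name in EXPECTED_SUBDIRS
        if not os.path.isdir(os.path.join(devkit_root, name))
    ]
```

An empty list means both directories were found, for example missing_voc_dirs("VOCdevkit") after extracting in the current directory.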
Create symlinks for the PASCAL VOC dataset
cd $TFFRCNN/data
ln -s $VOCdevkit VOCdevkit2007
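If you prefer to create the link from a setup script, the same step can be sketched in Python. The helper below is an illustration only (link_devkit is not part of this repository); unlike plain ln -s, it replaces an existing link of the same name, and it assumes a platform that supports symlinks.

```python
import os


def link_devkit(devkit_dir, data_dir, link_name="VOCdevkit2007"):
    """Create data_dir/link_name as a symlink pointing at devkit_dir."""
    link_path = os.path.join(data_dir, link_name)
    if os.path.islink(link_path):
        # Replace a stale link instead of failing (differs from plain `ln -s`).
        os.remove(link_path)
    os.symlink(os.path.abspath(devkit_dir), link_path)
    return link_path
```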
Download the pre-trained VGG16 model and save it as ./data/pretrain_model/VGG_imagenet.npy
Run training scripts
cd $TFFRCNN
python ./faster_rcnn/train_net.py --gpu 0 --weights ./data/pretrain_model/VGG_imagenet.npy --imdb voc_2007_trainval --iters 70000 --cfg ./experiments/cfgs/faster_rcnn_end2end.yml --network VGGnet_train --set EXP_DIR exp_dir
Run profiling
cd $TFFRCNN
# install a visualization tool
sudo apt-get install graphviz
./experiments/profiling/run_profiling.sh
# generates the image ./experiments/profiling/profile.png
Download the KITTI detection dataset
http://www.cvlibs.net/datasets/kitti/eval_object.php
Extract all of the archives into ./TFFRCNN/data/
so that the directory structure looks like this:
KITTI
    |-- training
        |-- image_2
            |-- [000000-007480].png
        |-- label_2
            |-- [000000-007480].txt
    |-- testing
        |-- image_2
            |-- [000000-007517].png
        |-- label_2
            |-- [000000-007517].txt
Convert KITTI into Pascal VOC format
cd $TFFRCNN
./experiments/scripts/kitti2pascalvoc.py \
--kitti $TFFRCNN/data/KITTI --out $TFFRCNN/data/KITTIVOC
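To give a sense of what the conversion involves, here is a minimal sketch that turns one KITTI label line into a Pascal VOC style object element. It is not the code in kitti2pascalvoc.py. It relies on the KITTI label layout (object type first, truncation second, 2D box left/top/right/bottom in fields 5 to 8); the lower-cased class name, the 0.5 truncation threshold, and the rounding to whole pixels are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def kitti_line_to_voc_object(line):
    """Convert one KITTI label line into a VOC-style <object> element."""
    fields = line.split()
    name = fields[0].lower()  # assumption: class names are lower-cased
    truncated = float(fields[1])
    left, top, right, bottom = (float(v) for v in fields[4:8])

    obj = ET.Element("object")
    ET.SubElement(obj, "name").text = name
    # VOC stores truncation as a 0/1 flag; the 0.5 threshold is an assumption.
    ET.SubElement(obj, "truncated").text = str(int(truncated > 0.5))
    ET.SubElement(obj, "difficult").text = "0"

    bndbox = ET.SubElement(obj, "bndbox")
    for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                          (left, top, right, bottom)):
        # Rounded to whole pixels; no 0-based to 1-based offset is applied.
        ET.SubElement(bndbox, tag).text = str(int(round(value)))
    return obj
```

A full annotation file would wrap such elements in an annotation root together with the image file name and size.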
The output directory looks like this:
KITTIVOC
    |-- Annotations
        |-- [000000-007480].xml
    |-- ImageSets
        |-- Main
            |-- [train|val|trainval].txt
    |-- JPEGImages
        |-- [000000-007480].jpg
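After converting, a simple sanity check is that every annotation has a matching image and vice versa. The sketch below assumes only the Annotations and JPEGImages directories shown above; the function name unmatched_ids is hypothetical.

```python
import os


def unmatched_ids(voc_root):
    """Return (ids with XML but no JPG, ids with JPG but no XML), both sorted."""

    def stems(subdir, ext):
        # Collect file names without extension for one sub-directory.
        folder = os.path.join(voc_root, subdir)
        return {
            os.path.splitext(name)[0]
            for name in os.listdir(folder)
            if name.endswith(ext)
        }

    xml_ids = stems("Annotations", ".xml")
    jpg_ids = stems("JPEGImages", ".jpg")
    return sorted(xml_ids - jpg_ids), sorted(jpg_ids - xml_ids)
```

Two empty lists indicate that the annotation and image sets line up.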
Training on KITTIVOC is just like training on Pascal VOC 2007
python ./faster_rcnn/train_net.py \
--gpu 0 \
--weights ./data/pretrain_model/VGG_imagenet.npy \
--imdb kittivoc_train \
--iters 160000 \
--cfg ./experiments/cfgs/faster_rcnn_kitti.yml \
--network VGGnet_train