Sanster / tf_ctpn

Tensorflow CTPN
MIT License

tf_ctpn

A TensorFlow implementation of CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network.

Most of the code in this project is adapted from CTPN, tf-faster-rcnn, and text-detection-ctpn.

Results of the pre-trained model on ICDAR13:

| Net | Dataset | Recall | Precision | Hmean |
| --- | --- | --- | --- | --- |
| Original CTPN | ICDAR13 training data + ? | 73.72% | 92.77% | 82.15% |
| vgg16 | MLT17 latin/chn + ICDAR13 training data | 74.26% | 82.46% | 78.15% |
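
For reference, Hmean in ICDAR-style evaluation is the harmonic mean of recall and precision. The sketch below recomputes it from the recall and precision values in the table; the function name `hmean` is only illustrative and is not taken from this repo. Because the table values are already rounded, the recomputed numbers may differ from the table in the last digit.

```python
def hmean(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both in the same unit)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision) pairs in percent, copied from the table above
rows = [
    (73.72, 92.77),
    (74.26, 82.46),
]

for recall, precision in rows:
    print(f"recall={recall} precision={precision} hmean={hmean(recall, precision):.2f}")
```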

If you want an end-to-end OCR service, check this repo: https://github.com/Sanster/DeepOcrService

Setup

Install dependencies:

pip3 install -r requirements.txt

Build the Cython part, which is needed for both the demo and training:

cd lib/
make clean
make

Quick start

Download the pre-trained CTPN model (based on vgg16) from Google Drive and put it in output/vgg16/voc_2007_trainval/default. Then run:

python3 tools/demo.py

This model was trained on a 1080Ti for 80k iterations at commit dc533e030e5431212c1d4dbca0bcd7e594a8a368.

Training

  1. Download the training dataset from Google Drive. This dataset contains 3727 images from MLT17 (latin+chinese) and the ICDAR13 training set. Ground truth anchors are generated from the minAreaRect of each text area; see eragonruan/text-detection-ctpn#issues215 for more details. You can use tools/mlt17_to_voc.py to make your own training data. Put the downloaded data in ./data/VOCdevkit2007/VOC2007

  2. Download the pre-trained slim vgg16 model from here. Put the pre-trained model in ./data/pretrained_model

  3. Start training

    python3 tools/trainval_net.py

    The output checkpoint file will be saved at ./output/vgg16/voc_2007_trainval/default

  4. Start tensorboard

    tensorboard --logdir=./tensorboard
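
Before starting a long training run, it can help to confirm that the data from steps 1 and 2 is where the training script expects it. The following is a minimal sketch, not part of this repo: the helper `missing_dirs` and the `EXPECTED_DIRS` list are hypothetical names, and the only assumption is the two paths given in the steps above.

```python
from pathlib import Path

# Paths taken from steps 1 and 2 above; adjust them if your layout differs.
EXPECTED_DIRS = [
    "data/VOCdevkit2007/VOC2007",  # training dataset (step 1)
    "data/pretrained_model",       # pre-trained slim vgg16 model (step 2)
]


def missing_dirs(root=".", expected=EXPECTED_DIRS):
    """Return the expected directories that are missing or empty under root."""
    missing = []
    for rel in expected:
        path = Path(root) / rel
        # A directory that exists but holds no files is treated as missing.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_dirs()
    if problems:
        print("Missing or empty:", ", ".join(problems))
    else:
        print("Data layout looks ready for tools/trainval_net.py")
```

Run it from the repository root so that the relative paths resolve the same way they do for the training script.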

Test on ICDAR13

python3 tools/icdar.py --img_dir=path/to/ICDAR13/Challenge2_Test_Task12_Images/ -c=ICDAR13

When it finishes, a submit.zip file is generated in data/ICDAR_submit. Then run:

cd tools/ICDAR13
# use python2
python script.py -g=gt.zip -s=submit.zip