A TensorFlow implementation of CTPN: Detecting Text in Natural Image with Connectionist Text Proposal Network.
Most of the code in this project is adapted from CTPN, tf-faster-rcnn and text-detection-ctpn.
Results of the pretrained model on ICDAR13:
| Net | Dataset | Recall | Precision | Hmean |
|---|---|---|---|---|
| Original CTPN | ICDAR13 training data + ? | 73.72% | 92.77% | 82.15% |
| vgg16 | MLT17 latin/chn + ICDAR13 training data | 74.26% | 82.46% | 78.15% |
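For reference, Hmean is just the harmonic mean (F-measure) of recall and precision; a minimal check of the table values (small last-digit differences come from the recall/precision figures themselves being rounded):

```python
def hmean(recall, precision):
    """Harmonic mean (F-measure) of recall and precision."""
    return 2 * recall * precision / (recall + precision)

print(f"{hmean(0.7372, 0.9277):.4f}")  # ~0.8216, table reports 82.15%
print(f"{hmean(0.7426, 0.8246):.4f}")  # ~0.7814, table reports 78.15%
```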
If you want an end-to-end OCR service, check out this repo: https://github.com/Sanster/DeepOcrService
Install dependencies:
pip3 install -r requirements.txt
Build the Cython extensions needed by both the demo and training:
cd lib/
make clean
make
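A quick way to confirm the extensions built correctly is to try importing them from inside lib/; the module names below are guesses based on the usual tf-faster-rcnn layout, so adjust them to whatever your build actually produces:

```python
# Run from inside lib/ after `make`. Module names are assumptions
# (typical tf-faster-rcnn style builds); change them if your build differs.
import importlib

for name in ("utils.cython_bbox", "nms.cpu_nms"):
    try:
        importlib.import_module(name)
        print(f"OK: {name}")
    except ImportError as err:
        print(f"NOT built: {name} ({err})")
```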
Download the pre-trained CTPN model (based on vgg16) from Google Drive and put it in output/vgg16/voc_2007_trainval/default.
Run the demo:
python3 tools/demo.py
This model was trained on a 1080Ti for 80k iterations, using this commit: dc533e030e5431212c1d4dbca0bcd7e594a8a368.
Download the training dataset from Google Drive.
This dataset contains 3727 images from MLT17 (Latin + Chinese) and the ICDAR13 training set.
Ground truth anchors are generated from the minAreaRect of each text area; see eragonruan/text-detection-ctpn#issues215 for more details. You can use tools/mlt17_to_voc.py to build your own training data.
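The snippet below is only an illustrative sketch of that idea, not the repo's exact code: take OpenCV's minAreaRect of a text polygon and slice its horizontal extent into fixed-width (16 px) CTPN-style ground-truth boxes.

```python
import cv2
import numpy as np

def polygon_to_ctpn_boxes(polygon, anchor_width=16):
    """Sketch: split one text region into fixed-width ground-truth boxes.
    `polygon` is an (N, 2) array of x, y points outlining the text area."""
    pts = np.asarray(polygon, dtype=np.float32)
    rect = cv2.minAreaRect(pts)        # ((cx, cy), (w, h), angle)
    corners = cv2.boxPoints(rect)      # 4 corner points of the rotated rect
    x_min, y_min = corners.min(axis=0)
    x_max, y_max = corners.max(axis=0)

    # Slice the horizontal extent into anchor_width-pixel wide boxes.
    boxes = []
    x = x_min
    while x < x_max:
        boxes.append([x, y_min, min(x + anchor_width, x_max), y_max])
        x += anchor_width
    return np.array(boxes)

# A slightly rotated text region as an example.
print(polygon_to_ctpn_boxes([(10, 20), (200, 28), (198, 60), (8, 52)]))
```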
Put the downloaded data in ./data/VOCdevkit2007/VOC2007
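The folder is expected to follow the standard Pascal VOC layout; a minimal sanity check (the subfolder names below are the usual VOC ones, verify them against the downloaded archive):

```python
import os

voc_root = "./data/VOCdevkit2007/VOC2007"
# Standard Pascal VOC subfolders; check they exist after extracting the data.
for sub in ("Annotations", "JPEGImages", "ImageSets/Main"):
    path = os.path.join(voc_root, sub)
    print(f"{path}: {'ok' if os.path.isdir(path) else 'MISSING'}")
```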
Download the pre-trained slim vgg16 model from here.
Put the downloaded model files in ./data/pretrained_model
Start training
python3 tools/trainval_net.py
Output checkpoints will be saved in ./output/vgg16/voc_2007_trainval/default
Start tensorboard
tensorboard --logdir=./tensorboard
To evaluate on the ICDAR13 test set, run:
python3 tools/icdar.py --img_dir=path/to/ICDAR13/Challenge2_Test_Task12_Images/ -c=ICDAR13
When it finishes, a submit.zip file will be generated in data/ICDAR_submit, then run:
cd tools/ICDAR13
# use python2
python script.py -g=gt.zip -s=submit.zip
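For reference, the submit.zip produced above follows the usual ICDAR13 localization format: a flat zip of one res_<image_name>.txt per test image, each line holding one x_min,y_min,x_max,y_max box. A hedged sketch of packaging such results by hand (tools/icdar.py already does this for you; the directory name is hypothetical):

```python
import os
import zipfile

def pack_submission(result_dir, zip_path="submit.zip"):
    """Sketch: zip per-image ICDAR13 result files (res_<image>.txt,
    one `x_min,y_min,x_max,y_max` line per detected box) into a flat archive."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in sorted(os.listdir(result_dir)):
            if name.startswith("res_") and name.endswith(".txt"):
                zf.write(os.path.join(result_dir, name), arcname=name)

# pack_submission("data/ICDAR_submit")  # hypothetical directory of res_*.txt files
```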