This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.
Thanks to the author (@whai362) for the awesome work!
Pretrained model, trained on ICDAR 2015 (training set) + ICDAR 2017 MLT (training set):
Baidu Yun extraction code: pffd
This model does not reach the accuracy reported in the paper; it is only a reference. You can fine-tune it, or build further optimizations on top of this code.
Dataset | Precision (%) | Recall (%) | F-measure (%) |
---|---|---|---|
ICDAR 2015(val) | 74.61 | 80.93 | 77.64 |
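As a quick consistency check on the table, the F-measure is the harmonic mean of precision and recall. The snippet below is a minimal sketch of that formula; the function name `f_measure` is illustrative and is not part of this repository.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the ICDAR 2015 (val) row above, in percent.
print(round(f_measure(74.61, 80.93), 2))  # prints 77.64
```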
To train the model, provide the dataset path. The dataset directory must contain a separate ground-truth (gt) text file for each image, and each gt text file must have the same name as its image file.
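Before training, it can help to verify that every image has a matching gt file. The following is a minimal sketch, not part of this repository: it assumes gt files use a `.txt` extension, that images are `.jpg`, `.jpeg`, or `.png`, and that all files sit directly in the dataset directory. The helper name `find_images_without_gt` is hypothetical.

```python
import os

# Assumed image extensions; adjust to match your dataset.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_images_without_gt(data_dir, gt_ext=".txt"):
    """Return the sorted image file names in data_dir that lack a same-named gt file."""
    names = os.listdir(data_dir)
    # Base names (without extension) of all gt text files.
    gt_stems = {
        os.path.splitext(n)[0] for n in names if n.lower().endswith(gt_ext)
    }
    missing = []
    for name in sorted(names):
        stem, ext = os.path.splitext(name)
        if ext.lower() in IMAGE_EXTS and stem not in gt_stems:
            missing.append(name)
    return missing
```

For example, calling `find_images_without_gt("./data/ocr/icdar2015/")` would return an empty list if the directory is laid out as described above.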
Then run train.py like:
python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/
If you have more than one GPU, you can pass the GPU ids to gpu_list (like --gpu_list=0,1,2,3).
Note:
Run eval.py like:
python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/
A text file and a result image will then be written to the output path.
If you encounter a problem, check the existing issues first; if none of them covers it, open a new issue.
@rkshuai found a bug in the feature concatenation in model.py.
If this repository helps you, please star it. Thanks.