
robin

robin is a RObust document image BINarization tool, written in Python.

Tech

robin uses a number of open source projects to work properly:

Installation

robin requires Python v3.5+ to run.

Get robin, install the dependencies from requirements.txt, download the datasets and weights, and you are ready to binarize documents!

$ git clone https://github.com/masyagin1998/robin.git
$ cd robin
$ pip install -r requirements.txt

HowTo

Robin

robin consists of two main files: src/unet/train.py, which generates weights for the U-net model from input 128x128 pairs of original and ground-truth images, and src/unet/binarize.py, which binarizes a group of input document images. The model works with 128x128 images, so the binarization tool first splits the input images into 128x128 pieces. You can easily rewrite the code for a different U-net input size, but research shows that 128x128 is the best size.
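As a rough illustration of that splitting step, the sketch below pads a grayscale image and cuts it into 128x128 tiles with NumPy. It is only a sketch, not robin's actual code: the helper name split_into_tiles and the edge-padding strategy are my own assumptions, and the real logic lives in src/unet/binarize.py.

import numpy as np

TILE = 128  # robin's U-net works on 128x128 inputs

def split_into_tiles(img, tile=TILE):
    """Pad a grayscale image to a multiple of `tile` and cut it into tiles.
    Hypothetical helper for illustration only."""
    h, w = img.shape[:2]
    pad_h = (tile - h % tile) % tile
    pad_w = (tile - w % tile) % tile
    padded = np.pad(img, ((0, pad_h), (0, pad_w)), mode='edge')
    tiles = []
    for y in range(0, padded.shape[0], tile):
        for x in range(0, padded.shape[1], tile):
            tiles.append(padded[y:y + tile, x:x + tile])
    return tiles, (h, w)  # original size is needed to reassemble the result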

Metrics

You should know how good your binarization tool is, so I made a script that automates the calculation of four DIBCO metrics: F-measure, pseudo F-measure, PSNR and DRD: src/metrics/metrics.py. Unfortunately it requires two DIBCO tools, weights.exe and metrics.exe, which can only be run on Windows (I tried to run them on Linux with Wine, but couldn't, because one of their dependencies is the MATLAB MCR 9.0 exe).
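For intuition about two of these metrics, here is a minimal NumPy sketch of F-measure and PSNR for 0/255 binary images, with text assumed to be black. It is only an illustration, not a substitute for the official DIBCO tools, which also compute the more involved pseudo F-measure and DRD.

import numpy as np

def binarization_metrics(pred, gt):
    """Toy F-measure and PSNR for 0/255 binary images (text = black).
    Illustration only; use the DIBCO tools for reportable numbers."""
    pred_text, gt_text = (pred == 0), (gt == 0)
    tp = np.logical_and(pred_text, gt_text).sum()
    fp = np.logical_and(pred_text, ~gt_text).sum()
    fn = np.logical_and(~pred_text, gt_text).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    psnr = 10 * np.log10(255.0 ** 2 / mse) if mse else float('inf')
    return f_measure, psnr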

Dataset

It is really hard to find a good document binarization dataset (DBD), so here I give links to 3 datasets, marked up in a single convenient format. All input image names satisfy the [\d]*_in.png regexp, and all ground-truth image names satisfy the [\d]*_gt.png regexp.
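As a small illustration of this naming convention, the following sketch (a hypothetical helper, not part of robin) pairs each *_in.png input with its *_gt.png ground truth:

import os
import re

def pair_dataset_images(folder):
    """Pair NNN_in.png inputs with NNN_gt.png ground truths by their shared number."""
    pairs = []
    for name in sorted(os.listdir(folder)):
        m = re.fullmatch(r'(\d+)_in\.png', name)
        if m:
            gt_name = m.group(1) + '_gt.png'
            if os.path.exists(os.path.join(folder, gt_name)):
                pairs.append((name, gt_name))
    return pairs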

I also provide two simple scripts: src/dataset/dataset.py and src/dataset/stsl-download.py. The first quickly generates train-validation-test data from the provided datasets; the second can be used to get interesting training data from the Trinity-Sergius Lavra official site. The expected workflow is to train a simple robin on the marked-up dataset, create a new dataset with stsl-download.py and binarize.py, correct the generated ground truths, and then train robin again with these new pairs of input and ground-truth images.

Articles

While I was working on robin, I constantly read scientific articles. Here I give links to all of them.

Weights

Training the neural network is not cheap, because you need a powerful GPU and CPU, so I provide some pretrained weights (for training I used two setups: Nvidia 1050 Ti 4 GB + Intel Core i7-7700HQ + 8 GB RAM, and Nvidia 1080 Ti SLI + Intel Xeon E2650 + 128 GB RAM).

Examples of work

Four example pairs of original document images and their binarized outputs (in/out).

Bugs

Many thanks to:

Referenced or mentioned by: