robin is a RObust document image BINarization tool, written in Python.
robin uses a number of open-source projects to work properly (see requirements.txt).
robin requires Python v3.5+ to run.
Get robin, install the dependencies from requirements.txt, download the datasets and weights, and you are ready to binarize documents!
```sh
$ git clone https://github.com/masyagin1998/robin.git
$ cd robin
$ pip install -r requirements.txt
```
robin consists of two main scripts: `src/unet/train.py`, which generates weights for the U-net model from 128x128 pairs of original and ground-truth images, and `src/unet/binarize.py`, which binarizes a group of input document images. The model works with 128x128 images, so the binarization tool first splits each input image into 128x128 pieces. You can easily adapt the code to a different U-net input size, but research shows that 128x128 is the best size.
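As an illustration of this splitting step, here is a minimal sketch (not the repository's actual code; the function names and edge-padding strategy are my own assumptions) of cutting an image into 128x128 tiles and stitching the results back together:

```python
import numpy as np

TILE = 128  # U-net input size used by robin

def split_into_tiles(img):
    """Pad a grayscale image to a multiple of 128 and cut it into 128x128 tiles."""
    h, w = img.shape
    pad_h = (TILE - h % TILE) % TILE
    pad_w = (TILE - w % TILE) % TILE
    padded = np.pad(img, ((0, pad_h), (0, pad_w)), mode='edge')
    tiles = [padded[y:y + TILE, x:x + TILE]
             for y in range(0, padded.shape[0], TILE)
             for x in range(0, padded.shape[1], TILE)]
    return tiles, padded.shape

def merge_tiles(tiles, padded_shape, original_shape):
    """Stitch 128x128 tiles back into one image and crop it to the original size."""
    out = np.zeros(padded_shape, dtype=tiles[0].dtype)
    i = 0
    for y in range(0, padded_shape[0], TILE):
        for x in range(0, padded_shape[1], TILE):
            out[y:y + TILE, x:x + TILE] = tiles[i]
            i += 1
    return out[:original_shape[0], :original_shape[1]]
```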
You should know how good your binarization tool is, so I made a script, `src/metrics/metrics.py`, that automates the calculation of the four DIBCO metrics: F-measure, pseudo F-measure, PSNR and DRD. Unfortunately, it requires two DIBCO tools, `weights.exe` and `metrics.exe`, which run only on Windows (I tried to run them on Linux with Wine, but couldn't, because one of their dependencies is the MATLAB MCR 9.0 executable).
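For a rough sense of what one of these metrics measures, below is a minimal PSNR sketch for 8-bit grayscale images; this is my own simplified version, and the official DIBCO executables should still be used for any reported numbers:

```python
import numpy as np

def psnr(binarized, ground_truth, max_value=255.0):
    """Peak signal-to-noise ratio between a binarized image and its ground truth."""
    binarized = binarized.astype(np.float64)
    ground_truth = ground_truth.astype(np.float64)
    mse = np.mean((binarized - ground_truth) ** 2)
    if mse == 0:
        return float('inf')  # identical images
    return 10.0 * np.log10((max_value ** 2) / mse)
```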
It is really hard to find a good document binarization dataset (DBD), so here I give links to three datasets, marked up in a single convenient format. All input image names match the `[\d]*_in.png` regexp, and all ground-truth image names match the `[\d]*_gt.png` regexp.
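A minimal sketch (the directory layout and helper name are my own assumptions) of pairing input and ground-truth files by these name patterns:

```python
import os
import re

IN_RE = re.compile(r'^(\d*)_in\.png$')

def pair_images(dataset_dir):
    """Return (input_path, ground_truth_path) pairs found in a dataset directory."""
    pairs = []
    for name in sorted(os.listdir(dataset_dir)):
        match = IN_RE.match(name)
        if match:
            gt_path = os.path.join(dataset_dir, match.group(1) + '_gt.png')
            if os.path.exists(gt_path):
                pairs.append((os.path.join(dataset_dir, name), gt_path))
    return pairs
```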
I also provide two simple scripts: `src/dataset/dataset.py` and `src/dataset/stsl-download.py`. The first quickly generates train-validation-test data from the provided datasets; the second can be used to get interesting training data from the Trinity-Sergius Lavra official site. The expected workflow is to train your robin on a marked-up dataset, create a new dataset with `stsl-download.py` and `binarize.py`, correct the generated ground truths, and train robin again on these new pairs of input and ground-truth images.
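The real interface of `dataset.py` may differ; the sketch below only illustrates the idea of shuffling image pairs into train, validation and test subsets (the split ratios and seed are my own choice):

```python
import random

def split_pairs(pairs, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle (input, ground-truth) pairs and split them into train/validation/test."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_train = int(len(pairs) * train_frac)
    n_val = int(len(pairs) * val_frac)
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])
```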
While I was working on robin, I constantly read scientific articles; here are links to all of them.
Training a neural network is not cheap, because you need a powerful GPU and CPU, so I provide some pretrained weights (for training I used two setups: Nvidia 1050 Ti 4 GB + Intel Core i7-7700HQ + 8 GB RAM and Nvidia 1080 Ti SLI + Intel Xeon E2650 + 128 GB RAM).
The weights were trained on DIBCO and borders data for 256 epochs with batch size 128 and augmentation enabled. They are trained for A4 300 DPI images, so your input data must have similarly good resolution.

| Original image | Binarized |
|---|---|
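This is not robin's training code, but a minimal Keras-style sketch of what training with these settings (256 epochs, batch size 128, synchronized augmentation of images and ground truths) can look like; the toy model, random data, and augmentation parameters here are purely illustrative stand-ins for what lives in `src/unet/train.py`:

```python
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def build_toy_model():
    """Tiny stand-in for robin's real U-net."""
    inp = layers.Input(shape=(128, 128, 1))
    x = layers.Conv2D(8, 3, activation='relu', padding='same')(inp)
    out = layers.Conv2D(1, 1, activation='sigmoid')(x)
    model = models.Model(inp, out)
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model

# Random arrays standing in for 128x128 original/ground-truth tiles.
x_train = np.random.rand(256, 128, 128, 1).astype('float32')
y_train = (np.random.rand(256, 128, 128, 1) > 0.5).astype('float32')

# Two generators with the same seed apply identical random transforms
# to the images and to their ground truths.
aug_args = dict(rotation_range=10, horizontal_flip=True)
image_gen = ImageDataGenerator(**aug_args).flow(x_train, batch_size=128, seed=1)
mask_gen = ImageDataGenerator(**aug_args).flow(y_train, batch_size=128, seed=1)

model = build_toy_model()
model.fit(zip(image_gen, mask_gen),
          steps_per_epoch=len(x_train) // 128,
          epochs=256)
```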
Keras has some problems with parallel data augmentation: it creates too many processes. I hope this will be fixed soon, but for now it is better to keep the `--extraprocesses` flag at its default value of zero.