Multi-scale DSNs for degraded document image binarization
We propose a novel supervised binarization method based on learning multi-scale deep supervised networks (DSNs). Our model is trained directly from image regions, using pixel values as inputs and the binary ground truth as labels. By extracting high-level features, the networks can separate text pixels from background noise and thus handle the severe degradations that occur in document images. Compared to traditional algorithms, the binary images generated by our method have a cleaner background and better-preserved strokes. The proposed approach achieves state-of-the-art results on the widely used DIBCO datasets.
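As a rough illustration of the multi-scale idea, the per-scale text-probability maps produced by the networks can be fused into a single binary result. The sketch below is a minimal NumPy example; the simple averaging rule, the threshold value, and the function name are assumptions for illustration only, not the exact fusion step used in our paper.

import numpy as np

def fuse_multiscale_probabilities(prob_maps, threshold=0.5):
    # prob_maps: list of 2-D arrays in [0, 1], one text-probability map per
    # DSN scale, all resized to the original image resolution (assumed inputs).
    fused = np.mean(np.stack(prob_maps, axis=0), axis=0)  # average across scales
    # Usual document-binarization convention: 0 = text, 255 = background.
    return np.where(fused >= threshold, 0, 255).astype(np.uint8)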
GPU: GTX 980Ti, GTX 1070.
To reproduce our results:
The pre-trained DSN models are provided in the conv_3, conv_4, and conv_5 folders. Run the Python script multiscale_DSN.py to binarize document images. These pre-trained models have been tested on the H-DIBCO 2016 and DIBCO 2011 datasets.
$ python multiscale_DSN.py arg1 arg2
arg1: Path to the input grayscale or color document image
arg2: Path where the resulting binary image is written
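For example, to binarize a single degraded page (the file names below are placeholders):

$ python multiscale_DSN.py degraded_page.png binarized_page.png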
This code is based on Caffe and the DSN (deep supervised network) implementation.
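For reference, loading one of the pre-trained scale networks with the pycaffe interface looks roughly like the sketch below. The deploy/weights file names, the 'data' input blob name, and the 256x256 region size are assumptions; see multiscale_DSN.py for the actual paths and preprocessing.

import numpy as np
import caffe

caffe.set_mode_gpu()  # the reported results were obtained on GTX 980Ti / GTX 1070 GPUs

# Assumed file names for one scale network; check multiscale_DSN.py for the real ones.
net = caffe.Net('conv_3/deploy.prototxt', 'conv_3/weights.caffemodel', caffe.TEST)

# Load a grayscale document image and feed one region to the network.
image = caffe.io.load_image('degraded_page.png', color=False)  # H x W x 1, float in [0, 1]
region = image[:256, :256, :].transpose(2, 0, 1)[np.newaxis]   # 1 x 1 x 256 x 256

net.blobs['data'].reshape(*region.shape)  # 'data' blob name is an assumption
net.blobs['data'].data[...] = region
outputs = net.forward()                   # output blobs hold per-pixel text probabilities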
If you use the code, models, or data provided here in a publication, please cite our paper:
@article{vqnhat,
  author  = {Vo, Quang Nhat and Kim, Soo Hyung and Yang, Hyung Jeong and Lee, Gueesang},
  title   = {Binarization of degraded document images based on hierarchical deep supervised network},
  journal = {Pattern Recognition},
  year    = {2017}
}