wanghaisheng / awesome-ocr

A curated list of promising OCR resources
http://wanghaisheng.github.io/ocr-arxiv-daily/
MIT License
1.66k stars 351 forks

Robust, Simple Page Segmentation using Hybrid Convolutional MDLSTM Networks #104

wanghaisheng closed 6 years ago

wanghaisheng commented 6 years ago

Abstract: Analyzing and segmenting scanned documents is an important step in optical character recognition. The problem is difficult because of the complexity of 2D layouts, the small tolerance of segmentation errors in the output, and the relatively small amount of labeled training data available. Traditional approaches have relied on a combination of sophisticated geometric algorithms, domain knowledge, heuristics, and carefully tuned parameters. This paper describes the use of deep neural networks, in particular a combination of convolutional and multidimensional LSTM networks, for document image segmentation, and demonstrates that relatively simple networks are capable of fast, reliable text line segmentation and document layout analysis even on complex and noisy inputs, without manual parameter tuning or heuristics. The method is easily adaptable to new datasets by retraining, and an open source implementation is available.

Robust_ Simple Page Segmentation Using Hybrid Convolutional MDLSTM Networks.pdf