Abstract—Analyzing and segmenting scanned documents isan important step in optical character recognition. The problemis difficult because of the complexity of 2D layouts, the small toleranceof segmentation errors in the output, and the relativelysmall amount of labeled training data available. Traditionalapproaches have relied on a combination of sophisticated geometricalgorithms, domain knowledge, heuristics, and carefullytuned parameters. This paper describes the use of deep neuralnetworks, in particular a combination of convolutional andmultidimensional LSTM networks, for document image anddemonstrates that relatively simple networks are capable of fast,reliable text line segmentation and document layout analysiseven on complex and noisy inputs, without manual parametertuning or heuristics. The method is easily adaptable to newdatasets by retraining and an open source implementation isavailable
Abstract—Analyzing and segmenting scanned documents isan important step in optical character recognition. The problemis difficult because of the complexity of 2D layouts, the small toleranceof segmentation errors in the output, and the relativelysmall amount of labeled training data available. Traditionalapproaches have relied on a combination of sophisticated geometricalgorithms, domain knowledge, heuristics, and carefullytuned parameters. This paper describes the use of deep neuralnetworks, in particular a combination of convolutional andmultidimensional LSTM networks, for document image anddemonstrates that relatively simple networks are capable of fast,reliable text line segmentation and document layout analysiseven on complex and noisy inputs, without manual parametertuning or heuristics. The method is easily adaptable to newdatasets by retraining and an open source implementation isavailable
Robust_ Simple Page Segmentation Using Hybrid Convolutional MDLSTM Networks.pdf