Closed by funkysoul 9 years ago
I believe what you need is the hOCR output rather than plain text. It returns XML (XHTML markup) that carries layout information, such as the bounding box of each recognized element. OpenOCR only recently added support for working correctly with Tesseract's hOCR output format.
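As a rough illustration of what you can do with that output, here is a minimal Python sketch that pulls per-word positions out of an hOCR string. It assumes the markup follows Tesseract's usual convention of `ocrx_word` spans with a `title` attribute containing `bbox x0 y0 x1 y1`. The `HocrWordParser` class, the `parse_hocr_words` helper, and the `SAMPLE` snippet are made up for this example and are not part of OpenOCR.

```python
import re
from html.parser import HTMLParser

# hOCR stores coordinates in the title attribute, e.g. "bbox 10 20 80 45; x_wconf 95".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Hypothetical helper: collects text and bounding box of each ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumes word spans do not contain nested <span> elements.
        if tag == "span" and self._bbox is not None:
            x0, y0, x1, y1 = self._bbox
            self.words.append({
                "text": "".join(self._text).strip(),
                "x": x0,
                "y": y0,
                "width": x1 - x0,
                "height": y1 - y0,
            })
            self._bbox = None


def parse_hocr_words(hocr):
    """Return a list of dicts with text, x, y, width, height for each word."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Made-up sample resembling one receipt line: a left-aligned label and a right-aligned amount.
SAMPLE = """<span class='ocr_line' title='bbox 10 20 300 45'>
<span class='ocrx_word' title='bbox 10 20 80 45; x_wconf 95'>TOTAL</span>
<span class='ocrx_word' title='bbox 240 20 300 45; x_wconf 91'>4.50</span>
</span>"""

if __name__ == "__main__":
    for word in parse_hocr_words(SAMPLE):
        print(word)
```

The coordinates are in pixels of the scanned image, so comparing each word's `x` against the page width is one way to tell left-aligned, centered, and right-aligned text apart on a receipt.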
I'm wondering whether it's possible to get the x/y/width/height of the scanned text. Take a physical receipt (the kind you get in a convenience store): it normally has some text in the middle, plus some text that is left- and right-aligned. Would it be possible to recognize the relative position of every word, line, or character? Great project, works like a charm!! :-)