Closed: henrivain closed this issue 4 months ago
Follow progress in the add-recognizion-iterator-functionality branch.
The branch now runs RunAsync() from TesseractOcrMaui/TesseractTestClass.cs at startup, for easier testing during development.
Issue created from user request
Is it possible to get an array of lines rather than the whole text as a single string?
Tesseract returns a hierarchical structure for OCR results (as Azure and similar services do):
Page
Block
Paragraph
Line
Word
The most efficient way would be a new function returning an array of Pages,
each page would have an array of Blocks, each block an array of paragraphs and so on.
With this, we could do a real analysis of the content and extract specific identified values,
instead of getting only a "big string" where finding a word together with its "meaning" is not really possible.
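To make the request concrete, the hierarchy could be modelled roughly as below. This is only a sketch: the type names (OcrPage, OcrBlock, OcrParagraph, OcrLine, OcrWord) and the GetLines helper are hypothetical and are not part of TesseractOcrMaui or the Tesseract API. It only shows the nesting and how an array of lines could be derived from it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; none of these exist in TesseractOcrMaui.
public sealed record OcrWord(string Text);

public sealed record OcrLine(IReadOnlyList<OcrWord> Words)
{
    // Rebuild the line text by joining its words with single spaces.
    public string Text => string.Join(" ", Words.Select(w => w.Text));
}

public sealed record OcrParagraph(IReadOnlyList<OcrLine> Lines);
public sealed record OcrBlock(IReadOnlyList<OcrParagraph> Paragraphs);
public sealed record OcrPage(IReadOnlyList<OcrBlock> Blocks);

public static class OcrHierarchy
{
    // Flatten Page -> Block -> Paragraph -> Line into line strings,
    // preserving the order in which the elements are stored.
    public static string[] GetLines(IEnumerable<OcrPage> pages) =>
        pages.SelectMany(page => page.Blocks)
             .SelectMany(block => block.Paragraphs)
             .SelectMany(paragraph => paragraph.Lines)
             .Select(line => line.Text)
             .ToArray();
}
```

With a structure like this, a caller could still get the "array of lines" from the original question by calling OcrHierarchy.GetLines(pages), while keeping access to individual words for further analysis.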
Uploaded a NuGet package for the iOS DLL imports: https://www.nuget.org/packages/TesseractOcrMaui.IOS/1.1.0
Result iterator
The Tesseract result iterator gives more control over the recognition output for an image.
Output
Result iterator gives access to different recognition block sizes that are
Implementation order
Page iterator
The Tesseract page iterator gives access to the location of text in the image. This is a secondary milestone.