xwal / Tesseract-OCR-iOS

Tesseract OCR iOS is a framework for iOS 8+, also compiled for armv7s and arm64. I have upgraded the Tesseract version to the 4.1.1 release.
MIT License

Different ocr results with almost same image using lstm only single line mode #2

Open shishaozheng opened 5 years ago

shishaozheng commented 5 years ago

Hi, thank you for updating the iOS lib to the Tesseract 4.0.0 release. I tried it on nearly identical images with the following language and mode (official fast eng tessdata):

let tesseract: G8Tesseract = G8Tesseract(language: "eng", engineMode: G8OCREngineMode.lstmOnly)
tesseract.pageSegmentationMode = .singleLine

I tried three times, but the results were quite different: only the first one is correct, and the other two are garbled.

image

image

image

xwal commented 5 years ago

Hi, try replacing eng.traineddata with the one from https://github.com/tesseract-ocr/tessdata/releases/tag/4.0.0; the traineddata in my repo is an old version.

shishaozheng commented 5 years ago

Thank you for replying. Actually, I am already using the latest official tessdata. I think the problem is related to single-line mode: if I pass the whole page as the input image instead of a single-line image (cropped with a text line detection method I implemented myself) and change the page segmentation mode from single line to block, the result seems good.
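
For reference, the comparison described above could be sketched roughly as follows. This is only a sketch under assumptions: `recognizeText` is a hypothetical helper, `croppedLineImage` and `fullPageImage` are placeholder names for images I would supply, the module name `TesseractOCR` is assumed from the upstream framework, and `.lstmOnly` is the engine mode value from my snippet above.

```swift
import UIKit
import TesseractOCR  // assumed module name of the Tesseract-OCR-iOS framework

/// Hypothetical helper (not part of the library): runs OCR on `image`
/// with the given page segmentation mode and returns the recognized text.
func recognizeText(in image: UIImage, mode: G8PageSegmentationMode) -> String? {
    // Same language and engine mode as in the original report.
    guard let tesseract = G8Tesseract(language: "eng", engineMode: .lstmOnly) else {
        return nil  // initialization failed, e.g. traineddata not found
    }
    tesseract.pageSegmentationMode = mode
    tesseract.image = image
    tesseract.recognize()
    return tesseract.recognizedText
}

// Cropped line image with single-line mode (the case that gives unstable results):
// let lineText = recognizeText(in: croppedLineImage, mode: .singleLine)

// Whole page with block mode (the case that seems to work):
// let pageText = recognizeText(in: fullPageImage, mode: .singleBlock)
```

Running both calls on the same source page makes it easier to see whether the difference comes from the segmentation mode or from the cropping step.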