Detection rate of 3.4.x inferior compared to previous version 3.3.1

manisandro / gImageReader

A Gtk/Qt front-end to tesseract-ocr.

GNU General Public License v3.0


Detection rate of 3.4.x inferior compared to previous version 3.3.1 #669

Closed by Radulfur 1 month ago

Radulfur commented 3 months ago

I'm using gImageReader (precompiled Windows version) to recognize Icelandic and German text in PDF documents. After upgrading from 3.3.1 to 3.4.0/2, detection of Icelandic special characters is far worse than in the previous version. I assume this is most likely due to the underlying tesseract-ocr engine. Would it be possible to swap out just the tesseract-ocr engine without losing the improved user interface of the 3.4.x versions?

manisandro commented 3 months ago

Yes, but you will most likely need to recompile the application against the older libtesseract.