Closed wanghaisheng closed 7 years ago
I haven't checked Tesseract 4 yet. Are there any notable differences from Tesseract 3 from a usage point of view? oO
AFAIK it has 2+ years of additional development and I hope this results in better OCR results.
I have opened a pull request that at least allows it to run without throwing an exception: https://github.com/openpaperwork/pyocr/pull/66. It seems to be working fine with paperless from what I can tell.
Hi @jflesch, I don't think this issue should be closed yet, no? #66 lets pyocr run with Tesseract 4.0, but I did not check whether there were any compatibility issues.
Yep, I just tested, and with all the languages installed + Tesseract 4, the libtesseract support segfaults.
Never mind. I was still working with libtesseract 3. I'm adding support for libtesseract 4.
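For anyone hitting the same confusion about which Tesseract is actually in use, here is a minimal sketch for checking the installed version from Python. It is not part of pyocr: the helper names `parse_tesseract_version` and `installed_tesseract_version` are made up for illustration, and it assumes the `tesseract` binary is on the PATH and prints a banner starting with `tesseract X.Y`. It only covers the command-line binary, not the libtesseract shared library, which can be a different version.

```python
import re
import subprocess


def parse_tesseract_version(banner):
    """Return (major, minor) from a 'tesseract X.Y...' banner, or None."""
    # Some builds print 'tesseract v4.0.0...', so the 'v' is optional.
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_tesseract_version(binary="tesseract"):
    """Run '<binary> --version' and parse the result; None if unavailable."""
    try:
        proc = subprocess.run(
            [binary, "--version"],
            stdout=subprocess.PIPE,
            # Merge stderr into stdout, since the banner may go to either.
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError:
        # Binary not found or not executable.
        return None
    return parse_tesseract_version(proc.stdout)


if __name__ == "__main__":
    print(installed_tesseract_version())
```

Comparing the major number against 4 before testing would show right away whether an old 3.x install is still the one being picked up.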
291624d464e56048ac77e41312fc0bc3265bdb31
Included in Pyocr 0.4.7
Thumbs up @jflesch! That was very quick indeed. :)
I have to correct myself though:
"AFAIK it has 2+ years of additional development and I hope this results in better OCR results."
Only the version in the package manager on my oldish Ubuntu is years old. The last release was only in February [0], so it may not be that big a difference. Still, it is good to be able to just compile from the master branch and use Tesseract with pyocr.
[0] "The latest stable version is 3.05.00, released in February 2017."
Is there any plan to support Tesseract 4.0 alpha?