Sicos1977 / TesseractOCR

A .NET library to work with Google's Tesseract

Upgrade to Tesseract v5.3.0 and Leptonica 1.83.0 #28

Closed: vsolominov closed this issue 1 year ago

vsolominov commented 1 year ago

Tesseract - https://github.com/tesseract-ocr/tesseract/releases/tag/5.3.0
Leptonica - https://github.com/DanBloomberg/leptonica/releases/tag/1.83.0

Sicos1977 commented 1 year ago

I'll try to make a new NuGet package today.

Sicos1977 commented 1 year ago

A new package is available on NuGet.

vsolominov commented 1 year ago

Thanks a lot!

But there are several problems (OS Windows):

  1. The application now fails with an error that the library could not be found (Marshal.GetLastWin32Error() returns 126). Cleaning and rebuilding the solution did not help. If I put leptonica-1.82.0.dll in the x64 or x86 folder, the error disappears.
  2. The project files still copy the previous libraries to the output directory and have no entries for the new libraries. I attached a patch covering what I noticed: TesseractOCR-5.3.patch
  3. Tests don't pass.

Sicos1977 commented 1 year ago

I'll look into it today.

Sicos1977 commented 1 year ago

It is failing because Tesseract is still using Leptonica 1.82.

Sicos1977 commented 1 year ago

I made a new NuGet package with Leptonica 1.82. I also built a Tesseract 5.3 version with Leptonica 1.83, but for some reason all tests with TIFF files then fail, so I decided not to use that one.