nguyenq / tess4j

Java JNA wrapper for Tesseract OCR API
Apache License 2.0

Tesseract 5 Support #220

Closed by mithunatri 2 years ago

mithunatri commented 2 years ago

Hi,

Are there any plans for Tess4J to support Tesseract 5? I currently get the following error when starting my application against Tesseract 5:


Caused by: java.lang.UnsatisfiedLinkError: Error looking up function 'TessBaseAPIRecognizeForChopTest': /usr/lib/x86_64-linux-gnu/libtesseract.so.5.0.0: undefined symbol: TessBaseAPIRecognizeForChopTest
    at com.sun.jna.Function.<init>(Function.java:252)
    at com.sun.jna.NativeLibrary.getFunction(NativeLibrary.java:600)
    at com.sun.jna.NativeLibrary.getFunction(NativeLibrary.java:576)
    at com.sun.jna.NativeLibrary.getFunction(NativeLibrary.java:562)
    at com.sun.jna.Native.register(Native.java:1852)
    at com.sun.jna.Native.register(Native.java:1723)
    at com.sun.jna.Native.register(Native.java:1443)
    at net.sourceforge.tess4j.TessAPI1.<clinit>(TessAPI1.java:41)

nguyenq commented 2 years ago

Yes, it will be released shortly after the Tesseract 5.0 release.

The master branch has code compatible with Tesseract 5.0. There's a 5.0.0-SNAPSHOT available as well.
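
For anyone hitting a similar `UnsatisfiedLinkError`, the message itself names both the missing function and the native library that was actually loaded, which makes it easy to see when the wrapper and the installed library are out of step. Below is a minimal sketch, assuming the Linux message format quoted earlier in this thread (other platforms may word it differently). `LinkErrorParser` is a hypothetical helper, not part of Tess4J or JNA.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Tess4J or JNA. */
final class LinkErrorParser {
    // Assumed message shape (taken from the stack trace in this thread):
    //   Error looking up function 'NAME': /path/to/lib.so: undefined symbol: NAME
    private static final Pattern LOOKUP_ERROR = Pattern.compile(
            "Error looking up function '([^']+)':\\s*(\\S+?):\\s*undefined symbol");

    /** Returns {functionName, libraryPath}, or empty if the message has another shape. */
    static Optional<String[]> parse(String message) {
        if (message == null) {
            return Optional.empty();
        }
        Matcher m = LOOKUP_ERROR.matcher(message);
        return m.find()
                ? Optional.of(new String[] {m.group(1), m.group(2)})
                : Optional.empty();
    }

    public static void main(String[] args) {
        String msg = "Error looking up function 'TessBaseAPIRecognizeForChopTest': "
                + "/usr/lib/x86_64-linux-gnu/libtesseract.so.5.0.0: "
                + "undefined symbol: TessBaseAPIRecognizeForChopTest";
        parse(msg).ifPresent(r ->
                System.out.println("Missing function " + r[0] + " in " + r[1]));
    }
}
```

In an application, the same check could be applied to `e.getMessage()` inside a `catch (UnsatisfiedLinkError e)` block around the first use of the Tess4J API.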

mithunatri commented 2 years ago

Thank you! I was able to build it and run samples against Tesseract 5.0.

However, while building I get the following errors in the test phase (I have ignored these errors since I don't make use of those features):

Tests in error: 
  testTessBaseAPIGetUTF8Text_Pix(net.sourceforge.tess4j.TessAPI1Test): Error looking up function 'pixConvertRGBToGrayGeneral': /lib/x86_64-linux-gnu/liblept.so.5: undefined symbol: pixConvertRGBToGrayGeneral
  testTessBaseAPIGetComponentImages(net.sourceforge.tess4j.TessAPI1Test): Could not initialize class net.sourceforge.lept4j.Leptonica1
  testTessBaseAPIAnalyseLayout(net.sourceforge.tess4j.TessAPI1Test): Could not initialize class net.sourceforge.lept4j.Leptonica1
  testTessBaseAPIDetectOrientationScript(net.sourceforge.tess4j.TessAPI1Test): Could not initialize class net.sourceforge.lept4j.Leptonica1
  testGetSegmentedRegions(net.sourceforge.tess4j.Tesseract1Test): Could not initialize class net.sourceforge.lept4j.Leptonica1
  testCreateDocumentsWithResults1(net.sourceforge.tess4j.Tesseract1Test): Could not initialize class net.sourceforge.lept4j.Leptonica1
  testCreateDocumentsWithResults1(net.sourceforge.tess4j.TesseractTest): Could not initialize class net.sourceforge.lept4j.Leptonica1
  testTessBaseAPIGetComponentImages(net.sourceforge.tess4j.TessAPITest): Could not initialize class net.sourceforge.lept4j.Leptonica1

The tesseract and leptonica versions are:

tesseract 5.0.0-beta-20210916-12-g19cc9
 leptonica-1.79.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found OpenMP 201511
 Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4

I'm running it on Ubuntu 20.04 with Java 11.

nguyenq commented 2 years ago

Make sure you use the lept4j version compatible with your Leptonica version.

https://github.com/nguyenq/lept4j/releases
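
To pick a matching lept4j release, it helps to know which Leptonica version is installed. Below is a minimal sketch, assuming the `tesseract --version` output format pasted earlier in this thread (a line containing `leptonica-<version>`). `LeptonicaVersion` is a hypothetical helper and does not look up lept4j compatibility itself. That still has to be checked against the release notes at the link above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Tess4J or lept4j. */
final class LeptonicaVersion {
    // Assumes the version appears as "leptonica-1.79.0", as in the output above.
    private static final Pattern LEPTONICA =
            Pattern.compile("leptonica-(\\d+(?:\\.\\d+)*)");

    /** Extracts the Leptonica version from `tesseract --version` output, if present. */
    static Optional<String> fromVersionOutput(String output) {
        if (output == null) {
            return Optional.empty();
        }
        Matcher m = LEPTONICA.matcher(output);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample text copied from the version output in this thread.
        String sample = "tesseract 5.0.0-beta-20210916-12-g19cc9\n"
                + " leptonica-1.79.0\n";
        System.out.println(fromVersionOutput(sample).orElse("unknown"));
    }
}
```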

mithunatri commented 2 years ago

That did it. Thank you so much!

Closing this issue.