nguyenq / tess4j

Java JNA wrapper for Tesseract OCR API
Apache License 2.0
1.58k stars 372 forks source link

OCRResult only contains last page's scan #233

Closed lynlevans closed 2 years ago

lynlevans commented 2 years ago

We've encountered a bug when calling createDocumentsWithResults() from Tesseract/tess4j 4.5.5. The Tiff scanned by the method call, has 32 pages, and ~3100 words. Yet, the result produced by the Java call only contains the result of the last page scanned.

The OCRResult, in Java, is an empty string in this bounding box: [ [Confidence: 95.000000 Bounding box: 313 434 938 822]], which is the same result when scanning the last page of the Tiff file.

Can the Tess4j team investigate this bug ?