Closed Skwol closed 5 years ago
A side question: is there a way I can use a previous version without messing with the m2 dir? In other words, is there a way to specify a particular build in Maven?
Can confirm that OCR is not working at all in the current snapshot on Windows (not tested on other OSes yet). Will have a look at it.
The bug was introduced with commit a25c467d8d68a7525815cdffcfa929ce2e296b20, where Tess4J got updated from 3.5.2 to 3.5.3. The changelog (http://tess4j.sourceforge.net/changelog.html) says that they fixed a compatibility issue with JDK9's ByteBuffer.flip() method. Might that fix cause another bug in Java 12?
Setting it back to 3.5.2 seems to fix the issue.
BUT: Upgrading the version to 4.4.0 also seems to work flawlessly. This issue is probably a good opportunity to upgrade to the latest Tess4J version :-)
Thanks for a quick response. Totally forgot to mention that I'm using OS X, though it feels like it doesn't matter in this case.
@Skwol Thanks @balmma for finding the possible regression point. I am testing on Mac, where Region.text() also returns nothing with the latest build (Java 8 and Java 12). I will try with Tess4J 3.5.2 and, if that works, go back to it as a first step. Then I will try with Tess4J 4 (the challenge here is the native libs for Mac and Linux, which might also have to be revised).
@RaiMan Did some digging in the Tess4J source code and found the offending change (https://github.com/nguyenq/tess4j/commit/ba1d5fd3d62a44d6c4949f47c1f991c9f7143aa7). The new if statement does not really make sense IMHO. For color images we get a DataBufferInt, for grey ones we get a DataBufferByte. For DataBufferInt we have to get the pixel size, for DataBufferByte we can set this to 8.
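To make the distinction above concrete, here is a minimal sketch using only the JDK. It is not the Tess4J code; the class and method names (`PixelSizeSketch`, `bitsPerPixel`) are made up for illustration, and it assumes, as described above, that a DataBufferByte only ever comes from a greyscale image. A byte-backed color image such as TYPE_3BYTE_BGR would break that assumption.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBuffer;
import java.awt.image.DataBufferByte;
import java.awt.image.DataBufferInt;

// Hypothetical helper, not part of Tess4J or SikuliX.
final class PixelSizeSketch {

    // Returns the bits per pixel following the rule described in the comment above.
    static int bitsPerPixel(BufferedImage image) {
        DataBuffer buffer = image.getRaster().getDataBuffer();
        if (buffer instanceof DataBufferInt) {
            // Int-backed (color) image: ask the color model for the pixel size.
            return image.getColorModel().getPixelSize();
        }
        if (buffer instanceof DataBufferByte) {
            // Assumption: byte-backed means 8-bit greyscale.
            return 8;
        }
        throw new IllegalArgumentException(
                "Unsupported data buffer: " + buffer.getClass().getName());
    }

    public static void main(String[] args) {
        BufferedImage color = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        BufferedImage grey = new BufferedImage(4, 4, BufferedImage.TYPE_BYTE_GRAY);
        System.out.println("color: " + bitsPerPixel(color));
        System.out.println("grey: " + bitsPerPixel(grey));
    }
}
```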
The best option on our side would be to convert the image to greyscale before passing it to Tess4J. Will create a PR for this shortly.
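A greyscale conversion along those lines could look roughly like the sketch below. It is not the code from the PR; `GreyscaleSketch` and `toGreyscale` are hypothetical names, and it uses only standard JDK imaging classes. The result is backed by a DataBufferByte, which is the 8-bit case mentioned above.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper, not the actual SikuliX fix.
final class GreyscaleSketch {

    // Draws the source image onto a new 8-bit greyscale image of the same size.
    static BufferedImage toGreyscale(BufferedImage source) {
        BufferedImage grey = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = grey.createGraphics();
        try {
            // The color conversion happens while drawing.
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return grey;
    }

    public static void main(String[] args) {
        BufferedImage color = new BufferedImage(10, 5, BufferedImage.TYPE_INT_RGB);
        BufferedImage grey = toGreyscale(color);
        System.out.println(grey.getRaster().getDataBuffer().getClass().getSimpleName());
    }
}
```

The converted image would then be handed to the OCR call in place of the original capture.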
And I have to figure this out with the Tess4J guys.
Tested with Tess4J 3.5.2 - works. A new build and a new snapshot with the fix (going back to 3.5.2) are now available.
PR still relevant?
IMO not needed at the moment. I will now try Tess4J 4.
Even better :-)
I have used 1.1.4-SNAPSHOT for several months. OCR worked smoothly in the IDE and in a Java project. On the 15th of September I updated the Java project using Maven.
The new version seems to be 1.1.4-SNAPSHOT 20190913.083811. After that, any OCR usage returns an empty string. Simple examples I'm trying:
I've downloaded a fresh IDE version from the website (build#: 380 2019-09-13_08:35) and tried:
Every time it's just an empty string. Am I missing something? Or has something changed, so that I need to find some other way to use it? Build #: 299 (2019-08-08_14:05) seems to work fine with the same code.