j0rdanit0 closed this issue 6 years ago
Does the BasicExample still work well?
Same error message. I tweaked the paths to fit my environment:
BytePointer outText;
tesseract.TessBaseAPI api = new tesseract.TessBaseAPI();
// Initialize tesseract-ocr with English, using an explicit tessdata path
if (api.Init("C:\\dev\\swgoh-service\\tessdata", "eng") != 0) {
System.err.println("Could not initialize tesseract.");
System.exit(1);
}
// Open input image with leptonica library
lept.PIX image = pixRead( args.length > 0 ? args[0] : "C:\\Users\\JordanS\\Desktop\\ImageMatchingTest\\munky.png");
api.SetImage(image);
// Get OCR result
outText = api.GetUTF8Text();
System.out.println("OCR output:\n" + outText.getString());
// Destroy used object and release memory
api.End();
outText.deallocate();
pixDestroy(image);
Log file contents:
Connected to the target VM, address: '127.0.0.1:51990', transport: 'socket'
read_params_file: parameter not found: enable_new_segsearch
Disconnected from the target VM, address: '127.0.0.1:51990', transport: 'socket'
Process finished with exit code 1
UPDATE:
I started thinking about your comment in my other ticket. It's true that I downloaded all of my .traineddata files from the up-to-date list, except for eng. I remembered that that one in particular came with my initial download of tesseract, not from the list. I updated it and the basic example works. Now, I'm getting this error message in my website code:
Error setting param load_fixed_length_dawgs
Looks like "load_fixed_length_dawgs" got removed, so you can remove it from your application as well: https://github.com/tesseract-ocr/tesseract/commit/18c8f8833f8f5a771c84ed4aba0ba3150964583d
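If it helps to find out which other parameters the current build no longer accepts, here is a minimal sketch of one way to do it. It is not from the tesseract or JavaCPP docs: `TessParams` and `applyParams` are hypothetical names, and the setter is passed in as a `BiPredicate` so the example runs without the native library. With the real API you would presumably pass `api::SetVariable`, assuming it returns false for a parameter it does not recognize.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical helper, not part of tesseract or the JavaCPP presets.
class TessParams {

    /**
     * Passes each name/value pair to the given setter and returns the names
     * for which the setter returned false, in insertion order.
     */
    public static List<String> applyParams(Map<String, String> params,
                                           BiPredicate<String, String> setter) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (!setter.test(entry.getKey(), entry.getValue())) {
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        // Example values only; they are assumptions, not taken from the thread.
        params.put("tessedit_char_whitelist", "0123456789");
        params.put("load_fixed_length_dawgs", "F");

        // Stand-in setter for the demo: rejects the one name reported as
        // removed in this thread. Replace with api::SetVariable in real code.
        BiPredicate<String, String> fakeSetter =
                (name, value) -> !name.equals("load_fixed_length_dawgs");

        System.out.println("Rejected: " + applyParams(params, fakeSetter));
    }
}
```

Logging the rejected names once at startup makes it easier to spot a stale parameter after a version bump than waiting for the error to show up in the output.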
Awesome, that fixed it. As always, thanks for your help. One last thing: I'm getting this warning over and over:
Warning. Invalid resolution 0 dpi. Using 70 instead.
Even with the warning, I'm getting results that are working pretty well, but maybe it would work better if I were to fix the issue? I'm not sure what the warning is implying. I am feeding tesseract a BufferedImage, which represents a cropped portion of an image that I receive via HTTP in the form of a MultipartFile object. I'm not explicitly removing any DPI settings. Do you know why they would be missing? Also, I added code from this Stack Overflow article to add DPI settings to the BufferedImage, but that didn't change the result. Do you have any suggestions or insight to alleviate this?
It wants DPI settings in the PIX image. JavaCV isn't converting it, but we can easily enough set it, I'm guessing with this: http://bytedeco.org/javacpp-presets/leptonica/apidocs/org/bytedeco/javacpp/lept.html#pixSetResolution-org.bytedeco.javacpp.lept.PIX-int-int-
Excellent, that was it. Thanks so much!
New error when performing Maven install:
Failed to execute goal on project swgoh-service: Could not resolve dependencies for project com.jordan:swgoh-service:jar:3.0.2: Failure to find org.bytedeco.javacpp-presets:leptonica:jar:windows-x86:1.76.0-1.4.2-20180611.143021-191 in http://repository.jboss.org/nexus/content/groups/public/ was cached in the local repository, resolution will not be reattempted until the update interval of jboss-public-repository-group has elapsed or updates are forced
<dependency>
<groupId>org.bytedeco.javacpp-presets</groupId>
<artifactId>tesseract-platform</artifactId>
<version>4.0.0-beta.3-1.4.2-SNAPSHOT</version>
</dependency>
Make sure to use "mvn -U ..." to update your cache.
*facepalm* Yeah, of course. Thanks, we're all good here lol
And 1.4.2 has now been released, so make sure to use 4.0.0-beta.3-1.4.2!
After switching to the new beta.3 version of tesseract, I am seeing some issues that were not happening when I was using beta.1.
Windows: calling the Init() method does not return 0. I'm not sure what is wrong since there is no error message.
Ubuntu: calling the Init() method causes the Java process to be killed, with this error in the logs:
pom.xml:
helper method to initialize the TessBaseAPI:
I'm also not exactly sure how all these parameters work - I've got some being defined for the Init() method, via StringGenericVector objects, and I've got others being defined after the Init() method via the SetVariable() method. None of them include the parameter that is listed in the error message:
enable_new_segsearch