Closed kumarkrish85 closed 4 years ago
I referred to issue https://github.com/RaiMan/SikuliX1/issues/195 and I am using Tess4J version 3.5.2.
1.1.4 has a new version of Tesseract and might behave quite a bit differently to 1.1.2.
You might want to play around with the page segmentation mode. Use the following code to adjust this:
TextRecognizer tr = TextRecognizer.start();
tr.setPSM(1);
Value 1 or 12 might give you better results.
Thanks @balmma, can you please share the docs?
As you seem to try to detect mainly non dictionary words, deactivating the dictionaries might also help:
tr.setVariable("load_system_dawg", "false");
tr.setVariable("load_freq_dawg", "false");
The following is a great resource if you don't get the expected results from the OCR: https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
Here you can find some info about OCR usage in SikuliX: https://sikulix-2014.readthedocs.io/en/latest/news.html#revision-of-the-text-and-ocr-feature
Thanks @balmma
I did try with
TextRecognizer tr = TextRecognizer.start();
tr.setPSM(1);
and updated Tesseract OCR to 3.05, but the issue still exists. I have tried different PSM values such as 1, 12 and 8.
Below is the code snippet I used:
rx.highlight(1);
TextRecognizer recog = TextRecognizer.start();
recog.setPSM(1);
recog.setVariable("load_system_dawg", "false");
recog.setVariable("load_freq_dawg", "false");
List<Match> matches = rx.collectWords();
Match match = matches.get(0);
String text = match.getText();
Expected words are DemoTestCase, allcommonaction and imgtextextracts, but we got the below output
Hi @RaiMan / @balmma, please advise.
Have you already tried with rx.text()?
Yes, I have tried with rx.text(); please refer to the image in the first comment.
Probably worth a try with the upcoming Tess4J 4.4.0.
@RaiMan Any estimates when the dev branch is going to be merged?
@kumarkrish85 What system are you working on?
@balmma The stuff is basically ready and will work on Windows out of the box. I have to add something to the docs for macOS (Tesseract has to be installed with Homebrew or MacPorts) and for Linux systems (I did not run any tests, but it seems to be much easier to get a working Tesseract 4 than with version 3). So it might be Monday before it is officially online.
@RaiMan
I am working on Windows 10 with 8 GB RAM and the Visual C++ Redistributable Pack (>=2013) installed. For background on how I updated the tessdata: I downloaded it from https://github.com/tesseract-ocr/tessdata/raw/3.04.00/eng.traineddata and copied it into the user directory.
@kumarkrish85 Ok, I can prepare a fat (includes all dependencies) sikulixapi.jar for you, with the Tess4J 4.4.0/Tesseract 4.1.0 complete with eng.traineddata to download from my Dropbox.
Please tell me if this would be suitable for you.
@RaiMan , yes please share the link to download
ok, I will do it asap.
Thanks @RaiMan
on OneDrive: https://1drv.ms/u/s!Ahzz_Daw4EefhUa1PI4i1XrKw4v3?e=uk48eh
Please test with Region.text() first. If you want to play around with any Tesseract settings, please look at the Tesseract 4.1.0 docs.
Please tell me, when you have successfully downloaded, so I can delete the file again.
Feedback about your tests is highly appreciated.
@RaiMan I can't download from my organization's environment. I will try on my laptop and report back.
@kumarkrish85 Sorry for the inconvenience, but currently no other fast option.
@RaiMan no issues :) I have tested with your fat jar and Tesseract, and it worked. However, for a few words like allcommonaction, it returns allcommenactions from the region. I also tried other operations, such as fetching the text to the right and left of the region, and they worked fine. I appreciate your timely help. Thanks.
Do you have ClearType active on your system? Turning it off also helps sometimes.
Sure I will check and revert back. Thanks
Oh, I'm not really a Windows expert. But I would say it must be somewhere in display settings.
Edit: Before an edit the previous comment was:
How and where to check the configuration?
That's why my answer doesn't really make sense :-)
Yes, you definitely have ClearType enabled on your system (it's not your fault, it's enabled by default and usually desirable). You can clearly see it when you open your screenshot in e.g. Gimp and zoom in (this usually does not work in image viewers because they interpolate when zoomed in): You see those orange, yellow and blue artifacts? Those are the rendered subpixels. I'm pretty sure that disabling ClearType will give you much better results.
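For anyone who wants to check a screenshot for those colored fringes without zooming in by hand, here is a minimal sketch. It is a rough heuristic and not part of SikuliX: the class name, method name and tolerance are all made up for illustration, and it only makes sense for regions that should be pure black, white or gray text, since any genuinely colored UI element is counted as well.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: estimates how many pixels carry color fringes. */
final class SubpixelFringeCheck {

    /** Returns the fraction of pixels whose R, G and B values differ by more than the tolerance. */
    static double coloredFraction(BufferedImage img, int tolerance) {
        int colored = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int max = Math.max(r, Math.max(g, b));
                int min = Math.min(r, Math.min(g, b));
                // A gray pixel has (nearly) equal channels; a fringe pixel does not.
                if (max - min > tolerance) {
                    colored++;
                }
            }
        }
        return (double) colored / (img.getWidth() * img.getHeight());
    }

    public static void main(String[] args) {
        // Synthetic 4x1 image: white, black, an orange fringe pixel and a blue fringe pixel.
        BufferedImage img = new BufferedImage(4, 1, BufferedImage.TYPE_INT_RGB);
        img.setRGB(0, 0, 0xFFFFFF);
        img.setRGB(1, 0, 0x000000);
        img.setRGB(2, 0, 0xFFA040);
        img.setRGB(3, 0, 0x4080FF);
        System.out.println("Colored fraction: " + coloredFraction(img, 16));
    }
}
```

A noticeably non-zero fraction on a screenshot of plain dark-on-light text would suggest that subpixel rendering is active.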
@RaiMan I'll try to figure out how we can improve reliability of Tesseract in such cases. Some very basic tests indicate that applying a modest Gaussian Blur with a 0.5 px radius before the resizing dramatically decreases the error rate. But I have to verify this with much more samples :-)
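A minimal sketch of what such a light blur could look like with plain java.awt, for experimenting only. This is not the code used in the tests mentioned above. It assumes the "0.5 px radius" can be approximated by a Gaussian with sigma 0.5 on a 3x3 kernel, and the class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;

/** Hypothetical helper: applies a light Gaussian blur before upscaling for OCR. */
final class LightBlur {

    /** Builds a normalized square Gaussian kernel with side length 2 * radius + 1. */
    static float[] gaussianKernel(int radius, double sigma) {
        int size = 2 * radius + 1;
        float[] data = new float[size * size];
        double sum = 0;
        for (int y = -radius; y <= radius; y++) {
            for (int x = -radius; x <= radius; x++) {
                double w = Math.exp(-(x * x + y * y) / (2 * sigma * sigma));
                data[(y + radius) * size + (x + radius)] = (float) w;
                sum += w;
            }
        }
        // Normalize so the weights add up to 1 and overall brightness is preserved.
        for (int i = 0; i < data.length; i++) {
            data[i] /= sum;
        }
        return data;
    }

    /** Returns a blurred copy of the image; border pixels are copied unchanged. */
    static BufferedImage blur(BufferedImage src, int radius, double sigma) {
        int size = 2 * radius + 1;
        Kernel kernel = new Kernel(size, size, gaussianKernel(radius, sigma));
        ConvolveOp op = new ConvolveOp(kernel, ConvolveOp.EDGE_NO_OP, null);
        return op.filter(src, null);
    }

    public static void main(String[] args) {
        BufferedImage img = new BufferedImage(20, 10, BufferedImage.TYPE_INT_RGB);
        // Assumption: sigma 0.5 on a 3x3 kernel stands in for the "0.5 px radius".
        BufferedImage blurred = blur(img, 1, 0.5);
        System.out.println(blurred.getWidth() + "x" + blurred.getHeight());
    }
}
```

The blurred image would then be resized and handed to the OCR engine as usual.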
Thanks for the detailed solution. When will these changes be pushed to the master branch?
For me it would also be interesting whether or not disabling ClearType helps in your case.
After disabling ClearType, we are able to fetch the text correctly. Refer to the image below.
Can i capture about disabling ClearType in my tool OCR documentation?
Sorry, don't understand the question :-). Can you rephrase please?
I mean :) can I add this configuration step to my tool's documentation? Or are there any plans to address this in code?
Now I got it :-)
We always disable ClearType on all machines we run SikuliX scripts on. It can also cause problems with finding images (using the find() method, of course only if the image contains text) because the sub pixels are somewhat unpredictable and can even change from screenshot to screenshot of the same screen. It's quite hard to tackle this properly in SikuliX.
That said, you can save yourself a lot of trouble by simply disabling it, if that is an option for you.
@RaiMan Probably we should also add this in the SikuliX documentation. We had massive problems with images not being found before we figured this out.
@RaiMan Did some tests with more samples. Blurring helps in some cases but makes it even worse in others. Disabling ClearType at the OS level has by far the greatest positive effect.
@balmma Thanks for the evaluation and comments.
Switch off ClearType: Probably we should also add this in the SikuliX documentation.
agreed. will do so.
Optimise image before giving it to Tesseract
As far as I have understood the Tesseract docs, the only requirement is to hand over an image with a resolution between 300 and 400 dpi (which is done in SikuliX). Please give me the blur code; at least we can add it as an option that can be tried if needed.
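To make the resolution point concrete, here is a rough sketch of upscaling a screenshot so that it corresponds to about 300 dpi. It is not the SikuliX implementation, which is not shown in this thread. The 96 dpi source value is an assumption (a common default for Windows screens), the bicubic interpolation is just one possible choice, and the class and method names are hypothetical.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper: upscales a screenshot towards an OCR-friendly resolution. */
final class OcrUpscale {

    /** Factor by which an image at sourceDpi must grow to correspond to targetDpi. */
    static double scaleFactor(int sourceDpi, int targetDpi) {
        return (double) targetDpi / sourceDpi;
    }

    /** Returns a resized RGB copy of the image using bicubic interpolation. */
    static BufferedImage resize(BufferedImage src, double factor) {
        int w = (int) Math.round(src.getWidth() * factor);
        int h = (int) Math.round(src.getHeight() * factor);
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return dst;
    }

    public static void main(String[] args) {
        // Assumption: the screenshot was taken on a 96 dpi screen.
        double factor = scaleFactor(96, 300);
        BufferedImage shot = new BufferedImage(32, 16, BufferedImage.TYPE_INT_RGB);
        BufferedImage big = resize(shot, factor);
        System.out.println(factor + " -> " + big.getWidth() + "x" + big.getHeight());
    }
}
```

Note that upscaling enlarges any subpixel color fringes along with the text.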
@RaiMan The main problem is that Tesseract is optimized to recognize scanned documents, not text artificially optimized for LCD flat panel monitors :-) What we are doing is just upscaling all those ClearType artifacts, and our screenshots end up something like this: If we want to do something, we have to get rid of those artifacts first, either by disabling ClearType or by some clever preprocessing. I have an idea in mind to achieve this, but need some time to do some more experiments :-)
@balmma understood and agreed.
Side note: I have started a branch dev-opencv-4, where I upgrade to OpenCV 4.1.1. I will check whether it is possible to also get the stuff onto macOS with Homebrew, which would get the jar size down by about 30 MB.
@kumarkrish85 Did you make your final successful tests with the 1.1.4-jar I prepared for you (Tesseract 4) or with the latest available build (Tesseract 3)?
@RaiMan I have used fat jar which you shared (Tesseract 4).
@RaiMan
size down by about 30 MB.
Sounds amazing :-)
@kumarkrish85 @balmma I will close this issue, when I have upgraded the official build to Tesseract 4 (will do asap).
Thanks @RaiMan
Hi @RaiMan, when are the changes planned to be pushed to master? The next patch release of my tool depends on this update.
@kumarkrish85 Sorry for the delay, but I had a hard time over the last few days clarifying the situation on Linux (Ubuntu 18.04). I am through with it now and will trigger a new build sometime tomorrow containing the latest Tesseract 4 and this additional enhancement (#200).
Thanks for the update @RaiMan
please try with latest build #382 from today. Should work.
tr.setVariable("load_system_dawg", "false");
tr.setVariable("load_freq_dawg", "false");
I've just figured out that this doesn't work. Those two can only be used in the init function of Tesseract, which means that they have to be specified in a config (https://github.com/tesseract-ocr/tesseract/wiki/ControlParams).
To get it working you have to perform the following steps:
Place a file called nodict in appdata/SikulixTesseract/tessdata/configs (appdata is ~/.Sikulix on Linux, C:\Users\<user>\AppData\Roaming\Sikulix on Windows) with the following content:
load_system_dawg F
load_freq_dawg F
Use it from Sikulix:
TextRecognizer recog = TextRecognizer.start();
recog.setPSM(12);
recog.setConfigs(Arrays.asList(new String[] {"nodict"}));
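If you prefer to create the config file from code instead of by hand, the following sketch writes it with plain java.nio. It only covers the two appdata locations mentioned above (Linux and Windows) and treats every non-Windows system as Linux. The class and method names are hypothetical, and reading the Windows location from the APPDATA environment variable is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: creates the "nodict" Tesseract config described above. */
final class NoDictConfig {

    static final List<String> LINES = Arrays.asList(
            "load_system_dawg F",
            "load_freq_dawg F");

    /** Resolves the appdata folder: %APPDATA%\Sikulix on Windows, ~/.Sikulix otherwise. */
    static Path appData(String osName, String userHome, String appDataEnv) {
        if (osName.toLowerCase().startsWith("windows")) {
            return Paths.get(appDataEnv, "Sikulix");
        }
        return Paths.get(userHome, ".Sikulix");
    }

    /** Writes the config file below the given appdata folder and returns its path. */
    static Path write(Path appData) throws IOException {
        Path configs = appData.resolve("SikulixTesseract").resolve("tessdata").resolve("configs");
        Files.createDirectories(configs);
        return Files.write(configs.resolve("nodict"), LINES, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path appData = appData(
                System.getProperty("os.name"),
                System.getProperty("user.home"),
                System.getenv("APPDATA"));
        System.out.println("Wrote " + write(appData));
    }
}
```

Run it once before starting the TextRecognizer so that the nodict config can be found.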
@RaiMan Might be worth adding this to the docs.
@RaiMan And probably the digit config from #73 as well.
The OCR docs have to be revised anyways. On my list now.
I am using the Sikuli 1.1.4_SNAPSHOT SikuliX API jar in my tool. Earlier I was using the 1.1.2 jar, and image text was fetched fine with it. With the latest jar, the text is fetched, but some extra characters are added to it.
I am trying to fetch the text (DemoTestCase) in the image below.
I use the code
https://github.com/CognizantQAHub/Cognizant-Intelligent-Test-Scripter/blob/master/Engine/src/main/java/com/cognizant/cognizantits/engine/commands/image/Text.java#L115
find target code
https://github.com/CognizantQAHub/Cognizant-Intelligent-Test-Scripter/blob/master/Engine/src/main/java/com/cognizant/cognizantits/engine/commands/image/ImageCommand.java#L268
I got the below output
Instead of "DemoTestCase", it returns the text with some extra characters added, like EI DemoTestCase. Please share your inputs to resolve the issue.
When I try to fetch the allcommonaction text from the image (refer to the image above), it returns