What steps will reproduce the problem?
1. Follow the bazaar tutorial
2. Test with a simple image and the pattern TEST\A\A\d\d\d
3. The result is not filtered by the pattern
What is the expected output? What do you see instead?
Expected : TESTAB123
See : TESTAB123
TESTABC12
TESTA1234
TEST12345
TESTABCD1
What version of the product are you using? On what operating system?
Tesseract 3
Windows 8
I want to read a specific character sequence with Tesseract which contains the
word "TEST" followed by 2 letters and 3 digits.
I have tried pattern matching with the bazaar config in Tesseract, using the pattern
TEST\A\A\d\d\d
but the OCR still recognizes other words which don't match it.
I have also tried the "tessedit_char_whitelist" parameter, but it does not let me
constrain the position of the characters.
I run the command: tesseract image.jpg result -l eng bazaar
I get no error message, just:
"Tesseract Open Source OCR Engine v3.01 with Leptonica"
The result: TESTAB123 TESTABC12 TESTA1234 TEST12345 TESTABCD1
This is wrong: I only wanted to catch the sequence "TESTAB123".
Can somebody tell me why the pattern in my user-patterns file has no
effect? For the configuration, I have STRICTLY followed the bazaar tutorial.
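To make the expected behaviour concrete, here is a minimal sketch of the filtering I am after, done as post-processing of the OCR output rather than inside Tesseract. It assumes that \A in the user pattern stands for one upper-case letter and \d for one digit, and that the output tokens are separated by whitespace. The names PATTERN and filter_matches are only illustrative.

```python
import re

# Regex equivalent of the user pattern TEST\A\A\d\d\d, under the assumption
# that \A means one upper-case letter and \d means one digit.
PATTERN = re.compile(r"TEST[A-Z]{2}[0-9]{3}")


def filter_matches(ocr_text):
    """Return only the whitespace-separated tokens that match the whole pattern."""
    return [token for token in ocr_text.split() if PATTERN.fullmatch(token)]


if __name__ == "__main__":
    # The text Tesseract actually wrote to the result file in my test.
    raw = "TESTAB123 TESTABC12 TESTA1234 TEST12345 TESTABCD1"
    print(filter_matches(raw))
```

With the output from my test image, this keeps only TESTAB123, which is the result I expected from the user-patterns file.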
Original issue reported on code.google.com by leopold....@gmail.com on 10 Aug 2015 at 7:23