What steps will reproduce the problem?
Set a character whitelist for Tesseract using code like the following:
ocr = tesseract.TessBaseAPI()
ocr.SetVariable("tessedit_char_whitelist", "0123456789;")
ocr.SetPageSegMode(tesseract.PSM_AUTO)
ocr.Init("C:\\Program Files (x86)\\Tesseract-OCR\\","eng",tesseract.OEM_DEFAULT)
What is the expected output? What do you see instead?
Expected: only the characters "0123456789;" are recognized when calling
the image_to_string() function. Instead, with code like the above,
image_to_string() ignores the whitelist and returns whatever characters it finds.
What version of the product are you using? On what operating system?
pytesseract-0.1, Python 2.7, Windows 8.1
Please provide any additional information below.
I have tried the approaches people commonly use with Tesseract-OCR, but none
of them work with pytesseract. I haven't been able to find any way to apply a
whitelist with the image_to_string() function, which would be
immensely helpful in improving the accuracy of the function.
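
For reference, a minimal sketch of one approach that may apply here. pytesseract runs the tesseract executable rather than a TessBaseAPI object, so variables set on a separate TessBaseAPI instance would not be expected to reach it. The sketch assumes the installed pytesseract accepts a config argument in image_to_string() and that the installed Tesseract supports the -c command-line option and honors tessedit_char_whitelist. The helper names whitelist_config and digits_only_ocr are hypothetical and not part of any library.

```python
def whitelist_config(chars):
    """Build a Tesseract command-line config string that restricts
    recognition to the given characters.

    Hypothetical helper. The characters must not include whitespace,
    because the config string is split into separate arguments.
    """
    return "-c tessedit_char_whitelist=%s" % chars


def digits_only_ocr(image_path):
    """Run OCR on an image file, asking Tesseract for digits and ';' only.

    Hypothetical helper. Imports are done here so that whitelist_config()
    can be used without pytesseract or PIL installed.
    """
    import pytesseract
    from PIL import Image

    image = Image.open(image_path)
    # Assumption: this pytesseract version forwards `config` to the
    # tesseract command line.
    return pytesseract.image_to_string(
        image, config=whitelist_config("0123456789;"))
```

Calling digits_only_ocr("sample.png") (the file name is a placeholder) would then pass the whitelist on each invocation. Whether the whitelist is actually applied depends on the Tesseract version and engine in use.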
Thanks in advance for any help on the matter.
Original issue reported on code.google.com by darke...@yahoo.com on 9 Jun 2015 at 6:58