Closed: kethan1 closed this issue 4 years ago
Hi @kethan1 and thank you for reporting the issue. Can you share the Tesseract-OCR installer link that you used?
Also, you should use the full path to the tesseract executable, for example:
pytesseract.pytesseract.tesseract_cmd = r'C:/Users/ketha/AppData/Local/Tesseract-OCR/tesseract'
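If it helps, here is a minimal sketch of how you could check that the path really points at a binary before handing it to pytesseract. The helper name `find_tesseract` is hypothetical (it is not part of pytesseract), and the fallback to `PATH` is only an assumption about what you might want; the pytesseract assignment is left as a comment so the snippet runs with the standard library alone.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_tesseract(install_dir: str) -> Optional[str]:
    """Return the full path to the tesseract binary, or None if not found.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    base = Path(install_dir)
    # Windows installers ship tesseract.exe; other platforms use plain "tesseract".
    for name in ("tesseract.exe", "tesseract"):
        candidate = base / name
        if candidate.is_file():
            return str(candidate)
    # Fall back to whatever is on PATH, if anything.
    return shutil.which("tesseract")


# Possible usage (requires pytesseract to be installed):
# import pytesseract
# cmd = find_tesseract(r"C:/Users/ketha/AppData/Local/Tesseract-OCR")
# if cmd is None:
#     raise FileNotFoundError("tesseract binary not found")
# pytesseract.pytesseract.tesseract_cmd = cmd
```

Failing early like this makes it easier to tell a wrong path apart from an error raised by Tesseract itself.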
Hi, I have changed the path, and now I am getting a different error on this line: print(pytesseract.image_to_string(Image.open('test-european.jpg'), lang='fra'))
The full installer link is: https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w32-setup-v5.0.0-alpha.20200328.exe
The error I am getting is this:
Traceback (most recent call last):
  File "C:\Users\ketha\Downloads\pytesseract-master\tests\data\test.py", line 17, in <module>
    print(pytesseract.image_to_string(Image.open('test-european.jpg'), lang='fra'))
  File "C:\Users\ketha\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py", line 356, in image_to_string
    return {
  File "C:\Users\ketha\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py", line 359, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "C:\Users\ketha\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py", line 270, in run_and_get_output
    run_tesseract(**kwargs)
  File "C:\Users\ketha\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py", line 246, in run_tesseract
    raise TesseractError(proc.returncode, get_errors(error_string))
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Users\\ketha\\AppData\\Local\\Tesseract-OCR/tessdata/fra.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'fra\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
This one is not a pytesseract-related problem. It seems that you didn't install the French tessdata language files together with Tesseract itself. By default, the installer that you linked above installs only the English language data.
Can you give me the installer link for all the languages for Windows 10?
The installer that you used has an option that lists the additional languages to install, so you can just check the boxes for the languages that you need.
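To confirm which language files are actually present after reinstalling, something like the sketch below could help. It only looks for `<lang>.traineddata` files in a tessdata directory, which is the file the error message above complains about. The function name `missing_languages` is hypothetical, and the directory you pass in is an assumption about where your install put its data.

```python
from pathlib import Path
from typing import List


def missing_languages(tessdata_dir: str, lang: str) -> List[str]:
    """List the requested language codes that have no .traineddata file.

    `lang` uses the same form as the `lang` argument, e.g. "fra" or "eng+fra".
    Hypothetical helper for illustration; not part of pytesseract.
    """
    data_dir = Path(tessdata_dir)
    # Tesseract accepts several languages joined with "+".
    requested = [code for code in lang.split("+") if code]
    return [
        code
        for code in requested
        if not (data_dir / f"{code}.traineddata").is_file()
    ]


# Possible usage, assuming the tessdata folder sits inside the install directory:
# print(missing_languages(r"C:/Users/ketha/AppData/Local/Tesseract-OCR/tessdata", "fra"))
```

An empty list means every requested language file was found in that folder; anything else names the files that still need to be installed.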
Oh, okay, thank you so much for your help. Thanks for building the module!!
Hi, I am trying to run the sample code provided. I have installed tesseract from google. It is not on my PATH, so I specified it in pytesseract.pytesseract.tesseract_cmd.
OS: Windows 10
Python: Python 3.8.3
Tesseract Installation Location: C:\Users\ketha\AppData\Local\Tesseract-OCR
Error:
Code: