Larbo53 closed this 4 months ago
This seems to be a Tesseract issue, not a pytesseract one.
Please verify that Tesseract can find any language files by running tesseract --list-langs from the terminal. If this does not yield any languages, please install the language files the same way you installed Tesseract (using the same source usually ensures that the hard-coded data directory is valid) or download them manually and use the TESSDATA_PREFIX environment variable to point to them.
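The same check can be run from Python. The following is a minimal sketch, not part of pytesseract: the helper names parse_list_langs and list_tesseract_languages are made up for illustration, and the parsing assumes that every language name is a single token without spaces while the header line (whose wording differs between Tesseract versions) is not.

```python
import shutil
import subprocess


def parse_list_langs(output: str) -> list[str]:
    """Extract language names from the text printed by `tesseract --list-langs`."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Assumption: language names contain no spaces, so this skips blank
        # lines, the "List of available languages ..." header, and most
        # diagnostic messages.
        if not line or " " in line:
            continue
        langs.append(line)
    return langs


def list_tesseract_languages() -> list[str]:
    """Run `tesseract --list-langs`; return [] if the binary is not on PATH."""
    if shutil.which("tesseract") is None:
        return []
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Check both streams, since the list may not always arrive on stdout.
    return parse_list_langs(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    print(list_tesseract_languages())
```

An empty list here means either that the tesseract binary is not on the PATH or that it found no language data, which matches the two cases described above.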
I've just reinstalled tesseract with the pip command and the problem persists. How do I find and install the language file, and in which directory should it be stored? I'm using Python 3.9 and macOS Monterey v 12.75. Thanks for your help. Sincerely
pytesseract is just a wrapper around Tesseract, which needs to be installed separately. Please refer to the Tesseract project for further installation instructions: https://github.com/tesseract-ocr/tesseract?tab=readme-ov-file#installing-tesseract
I just found the language files in 'usr/local/bin/tesseract-lang/4.1.0/share/tessdata'. In which file do I have to enter this path for it to work properly? Thanks for your help. Sincerely
In the best case, your Tesseract installation already picks this up. Otherwise, you have to set the environment variable TESSDATA_PREFIX accordingly, either in your global environment or inside your Python script with os.environ["TESSDATA_PREFIX"] = ...
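Setting the variable from a script could look roughly like the sketch below. The helper set_tessdata_prefix is hypothetical, not a pytesseract function. It assumes Tesseract 4 or later, where TESSDATA_PREFIX is expected to name the directory that directly contains the .traineddata files. It also converts the path to an absolute one, because a path without a leading slash is resolved against the current working directory.

```python
import os
from pathlib import Path


def set_tessdata_prefix(tessdata_dir: str) -> list[str]:
    """Point TESSDATA_PREFIX at tessdata_dir and return the languages found there.

    Raises FileNotFoundError if the directory is missing or holds no
    *.traineddata files, so a wrong path fails early instead of inside Tesseract.
    """
    path = Path(tessdata_dir).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"Not a directory: {path}")
    # Each language is stored as <name>.traineddata, e.g. eng.traineddata.
    langs = sorted(p.stem for p in path.glob("*.traineddata"))
    if not langs:
        raise FileNotFoundError(f"No .traineddata files in {path}")
    # Processes started afterwards (such as the tesseract binary that
    # pytesseract launches) inherit this variable.
    os.environ["TESSDATA_PREFIX"] = str(path)
    return langs
```

Call it before any pytesseract function, passing the directory that holds your language files. If it returns a list that includes eng, a later call to pytesseract.get_languages(config='') should no longer come back empty, provided the language files match the installed Tesseract version.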
I've just seen that the Tesseract version is 5.2. Maybe that's where the problem lies. Thank you.
os.environ["TESSDATA_PREFIX"] = 'usr/local/bin/tesseract-lang/4.1.0/share/tessdata/'
Error message: Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.
In this case it seems like the English language data is not available, which AFAIK is always required.
I'll uninstall Tesseract, then reinstall it. Thank you
Hi,
After reinstalling everything, Tesseract is now operational. Thanks a lot for your help. Best regards.
Hi,
The print(pytesseract.get_languages(config='')) command returns an empty list. I use Python 3.9. I deleted pytesseract, restarted my MacBook, then reinstalled pytesseract, but I still have the same problem. How can I install the list of languages? Thanks for your feedback. Sincerely