deanmalmgren / textract

extract text from any document. no muss. no fuss.
http://textract.readthedocs.io
MIT License
3.89k stars, 599 forks

Environment Variable is set but still it can't read from the tessdata directory #404

Open raza8899 opened 2 years ago

raza8899 commented 2 years ago

Describe the bug

I am trying to use German for text extraction from PDFs, so I copied 'deu.traineddata' into /opt/homebrew/share/tessdata, but I still receive the error below:

b'Error opening data file /opt/homebrew/share/tessdata/deu.traineddata\nPlease make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.\nFailed loading language \'deu\'\nTesseract couldn\'t load any languages!\nCould not initialize tesseract.\n'

Please let me know if there is a possible solution. Is there something in textract, like the call below, that can be used to set the path explicitly?

tesseract.setDatapath("/usr/share/tessdata/");
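For reference, the setup described above can be sketched in Python as follows. This is a minimal sketch and not a confirmed fix. It assumes that textract runs the tesseract command-line tool as a child process, which would inherit the environment of the Python process, and that the installed Tesseract version expects TESSDATA_PREFIX to point at the tessdata directory itself (older 3.x releases expected its parent directory). The helper names configure_tessdata and extract_german_text are hypothetical and are not part of textract. The checks run before the environment variable is set because "Error opening data file" can also appear when the file exists but is unreadable or empty.

```python
import os
from pathlib import Path


def configure_tessdata(tessdata_dir, language="deu"):
    """Check the language file, then point TESSDATA_PREFIX at tessdata_dir.

    Hypothetical helper for illustration; not part of textract.
    """
    data_file = Path(tessdata_dir) / f"{language}.traineddata"
    if not data_file.is_file():
        raise FileNotFoundError(f"missing language data: {data_file}")
    if not os.access(data_file, os.R_OK):
        raise PermissionError(f"cannot read language data: {data_file}")
    if data_file.stat().st_size == 0:
        raise ValueError(f"language data is empty: {data_file}")
    # Assumption: Tesseract 4 or later, where the variable names the
    # tessdata directory itself rather than its parent.
    os.environ["TESSDATA_PREFIX"] = str(tessdata_dir)
    return data_file


def extract_german_text(pdf_path, tessdata_dir="/opt/homebrew/share/tessdata"):
    """Run textract's tesseract method with German language data."""
    # Imported here so configure_tessdata can be used without textract.
    import textract

    configure_tessdata(tessdata_dir, "deu")
    raw = textract.process(pdf_path, method="tesseract", language="deu")
    return raw.decode("utf-8")
```

Setting the variable inside the Python process, rather than only in a shell profile, rules out the case where the interpreter was started from an environment that never saw the exported value.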

Desktop (please complete the following information):