tesseract-ocr / tesseract

Tesseract Open Source OCR Engine (main repository)
https://tesseract-ocr.github.io/
Apache License 2.0

tessdata_prefix error with Fraktur.traineddata on High Sierra #1464

Closed ghost closed 6 years ago

ghost commented 6 years ago

I downloaded Fraktur.traineddata to /usr/local/share/tessdata

When I run the following command (I am new and am following this guide: https://www.youtube.com/watch?v=QhJiOCwz-_I):

$ tesseract file.tiff -l Fraktur Fraktur

The following error appears:

Error opening data file ./tessdata/Fraktur.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'Fraktur'
Tesseract couldn't load any languages!
Could not initialize tesseract.

The same happens when using eng.
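The error message asks for TESSDATA_PREFIX to point at the parent directory of tessdata. A minimal sketch of that setup, assuming the traineddata files really are in /usr/local/share/tessdata as described above; the variable name tessdata_dir is only illustrative:

```shell
# Assumption: the traineddata files live in /usr/local/share/tessdata.
tessdata_dir="/usr/local/share/tessdata"

# Per the error message, TESSDATA_PREFIX must be the PARENT of the
# tessdata directory, not the tessdata directory itself.
TESSDATA_PREFIX="$(dirname "$tessdata_dir")/"
export TESSDATA_PREFIX
echo "TESSDATA_PREFIX=$TESSDATA_PREFIX"

# Only query Tesseract if it is actually installed.
if command -v tesseract >/dev/null 2>&1; then
    tesseract --list-langs || echo "tesseract still cannot load its language data" >&2
fi
```

Exporting the variable this way only affects the current terminal session.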

Shreeshrii commented 6 years ago

tesseract file.tiff outputfile.tiff -l Fraktur --tessdata-dir /usr/local/share/tessdata

You need to give the arguments in the following order:

tesseract input output --options

––

Also check

tesseract -v
tesseract --list-langs
tesseract --help
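The argument order described above (input, output, then options) can be sketched with a small helper. The function name ocr_command is hypothetical, and the values are the ones used in this thread; the helper only prints the command line and does not run Tesseract:

```shell
# Hypothetical helper: prints the command line in the order
# input image, output base name, then options.
# Arguments: input image, output base name, language, tessdata directory.
ocr_command() {
    printf 'tesseract %s %s -l %s --tessdata-dir %s\n' "$1" "$2" "$3" "$4"
}

# Example values from the thread. "outputfile" is a base name, because
# Tesseract appends the extension (for example .txt) itself.
cmd="$(ocr_command file.tiff outputfile Fraktur /usr/local/share/tessdata)"
echo "$cmd"
```

The sketch does not quote paths containing spaces, so it is meant for reading, not for piping into a shell.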

ghost commented 6 years ago

Thank you for your reply. In this case I won't follow the above mentioned guide. However, if I enter e.g. tesseract --list-langs the same error occurs:

Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Shreeshrii commented 6 years ago

What is your tesseract version and o/s?

Did you try adding --tessdata-dir directly to the command?

ghost commented 6 years ago

tesseract 3.05.01 macOS 10.13.3

When using tesseract file.tiff outputfile.tiff -l Fraktur --tessdata-dir /usr/local/share/tessdata it says:

Failed loading language 'Fraktur'
Tesseract couldn't load any languages!
Could not initialize tesseract.

Shreeshrii commented 6 years ago

3.01.01 is too old.

Please try the latest Tesseract 4.0.0 beta.

On Tue 10 Apr, 2018, 5:27 PM Ni-2, notifications@github.com wrote:

tesseract 3.01.01 macOS 10.13.3

When using tesseract file.tiff outputfile.tiff -l Fraktur --tessdata-dir /usr/local/share/tessdata it says Failed loading language 'Fraktur' Tesseract couldn't load any languages! Could not initialize tesseract.


ghost commented 6 years ago

The version I use is 3.05.01 (3.01.01 was a typo), the one from mid-2017. Is this still too old? It is the version I get with the command brew install tesseract.

amitdo commented 6 years ago

You are using a traineddata file that was trained for 4.0.0 and that is not compatible with 3.0x.

https://github.com/tesseract-ocr/tesseract/wiki/Data-Files
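A quick way to see which side of that compatibility line an installation falls on is to read the major version from tesseract -v. The following is a rough sketch: the helper name major_version is hypothetical, and the fallback version string is simply the one reported in this thread.

```shell
# Hypothetical helper: extracts the major version from a banner line
# such as "tesseract 3.05.01".
major_version() {
    printf '%s\n' "$1" | sed -n '1s/^tesseract \([0-9][0-9]*\)\..*/\1/p'
}

if command -v tesseract >/dev/null 2>&1; then
    # Some releases print the version banner on stderr, hence 2>&1.
    version_line="$(tesseract -v 2>&1 | head -n 1)"
else
    version_line="tesseract 3.05.01"   # example value taken from this thread
fi

major="$(major_version "$version_line")"
if [ "${major:-0}" -ge 4 ]; then
    echo "Tesseract $major.x: traineddata files trained for 4.0.0 should be usable."
else
    echo "Tesseract ${major:-unknown}: use traineddata built for 3.0x, or upgrade."
fi
```

If the banner has a different shape than assumed here, the helper returns nothing and the sketch falls through to the second message.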

ghost commented 6 years ago

I see. I tried to install Tesseract 4.0 using this guide: https://github.com/tesseract-ocr/tesseract/issues/1453. However, I apparently already fail at running the command ln -hfs /usr/local/Cellar/icu4c/60.2 /usr/local/opt/icu4c, since nothing happens. Since I haven't used the terminal before, I guess I just can't do it; I have been trying for the last 8 hours. I already tried ABBYY FineReader, but its Fraktur OCR is useless.

Shreeshrii commented 6 years ago

You can try the GUI frontends to Tesseract. I use gImageReader and VietOCR. I am not sure whether they are available for your platform, but you can check their project pages for details.

https://github.com/manisandro/gImageReader/releases

https://sourceforge.net/projects/vietocr/files/vietocr/5.0alpha/

amitdo commented 6 years ago

@zdenop, please close this issue.