konstantint / PassportEye

Extraction of machine-readable zone information from passports, visas and id-cards via OCR
MIT License
382 stars, 110 forks

"Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract." #41

Closed (Uysim closed 4 years ago)

Uysim commented 4 years ago
mrz = read_mrz("/content/demo.png", extra_cmdline_params='--oem 0', save_roi=True)

I got this error message:

Failed loading language 'eng' Tesseract couldn't load any languages! Could not initialize tesseract.

The problem is caused by --oem 0.

konstantint commented 4 years ago

Yes, this happens because Tesseract often ships without the legacy classifier data preinstalled, yet the legacy classifier tends to work better for MRZ recognition. Installing the appropriate data files or disabling legacy mode (by specifying --oem 3 in extra_cmdline_params) would resolve the issue. I recommend the former. See this comment also.
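For anyone who cannot install the legacy data right away, the two options above can be combined: try legacy mode first and fall back to --oem 3 only if it fails. The following is a rough sketch, not part of PassportEye. The helper name read_mrz_with_fallback is hypothetical, and it assumes the Tesseract failure surfaces as a Python exception, which may depend on the installed PassportEye and pytesseract versions.

```python
from typing import Any, Callable, Optional, Sequence


def read_mrz_with_fallback(
    image_path: str,
    reader: Callable[..., Any],
    oem_options: Sequence[str] = ("--oem 0", "--oem 3"),
) -> Optional[Any]:
    """Call `reader` once per engine-mode option until one call succeeds.

    `reader` is expected to accept the same arguments as passporteye.read_mrz
    (a file path plus the extra_cmdline_params keyword). This helper is a
    hypothetical wrapper, not a PassportEye function.
    """
    last_error: Optional[Exception] = None
    for option in oem_options:
        try:
            # The first call that does not raise wins, even if it returns
            # None (meaning the reader ran but found no MRZ).
            return reader(image_path, extra_cmdline_params=option)
        except Exception as exc:
            # Assumption: a missing legacy model shows up as an exception.
            # The exact exception type is not known here, so catch broadly.
            last_error = exc
    if last_error is not None:
        # Every option failed; re-raise the most recent error.
        raise last_error
    return None
```

With PassportEye installed, this would be called as read_mrz_with_fallback("/content/demo.png", read_mrz) after from passporteye import read_mrz. Passing the reader in as an argument keeps the helper independent of the library and easy to test. Installing the legacy data files remains the recommended fix, since the fallback gives up the legacy classifier's better MRZ accuracy.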