tesseract-ocr / langdata

Source training data for Tesseract for lots of languages
Apache License 2.0
837 stars 888 forks source link

langdata

Source training data for Tesseract for lots of languages

Want to re-train tesseract for a specific language, by modifying/augmenting the original training data? Then you have come to the right place!

If you want to find a language data set to run Tesseract, then look at our tessdata repository instead.

To re-create the training of a single language, lang, you need the following: