tesseract-ocr / langdata

Source training data for Tesseract for lots of languages
Apache License 2.0

List of fonts for training lang #98

Closed: masztal closed this issue 6 years ago

masztal commented 6 years ago

I want to reproduce the training exactly. Where can I obtain the list of fonts needed to train the language?

stweil commented 6 years ago

I'm afraid it is currently not possible to reproduce the training. Not only is the list of fonts used for the different languages missing, but so are the texts used for the training.

Even with the font list it would be extremely difficult to get all fonts, as not all of them are freely available.

We have to wait until @theraysmith updates langdata. Maybe we'll know more then.

amitdo commented 6 years ago

#86

masztal commented 6 years ago

Thanks for the info.