tesseract-ocr / langdata

Source training data for Tesseract for lots of languages
Apache License 2.0

Normalize unicode in texts #148

Closed stweil closed 4 years ago

stweil commented 4 years ago

Signed-off-by: Stefan Weil <sw@weilnetz.de>