paalberti / tesseract-dan-fraktur

Tesseract OCR training data for Danish written in Fraktur script and a few other languages

box/png pairs with alphabet in different typefaces #4

Open Shreeshrii opened 7 years ago

Shreeshrii commented 7 years ago

Attached is a zip file of box/png pairs showing the alphabet in different Fraktur typefaces.

The box files need to be corrected.

fraktur-png-box-to-be-corrected.zip
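
For anyone picking up the correction work, here is a rough sketch of a sanity check that could be run over each box file before fixing it by hand. It relies on the standard Tesseract box line layout (`glyph left bottom right top page`, with the origin at the bottom-left of the image). The function names and the idea of passing the image width and height in by hand are my own assumptions, not something from the zip. It only flags structural problems (malformed lines, inverted boxes, boxes outside the image); wrong glyph labels still have to be checked by eye.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Box(NamedTuple):
    glyph: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line: str) -> Box:
    """Parse one line of a Tesseract box file: 'glyph left bottom right top page'."""
    # Split from the right so the glyph field may hold more than one character.
    parts = line.rstrip("\n").rsplit(" ", 5)
    if len(parts) != 6:
        raise ValueError("expected 6 fields, got %d" % len(parts))
    glyph, *numbers = parts
    left, bottom, right, top, page = map(int, numbers)
    return Box(glyph, left, bottom, right, top, page)


def find_suspect_boxes(
    lines: Iterable[str], width: int, height: int
) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for every box that looks structurally wrong."""
    problems = []
    for number, line in enumerate(lines, 1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            box = parse_box_line(line)
        except ValueError as exc:
            problems.append((number, "malformed line: %s" % exc))
            continue
        if box.left >= box.right or box.bottom >= box.top:
            problems.append((number, "empty or inverted box"))
        elif box.left < 0 or box.bottom < 0 or box.right > width or box.top > height:
            problems.append((number, "box outside image"))
    return problems
```

Usage would be something like `find_suspect_boxes(open("page.box", encoding="utf-8"), width, height)`, where `page.box` is a placeholder name and the dimensions come from the matching png.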

Shreeshrii commented 7 years ago

Attached is a zip file with box/tiff pairs generated for deu_frak using the text2image program.

These can be used with Tesseract 3.0x, along with the box/tiff pairs provided in this repo.

frk.box-tif-pairs.zip

LSTM training (Tesseract 4.0) needs a much larger set of training data.
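
In case it helps with generating more pairs, here is a minimal sketch of how a text2image call could be assembled from Python. The flags `--text`, `--outputbase`, `--font` and `--fonts_dir` are text2image options; the file names, the font name and the fonts directory below are placeholders I made up for illustration, not the values used for the attached zip.

```python
from typing import List


def text2image_command(
    text_file: str, output_base: str, font: str, fonts_dir: str
) -> List[str]:
    """Build the argument list for one text2image run (one font, one text file)."""
    return [
        "text2image",
        "--text=" + text_file,          # training text to render
        "--outputbase=" + output_base,  # prefix for the generated image and box file
        "--font=" + font,               # font name as known to fontconfig
        "--fonts_dir=" + fonts_dir,     # directory containing the font files
    ]


# Hypothetical values, chosen only to show the shape of the call.
command = text2image_command(
    "training_text.txt",
    "deu_frak.examplefont.exp0",
    "UnifrakturMaguntia",
    "/usr/share/fonts",
)
print(" ".join(command))
```

The list can be handed to `subprocess.run(command, check=True)` on a machine where the Tesseract training tools are installed. Looping over several fonts and text files is one way to build up the larger data set that LSTM training needs.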