tesseract-ocr / langdata

Source training data for Tesseract for lots of languages
Apache License 2.0

How to train text for multiple fonts? #92

Closed aijianiula0601 closed 6 years ago

aijianiula0601 commented 6 years ago

I want to train on text rendered in multiple fonts. Is the only way to run the following command once per font?

text2image --text=text_training.txt --outputbase=tb --font='Tahoma Bold' --fonts_dir=/Users/*/fonts

Is there a way to train with multiple fonts in a single run? Thanks!

Shreeshrii commented 6 years ago

Use the following format to train with multiple fonts:

training/tesstrain.sh \
  --fonts_dir /mnt/c/Windows/Fonts \
  --lang mar \
  --noextract_font_properties --linedata_only \
  --exposures "0" \
  --langdata_dir ../langdata \
  --tessdata_dir ../tessdata \
  --fontlist \
    "Adobe Devanagari" \
    "Arial Unicode MS" \
    "Nakula" \
    "Sahadeva" \
  --output_dir ../tesstutorial/mar
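
For comparison, the per-font text2image approach from the question can also be scripted as a loop. This is only a rough sketch, not part of tesstrain.sh: the function name text2image_per_font, the FONTS_DIR and DRY_RUN variables, the /path/to/fonts placeholder, and the output naming scheme are all assumptions made for illustration. Only the four text2image flags are taken from the question. By default it prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: build one text2image command per font (dry run by default).
# Assumptions: text2image is on PATH when DRY_RUN=0; the font list,
# fonts directory, and output naming below are illustrative placeholders.
set -euo pipefail

text2image_per_font() {
  local text_file="text_training.txt"
  local fonts_dir="${FONTS_DIR:-/path/to/fonts}"   # hypothetical location
  local fonts=("Tahoma Bold" "Arial Unicode MS")   # example font names
  local font outbase
  local -a cmd

  for font in "${fonts[@]}"; do
    # Derive a distinct output base per font: "Tahoma Bold" -> tb.Tahoma_Bold
    outbase="tb.${font// /_}"
    cmd=(text2image
         --text="$text_file"
         --outputbase="$outbase"
         --font="$font"
         --fonts_dir="$fonts_dir")
    if [ "${DRY_RUN:-1}" = "1" ]; then
      # Print the shell-quoted command instead of executing it.
      printf '%q ' "${cmd[@]}"
      printf '\n'
    else
      "${cmd[@]}"
    fi
  done
}

text2image_per_font
```

Run it with DRY_RUN=0 to actually invoke text2image. For full training runs, passing the fonts through --fontlist as shown above is the simpler route, since a single command then covers every font.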

Shreeshrii commented 6 years ago

@zdenop This issue can be closed.