tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

Question - [Generating Traindata] #293

Closed (mohsenomidi closed this 2 years ago)

mohsenomidi commented 2 years ago

Hi everybody,

In older versions I could use the bash script tesstrain.sh to generate training data with custom fonts, but with this change I cannot find a way to do that. Is there a similar command in the new training script that supports custom fonts?

Here is my old command:

tesstrain.sh --fonts_dir fonts \
        --fontlist 'My Specific font name' \
        --lang eng \
        --linedata_only \
        --langdata_dir langdata_lstm \
        --tessdata_dir tesseract/tessdata \
        --save_box_tiff \
        --maxpages 100 \
        --output_dir train

Shreeshrii commented 2 years ago

Use the Python version of the script at:

https://github.com/tesseract-ocr/tesstrain/blob/main/src/training/tesstrain.py
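As a rough sketch of how the old call might carry over, the snippet below rebuilds the same options as an argument list for the Python script and prints it for review. It assumes that tesstrain.py accepts the same option names as tesstrain.sh and that it can be run directly from a local checkout. Neither is confirmed in this thread, so check the script's own help output first. The `build_command` helper and the `SCRIPT` path are hypothetical names used only for illustration.

```python
import shlex

# Hypothetical path: where tesstrain.py sits in a local checkout of the repo.
SCRIPT = "src/training/tesstrain.py"


def build_command(options, flags=()):
    """Return an argv list built from option/value pairs and bare flags."""
    argv = ["python3", SCRIPT]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


# Same values as in the old tesstrain.sh call.
# Assumption: tesstrain.py uses the same option names as tesstrain.sh.
command = build_command(
    {
        "fonts_dir": "fonts",
        "fontlist": "My Specific font name",
        "lang": "eng",
        "langdata_dir": "langdata_lstm",
        "tessdata_dir": "tesseract/tessdata",
        "maxpages": 100,
        "output_dir": "train",
    },
    flags=("linedata_only", "save_box_tiff"),
)

# Print a shell-quoted command line to review before running anything.
print(shlex.join(command))
```

Once the options are confirmed, the same list could be passed to `subprocess.run(command, check=True)`. Keeping the arguments as a list avoids quoting problems with font names that contain spaces.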

mohsenomidi commented 2 years ago

Thanks 🚀 🎉