tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

How to optimize the size of traineddata #346

Closed Monster-2019 closed 1 year ago

Monster-2019 commented 1 year ago

I am using lstmtraining to train a model that recognizes Arabic numerals in a game, starting from eng's LSTM model. My final traineddata does not need to recognize English, so I would like to remove the part of the traineddata that recognizes eng in order to reduce its size. What should I do?

stweil commented 1 year ago

A newly trained model does not contain language-specific information such as dictionaries. It contains only the neural network and the mapping for the unichars. Please use the Tesseract user forum for additional questions.
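
One way to check what a given traineddata file actually contains is to list its components with `combine_tessdata -d`, which ships with the Tesseract training tools. The snippet below is only a minimal sketch of doing that from Python: the file name `digits.traineddata` and the helper names are hypothetical, and it assumes `combine_tessdata` is installed and on the PATH.

```python
import shutil
import subprocess


def list_components_cmd(traineddata):
    """Build the command that lists the components packed in a traineddata file."""
    return ["combine_tessdata", "-d", str(traineddata)]


def list_components(traineddata):
    """Run combine_tessdata -d and return whatever it printed.

    Assumes combine_tessdata is on the PATH; raises RuntimeError otherwise.
    """
    if shutil.which("combine_tessdata") is None:
        raise RuntimeError("combine_tessdata not found on PATH")
    result = subprocess.run(
        list_components_cmd(traineddata),
        capture_output=True,
        text=True,
        check=True,
    )
    # Depending on the build, the listing may go to stdout or stderr,
    # so both streams are returned together.
    return result.stdout + result.stderr


if __name__ == "__main__":
    # "digits.traineddata" is a hypothetical file name used for illustration.
    print(list_components("digits.traineddata"))
```

If the listing shows only the LSTM network, the unicharset, and the recoder, with no dawg (dictionary) entries, there is no separate eng-specific component left to strip out.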