tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0

How to combine two trained data files: tam.traineddata and mal.traineddata #396

Open Amricreate opened 3 months ago

Amricreate commented 3 months ago

I am working on training a model for the Grantha script. I have found that Tamil and Malayalam both have almost the same characters and formatting as Grantha.

So I assume that if both models are used simultaneously, the combined model could then be fine-tuned on my Grantha data. I am not sure whether this will work. What is your opinion? Or is the only way to fine-tune each model separately and then use this command:

"tesseract image.png -l grantha+grantha1"
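
For reference, here is a minimal sketch of how the `-l` option can load several models in one run. It assumes that `tam.traineddata` and `mal.traineddata` are already installed in the tessdata directory; `image.png` is a placeholder and `join_langs` is a hypothetical helper, not part of Tesseract. It also passes `stdout` as the output base, which the tesseract command line expects after the image name. Note that this only runs both models together at recognition time. It does not merge them into a single traineddata file that could be fine-tuned.

```shell
# Hypothetical helper: join language codes with '+', the separator that
# tesseract's -l option uses to load several traineddata files at once.
join_langs() (
  IFS='+'
  printf '%s\n' "$*"
)

LANGS=$(join_langs tam mal)   # -> tam+mal

# Run OCR only if tesseract and the placeholder image are actually present.
if command -v tesseract >/dev/null 2>&1 && [ -f image.png ]; then
  # "stdout" as the output base prints the recognized text to the terminal.
  tesseract image.png stdout -l "$LANGS"
fi
```

The same pattern would apply to two separately fine-tuned models, for example `join_langs grantha grantha1`, provided those files exist under those names.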