Closed saijaswanth433 closed 4 years ago
Newly trained models don't contain a dictionary and other parts from the original model. Those parts can be added by using combine_tessdata.
Can you please tell me how to use combine_tessdata, and will accuracy improve after I do it?
:~/mp/tesstrain-master/data/foo$ combine_tessdata /home/$USER/temp/eng.
Combining tessdata files
Error: traineddata file must contain at least (a unicharset fileand inttemp) OR an lstm file.
Error combining tessdata files into /home/vishwam/temp/eng.traineddata
Version string:4.1.0-rc1
23:version:size=9, offset=192
Please use the Tesseract user forum for all questions.
Hi all, I'm using 5.0.0-alpha-20201224 and got the same problem when using combine_tessdata:
ERROR: traineddata file must contain at least (a unicharset fileand inttemp) OR an lstm file
I searched and could not find an appropriate solution anywhere. After some study I arrived at the following solution.
Solution: after you have generated all the necessary files with the training commands, e.g.
cntraining mytest.normal.exp0.tr
you should have the following 5 files:
inttemp
normproto
pffmtable
shapetable
unicharset
Rename them so that each one carries the language prefix:
normal.inttemp
normal.normproto
normal.pffmtable
normal.shapetable
normal.unicharset
and then run "combine_tessdata normal" again. You will get the final traineddata:
normal.traineddata
Output:
Combining tessdata files
Output normal.traineddata created successfully.
Version string:5.0.0-alpha-20201224
1:unicharset:size=662, offset=192
3:inttemp:size=132152, offset=854
4:pffmtable:size=103, offset=133006
5:normproto:size=1262, offset=133109
13:shapetable:size=166, offset=134371
23:version:size=20, offset=134537
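The rename step above can be sketched as a small shell function. This is only an illustration: the function name is made up, and it assumes the five files sit in one directory and that "normal" (or whatever you pass in) is the prefix you want combine_tessdata to pick up.

```shell
# Sketch only: prefix the five training outputs with a language name so that
# combine_tessdata can find them (e.g. inttemp -> normal.inttemp).
# The function name is hypothetical.
prefix_training_files() {
    lang=$1          # prefix to add, e.g. "normal"
    dir=${2:-.}      # directory holding the files (defaults to current dir)
    for part in inttemp normproto pffmtable shapetable unicharset; do
        if [ -f "$dir/$part" ]; then
            mv "$dir/$part" "$dir/$lang.$part"
        elif [ ! -f "$dir/$lang.$part" ]; then
            # Neither the plain nor the prefixed file exists: stop early.
            echo "missing: $dir/$part" >&2
            return 1
        fi
    done
}
```

Usage, from the directory that contains the files: `prefix_training_files normal . && combine_tessdata normal`. Files that are already prefixed are left alone, so running it twice is harmless.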
Hope this helps.
I trained with 140k data, using the Tesseract best model (15 MB) as the base model, but when I generate the tessdata_best model after training, the resulting model has a size of only 4.1. Why is this happening?