tesseract-ocr / tesstrain

Train Tesseract LSTM with make
Apache License 2.0
Creating additional .traineddata files (tessdata_best and tessdata_fast) gives a number of .traineddata files in each folder. Which one is the best or the fastest among them? #245

Closed: saahiluppal closed this issue 3 years ago

saahiluppal commented 3 years ago

Creating additional .traineddata files after training with make traineddata creates two folders, tessdata_best and tessdata_fast. Each folder contains a number of .traineddata files. For example, tessdata_best contains:

foo0.06_614.traineddata
foo0.135_613.traineddata
foo0.185_613.traineddata
foo0.205_613.traineddata
foo0.258_612.traineddata
foo0.305_606.traineddata
foo0.331_599.traineddata
foo0.376_598.traineddata
foo0.429_598.traineddata
foo0.506_597.traineddata
foo0.532_590.traineddata
foo0.564_579.traineddata
foo0.589_579.traineddata
foo0.809_577.traineddata
foo0.868_576.traineddata
foo0.933_572.traineddata
foo10.782_321.traineddata
foo1.131_568.traineddata
foo1.239_562.traineddata
foo1.327_559.traineddata
foo13.853_307.traineddata
foo1.478_546.traineddata
foo15.155_261.traineddata
foo1.568_542.traineddata
foo16.427_243.traineddata
foo17.691_214.traineddata
foo1.919_532.traineddata
foo19.303_196.traineddata
foo2.05_528.traineddata
foo21.782_169.traineddata
foo22.783_132.traineddata
foo2.309_525.traineddata
foo2.644_520.traineddata
foo26.6_96.traineddata
foo2.774_504.traineddata
foo3.079_500.traineddata
foo3.263_492.traineddata
foo3.647_472.traineddata
foo37.217_64.traineddata
foo4.435_465.traineddata
foo4.839_456.traineddata
foo5.058_444.traineddata
foo5.436_436.traineddata
foo6.333_428.traineddata
foo6.762_420.traineddata
foo7.235_407.traineddata
foo8.016_374.traineddata
foo9.484_365.traineddata
foo9.918_341.traineddata

Which one is the best among all of them?

wrznr commented 3 years ago

The number directly after the model name is the character error rate (CER), in percent, achieved on the test data; lower is better. In your case, foo0.06_614.traineddata is the best model, with CER = 0.06 %.
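
Note that the directory listing above is sorted as text, so foo10.782_321.traineddata lands between foo0.933_572.traineddata and foo1.131_568.traineddata; the CER has to be compared as a number. A minimal Python sketch of that selection follows. It assumes the file names always have the form <model><CER>_<integer>.traineddata, as in the listing; the helper names parse_name and best_model are hypothetical and not part of tesstrain, and the meaning of the trailing integer is not relied on.

```python
import re


def parse_name(filename, model):
    """Return (cer, trailing_int) for names like 'foo0.06_614.traineddata'.

    Assumes the pattern <model><CER>_<integer>.traineddata seen in the
    listing above; returns None for any file that does not match it.
    """
    pattern = re.escape(model) + r"(\d+(?:\.\d+)?)_(\d+)\.traineddata"
    match = re.fullmatch(pattern, filename)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


def best_model(filenames, model):
    """Pick the file name with the numerically lowest CER."""
    scored = []
    for name in filenames:
        parsed = parse_name(name, model)
        if parsed is not None:
            scored.append((parsed[0], name))
    if not scored:
        raise ValueError("no matching .traineddata files found")
    # Tuples compare by CER first, so min() yields the lowest error rate.
    return min(scored)[1]


if __name__ == "__main__":
    sample = [
        "foo10.782_321.traineddata",
        "foo0.06_614.traineddata",
        "foo1.131_568.traineddata",
    ]
    print(best_model(sample, "foo"))
```

To run it against a real folder, the list could be built with something like sorted(p.name for p in pathlib.Path("tessdata_best").glob("*.traineddata")), where the folder path is whatever your training run produced.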