jacklicn / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

training in tesseract 3.01 (and current svn version) fails #578

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

I am following the steps in 
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3 to train for a 
new font (f1).

1. tesseract test.f1.exp0.tif test.f1.exp0 nobatch box.train
2. tesseract test.f1.exp1.tif test.f1.exp1 nobatch box.train
3. Edit and fix the box files
4. unicharset_extractor test.f1.exp0.box test.f1.exp1.box
5. echo "f1 0 0 0 1 0" >f1
6. run "mftraining -F f1 -U unicharset test.f1.exp0.tr test.f1.exp1.tr"

(Note that the documentation on calls to mftraining and cntraining is not clear. 
Do I call mf/cntraining once on all fonts of this language, or on each font 
separately, one after another?)
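
For anyone trying to reproduce this, the steps above can be collected into one script. This is only a sketch under assumptions: the file names are the ones from the report, while `DRY_RUN`, `run` and `describe_font_line` are hypothetical helpers added for illustration and are not part of Tesseract. The field order used to decode the font properties line (fontname, italic, bold, fixed, serif, fraktur) is the one given on the TrainingTesseract3 wiki page. By default the script only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report (not an official script).
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: decode one font properties line of the form
#   <fontname> <italic> <bold> <fixed> <serif> <fraktur>
describe_font_line() {
    # Unquoted on purpose: split the line into its space-separated fields.
    set -- $1
    if [ "$#" -ne 6 ]; then
        echo "expected 6 fields, got $#" >&2
        return 1
    fi
    echo "font=$1 italic=$2 bold=$3 fixed=$4 serif=$5 fraktur=$6"
}

FONT_LINE="f1 0 0 0 1 0"
describe_font_line "$FONT_LINE"

# Steps 1-2: generate the .tr files from the training images.
for n in 0 1; do
    run tesseract "test.f1.exp$n.tif" "test.f1.exp$n" nobatch box.train
done

# Step 3 (editing and fixing the box files) is manual.

# Step 4: build the unicharset from the box files.
run unicharset_extractor test.f1.exp0.box test.f1.exp1.box

# Step 5: write the font properties file (named "f1", as in the report).
if [ "$DRY_RUN" = "1" ]; then
    echo "+ echo \"$FONT_LINE\" > f1"
else
    echo "$FONT_LINE" > f1
fi

# Step 6: the call that hits the assertion.
run mftraining -F f1 -U unicharset test.f1.exp0.tr test.f1.exp1.tr
```

With the line from step 5, `describe_font_line` reports `serif=1` and every other property as 0.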

What is the expected output? What do you see instead?

The mftraining fails with the following message:
Reading test.f1.exp0.tr ...
Reading test.f1.exp1.tr ...
Class->NumConfigs == this->fontset_table_.get(Class->font_set_id).size:Error:Assert failed:in file intproto.cpp, line 1312

What version of the product are you using? On what operating system?
I'm using 3.01 and the current svn version on Ubuntu Linux.

Original issue reported on code.google.com by n.mavrog...@gmail.com on 18 Nov 2011 at 11:06

GoogleCodeExporter commented 9 years ago

Original comment by zde...@gmail.com on 18 Nov 2011 at 4:43