Closed GoogleCodeExporter closed 9 years ago
I seem to have hit Enter before I wrote anything, so here's the issue:
I'm training a font for numbers. After I modify the resulting box file,
tesseract throws errors.
What steps will reproduce the problem?
1. Modify the box file after running tesseract ... ... ... box.train
What is the expected output? What do you see instead?
I expected the modified box file to produce no error output; instead, errors are thrown.
What version of the product are you using? On what operating system?
3.0 on Windows 7
Please provide any additional information below.
The idea is to train tesseract to recognize numbers in various images and to
output them as a string. When training a new font for phone numbers, I get
error output after I modify the box file. This has worked for two other fonts,
so I'm confused about why it would stop now. Attached are the tif file, the
original box file, and the modified one.
Original comment by Vehix...@gmail.com
on 12 Aug 2011 at 8:34
1. It works for me with tesseract 3.01:
tesseract eng.Lasha.samp.tif eng.Lasha.samp box.train (or)
tesseract eng.Lasha.samp.tif eng.Lasha.samptemp box.train
gives this output:
Tesseract Open Source OCR Engine v3.01 with Leptonica
Page 0
APPLY_BOXES:
Boxes read from boxfile: 11
Boxes failed resegmentation: 0
Found 11 good blobs and 0 unlabelled blobs in 0 words.
0 remaining unlabelled words deleted.
TRAINING ... Font name = Lasha
Generated training data for 4 words
=> no error
2. You are not following the instructions:
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#Generate_Training_Images
Original comment by zde...@gmail.com
on 19 Apr 2012 at 7:12
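For anyone who hits similar errors after hand-editing a box file, a quick sanity check of the edited file can help narrow things down before rerunning box.train. The sketch below is not part of tesseract. It assumes the Tesseract 3.x box line layout of `symbol left bottom right top page`, and the function name `check_box_lines` and the sample lines are made up for illustration. It only flags lines that look malformed and says nothing about what actually caused the error in this report.

```python
def check_box_lines(lines):
    """Return (line_number, message) pairs for box lines that look malformed.

    Assumed layout per line (Tesseract 3.x): symbol left bottom right top page
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        parts = raw.split()
        if not parts:
            # Blank lines are flagged as suspicious; this sketch does not
            # claim to know how tesseract itself treats them.
            problems.append((number, "blank line"))
            continue
        if len(parts) != 6:
            problems.append((number, "expected 6 fields, found %d" % len(parts)))
            continue
        try:
            left, bottom, right, top, page = (int(p) for p in parts[1:])
        except ValueError:
            problems.append((number, "coordinates and page must be integers"))
            continue
        if left >= right or bottom >= top:
            problems.append((number, "box has zero or negative size"))
    return problems


# Hypothetical sample: line 1 is fine, the others each show one kind of slip.
sample = [
    "5 10 20 30 60 0",
    "7 40 20 35 60 0",   # left edge is to the right of the right edge
    "3 70 20 90 60",     # page field missing
    "",                  # stray blank line
]

for number, message in check_box_lines(sample):
    print("line %d: %s" % (number, message))
```

To check a real file, pass `open("your.box", encoding="utf-8").read().splitlines()` to `check_box_lines`, with the path replaced by your own box file.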
Original issue reported on code.google.com by
Vehix...@gmail.com
on 12 Aug 2011 at 8:27