Closed GoogleCodeExporter closed 9 years ago
I do not have answers for all the issues, but here are some findings:
First of all, the input image: use at least 300 dpi resolution and a low number
of colors (I prefer 16 gray levels or just 2 colors). This also gives you a
smaller file (see attachment hiero.egyptianhiero.exp2.png).
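To illustrate what "16 gray levels or just 2 colors" means in practice, here is a minimal sketch that snaps 8-bit gray values onto a small number of evenly spaced levels. The function name `quantize_gray` is hypothetical and the code works on a plain list of pixel values rather than an image file; a real workflow would more likely use an image editor or imaging library, and this does nothing about the 300 dpi requirement.

```python
def quantize_gray(pixels, levels=16):
    """Map 8-bit gray values (0-255) onto `levels` evenly spaced gray values.

    `pixels` is any iterable of integers in 0..255. This is an illustrative
    helper, not part of Tesseract.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    # Distance between two neighbouring output levels (17 for 16 levels).
    step = 255 / (levels - 1)
    # Snap each value to the nearest level, then scale back to 0..255.
    return [round(round(p / step) * step) for p in pixels]


if __name__ == "__main__":
    sample = [0, 100, 200, 255]
    print(quantize_gray(sample, levels=16))  # 16 gray levels
    print(quantize_gray(sample, levels=2))   # pure black and white
```

With `levels=2` this amounts to a simple threshold at mid-gray, which may or may not be the best binarization for a given scan.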
Next: I removed the other pages (see hiero.egyptianhiero.exp2.box), and then the
number of errors decreased ;-)
I tried to train it in tesseract 3.01 (it is in svn) and I got "better" log
output, see tesseract.log. If you visualize it (see
box-hiero.egyptianhiero.exp2.png: pink rectangles are boxes from the box file,
blue are the errors "FAILURE! Couldn't find a matching blob" and green are
"Unlabelled word at :Bounding box"), then it looks like tesseract is not happy
because of missing boxes (Unlabelled word at :Bounding box).
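For anyone who wants to reproduce that kind of overlay, a rough sketch of the coordinate handling follows. It assumes the usual Tesseract 3.x box file layout of `symbol left bottom right top page` with the origin at the bottom-left corner of the image; the names `Box`, `parse_box_line` and `to_image_rect` are hypothetical helpers, and the actual drawing (pink, blue, green rectangles) is left to whatever imaging tool you use.

```python
from typing import NamedTuple


class Box(NamedTuple):
    symbol: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_box_line(line):
    """Parse one box file line, assumed to be 'symbol left bottom right top page'."""
    parts = line.split()
    if len(parts) < 5:
        raise ValueError("not a box line: %r" % line)
    left, bottom, right, top = (int(v) for v in parts[1:5])
    # The page column is treated as optional here and defaults to 0.
    page = int(parts[5]) if len(parts) > 5 else 0
    return Box(parts[0], left, bottom, right, top, page)


def to_image_rect(box, image_height):
    """Convert a bottom-left-origin box to (x0, y0, x1, y1) with a top-left origin."""
    return (box.left, image_height - box.top, box.right, image_height - box.bottom)


if __name__ == "__main__":
    box = parse_box_line("A 10 20 30 50 0")
    print(box)
    print(to_image_rect(box, image_height=100))
```

The flip in `to_image_rect` matters because most drawing libraries put the origin at the top-left, so rectangles drawn with raw box file values would appear mirrored vertically.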
For "FAILURE! Couldn't find a matching blob" or "FAILURE! box overlaps no blobs
or blobs in multiple rows" I have no suggestion for the moment...
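To compare runs (for example before and after fixing the box file), it can help to count how often each message appears in the log. The sketch below simply searches each line for the three message strings quoted in this comment; the exact wording and layout of the real log lines is an assumption, and `tally_messages` is a hypothetical helper, not a Tesseract tool.

```python
from collections import Counter

# Message fragments as quoted in this issue; the real log may word them differently.
MESSAGES = (
    "FAILURE! Couldn't find a matching blob",
    "FAILURE! box overlaps no blobs or blobs in multiple rows",
    "Unlabelled word at :Bounding box",
)


def tally_messages(log_text):
    """Count the log lines that contain each known message fragment."""
    counts = Counter()
    for line in log_text.splitlines():
        for message in MESSAGES:
            if message in line:
                counts[message] += 1
    return counts


if __name__ == "__main__":
    # tesseract.log is the attachment name used above; adjust the path as needed.
    with open("tesseract.log", encoding="utf-8", errors="replace") as handle:
        for message, count in tally_messages(handle.read()).items():
            print(count, message)
```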
Original comment by zde...@gmail.com
on 6 Apr 2011 at 2:47
I am merging this issue into issue 430. Version 3.02 reports the real problem
("Unlabelled word") plus 4 "FAILURE! Couldn't find a matching blob" errors (I am
not sure about the reason, but the input image did not fulfill the requirements
from
http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3#Generate_Training_Images)
Original comment by zde...@gmail.com
on 24 Jul 2012 at 7:52
Original issue reported on code.google.com by
Oduss...@gmail.com
on 6 Apr 2011 at 11:50