brhpad.bmp (24-bit) generated more lines (453) in the text file, whereas brhpad.tif generated only 47 lines

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. brhpad.tif generates only 47 lines in brhpad.txt
   log: "Tesseract Open Source OCR Engine
Image has 1 bit  per pixel and size (386,500)
Resolution=96"
2. brhpad.bmp generates 453 lines in brhpad.txt.
   log: "Tesseract Open Source OCR Engine"
   The log does not report the image particulars (bit depth, size,
   resolution) the way the log for the .tif file does.
3. Summary of brhpad.bmp: image: 24 bit / size: 386,500 / resolution: 96 dpi /
   frame count: 1

What is the expected output? What do you see instead?
The number of lines in brhpad.txt generated from the .tif should be greater
than or equal to the number generated from the .bmp.
The log for the .bmp should report the full particulars of the image, as the
log for the .tif does.
Instead, Tesseract appears to support bmp more effectively than tif.

What version of the product are you using? On what operating system?
Tesseract 2.0 on Windows XP

Please provide any additional information below.

Original issue reported on code.google.com by withbles...@gmail.com on 29 Aug 2007 at 12:09

GoogleCodeExporter commented 9 years ago

The images are different bit depths, so you should expect some difference: the
thresholding will not be the same. They should not be that different though, so
this looks like a bug.
BTW, both of these images are too low resolution for decent recognition
accuracy, and so don't make good training pages either. These characters need
to be 2-3x bigger.
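
To illustrate the thresholding point: a 24-bit image has to be reduced to 1 bit
per pixel before recognition, and the result depends on where the cutoff falls,
whereas a 1-bit TIFF is already binary. The following is a minimal sketch in
Python, not Tesseract's actual thresholding code. The integer luma weights, the
fixed global cutoff, and the names `luma` and `binarize` are assumptions made
for illustration only.

```python
def luma(rgb):
    """Approximate brightness of an (r, g, b) pixel, 0-255, using integer BT.601 weights."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000


def binarize(pixels, cutoff):
    """Map each RGB pixel to 1 (ink) if darker than cutoff, else 0 (background)."""
    return [[1 if luma(p) < cutoff else 0 for p in row] for row in pixels]


# Hypothetical 1x4 strip of a 24-bit image: black, dark gray, light gray, white.
strip = [[(0, 0, 0), (100, 100, 100), (160, 160, 160), (255, 255, 255)]]

low = binarize(strip, cutoff=96)    # only the black pixel counts as ink
high = binarize(strip, cutoff=192)  # both gray pixels also count as ink
print(low, high)
```

The same gray pixels end up as ink or as background depending on the cutoff, so
a 24-bit source and a pre-binarized 1-bit source of the same page can reach the
recognizer as different bitmaps.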

Original comment by theraysm...@gmail.com on 6 Sep 2007 at 12:46

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

I think this is a duplicate of the now-solved issue 160.

Original comment by theraysm...@gmail.com on 30 Dec 2008 at 9:39

Changed state: Duplicate

patcharats / tesseract-ocr #61