GoogleCodeExporter closed 9 years ago
OK, I was having the same problem, and I found it had to do with the file format. My image was a PDF produced from a scanned image. I used GIMP to extract the individual pages from the PDF and to extract the bits of text that I wanted to OCR. I saved the text as TIFF (using GIMP), and tesseract produced an empty text file. GOCR works, but with tons of mistakes. I tried converting the image to greyscale and changing the size; nothing worked.
So what I did next was save the image as a BMP (again using GIMP), re-open it, and save it as TIFF, and it worked. The file size is half that of the original TIFF, but I can find no other differences listed in the file specs. So evidently there is some setting in the original TIFF that tesseract doesn't like.
Original comment by uglybudg...@gmail.com
on 3 Nov 2008 at 3:47
no comment
Original comment by simos.mo...@gmail.com
on 3 Nov 2008 at 3:56
There have been several bugs associated with various bit-depths, with and without libtiff. These are all fixed in 2.04.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 3:51
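For anyone wanting to see how a failing TIFF differs from one that works (as in the BMP round-trip described above), the header tags can be compared directly. The following is only a rough sketch using the Python standard library: it reads a handful of baseline tags (bit depth, samples per pixel, compression, and so on) from the first image directory of a classic TIFF. The function names `read_tiff_tags` and `diff_tiff_tags` are made up for illustration, and the choice of tags is an assumption about which settings might matter; the thread does not say which one was actually responsible.

```python
import struct

# Tag IDs from the TIFF 6.0 baseline specification.
# Which of these matter to tesseract is an assumption, not stated in the thread.
TAGS = {
    256: "ImageWidth",
    257: "ImageLength",
    258: "BitsPerSample",
    259: "Compression",
    262: "PhotometricInterpretation",
    277: "SamplesPerPixel",
}

# struct format characters for the field types handled here:
# 1 = BYTE, 3 = SHORT, 4 = LONG.
TYPE_FORMATS = {1: "B", 3: "H", 4: "I"}


def read_tiff_tags(data):
    """Return {tag name: tuple of values} from the first IFD of TIFF bytes."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, ftype, n = struct.unpack_from(endian + "HHI", data, entry)
        if tag not in TAGS or ftype not in TYPE_FORMATS:
            continue
        fmt = endian + str(n) + TYPE_FORMATS[ftype]
        # Values that fit in 4 bytes are stored inline in the entry;
        # larger ones are stored elsewhere and the entry holds an offset.
        if struct.calcsize(fmt) <= 4:
            pos = entry + 8
        else:
            (pos,) = struct.unpack_from(endian + "I", data, entry + 8)
        tags[TAGS[tag]] = struct.unpack_from(fmt, data, pos)
    return tags


def diff_tiff_tags(a, b):
    """Return {tag name: (value in a, value in b)} for tags that differ."""
    ta, tb = read_tiff_tags(a), read_tiff_tags(b)
    return {
        name: (ta.get(name), tb.get(name))
        for name in sorted(set(ta) | set(tb))
        if ta.get(name) != tb.get(name)
    }
```

To use it, read both files in binary mode (for example `open(path, "rb").read()`) and pass the two byte strings to `diff_tiff_tags`; an empty result means none of the listed tags differ, so the cause would have to lie elsewhere.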
Issue 135 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 5:55
Issue 133 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 6:01
Issue 113 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 6:35
Issue 102 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 6:54
Issue 96 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 7:27
Issue 95 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 14 Nov 2008 at 7:28
Issue 71 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 28 Dec 2008 at 7:29
Fixed in 2.04
Original comment by theraysm...@gmail.com
on 30 Dec 2008 at 6:35
Issue 61 has been merged into this issue.
Original comment by theraysm...@gmail.com
on 30 Dec 2008 at 9:39
Original issue reported on code.google.com by
pequn...@gmail.com
on 24 Oct 2008 at 3:45