AmitGorvadiya / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

inconsistent scanning through freeocr 3.0 (which uses tesseract - right?) #294

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Download and run FreeOCR 3.0 on Windows XP SP3.
2. Load a scanned JPEG, rotate it clockwise, and clip the area to OCR.
3. The area for OCR contained a list of names with corresponding email 
addresses. It looks like Arial font. Admittedly it was scanned at a bit of a 
slant from the horizontal (maybe 3-5 degrees).

What is the expected output? What do you see instead?
- This was mostly straight text, with the email domain name repeated from line 
to line.
- The domain name translation was inconsistent, with ...
    - zero chosen instead of the letter o on some lines, and not on others.
    - letters l and i sometimes translated as the pipe character | (do you 
guys scan a lot of unix scripts??)
    - letter M - sometimes this came out as l\/i, l\/l, lvl, or some such; 
sometimes it was translated correctly, but there were a lot of errors.
    - letter k - sometimes came out as pipe + less-than, i.e. |<
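
The confusions listed above can be partly cleaned up after OCR, outside of Tesseract or FreeOCR. The following is only a minimal sketch, assuming the expected email domains are known in advance: it undoes the reported substitutions and then snaps the result to the closest known domain using Python's standard `difflib`. The names `CONFUSIONS`, `normalize`, `snap_domain` and the `example.com` domains are hypothetical, chosen for illustration; nothing here is part of Tesseract or FreeOCR.

```python
import difflib

# Substitutions reported in this issue, mapped back to the intended letter.
# Longer patterns come first so "|<" is handled before a bare "|".
# Assumption: the text is an email domain, so lowercasing is safe.
CONFUSIONS = [
    ("l\\/i", "m"),
    ("l\\/l", "m"),
    ("lvl", "m"),
    ("|<", "k"),
    ("|", "l"),   # could also have been "i"; the fuzzy match below absorbs that
    ("0", "o"),
]

def normalize(text):
    """Undo the known OCR confusions in a lowercased copy of text."""
    out = text.lower()
    for wrong, right in CONFUSIONS:
        out = out.replace(wrong, right)
    return out

def snap_domain(ocr_domain, known_domains, cutoff=0.8):
    """Return the closest known domain, or the input unchanged if none is close."""
    cleaned = normalize(ocr_domain)
    matches = difflib.get_close_matches(cleaned, known_domains, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_domain

known = ["example.com", "mail.example.com"]
print(snap_domain("exal\\/lp|e.c0m", known))
print(snap_domain("ma|l.example.com", known))
```

The replacement table is deliberately blunt: it would also rewrite a legitimate "lvl" or "0", which is why the result is accepted only when it is close to a domain from the known list.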

What version of the product are you using? On what operating system?
See above

Please provide any additional information below.
Hope this helps; no time, I'm afraid, to check whether this is an existing 
bug.
Sorry, I can't provide the original as it has personal info on it, but it was 
a pretty clean scan.

Original issue reported on code.google.com by vince.re...@gmail.com on 24 Apr 2010 at 5:40

GoogleCodeExporter commented 9 years ago
Known problem.

Original comment by theraysm...@gmail.com on 20 May 2010 at 1:05

GoogleCodeExporter commented 9 years ago

Original comment by theraysm...@gmail.com on 20 May 2010 at 1:12

GoogleCodeExporter commented 9 years ago
1. FreeOCR is an independent product, so please file the issue with its 
authors. We do not know which version of tesseract it uses, or how. BTW: there 
were reports that it communicates with a strange server, so maybe it does more 
than OCR ;-) 
2. OCR results can be affected by input image quality. Without the image we 
cannot help you (e.g. to test whether the issue is solved).

Original comment by zde...@gmail.com on 7 Dec 2013 at 11:51