inconsistent scanning through freeocr 3.0 (which uses tesseract - right?)

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
1. Download and run FreeOCR 3.0 on Windows XP SP3.
2. Load a scanned JPEG, rotate it clockwise, and clip the area to OCR.
3. The area for OCR contained a list of names with corresponding email 
addresses. The font looks like Arial. Admittedly, it was scanned at a bit of a 
slant from the horizontal (maybe 3-5 degrees).

What is the expected output? What do you see instead?
- This was mostly straight text, with the email domain name repeated from line 
to line.
- The domain name was recognized inconsistently:
    - Zero was chosen instead of the letter o on some lines, and not on others.
    - The letters l and i were sometimes recognized as the pipe character, | (do you 
guys scan a lot of Unix scripts??)
    - The letter M sometimes came out as l\/i, l\/l, lvl, or some such. It was 
sometimes recognized correctly, but there were a lot of errors there.
    - The letter k sometimes came out as pipe plus less-than, i.e. |<
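
Since the domain repeats from line to line, the confusions listed above could in principle be repaired after OCR by comparing each recognized domain against a list of domains known to be correct. The following is only a rough sketch of that idea, not anything Tesseract or FreeOCR does: `canonical`, `repair_domain`, the confusion tables, and the example domain are all hypothetical and cover only the substitutions described in this report.

```python
# Hypothetical post-processing helper; not part of Tesseract or FreeOCR.
# Multi-character misreadings from the report (M -> l\/l, l\/i, lvl; k -> |<).
MULTI = [("l\\/l", "m"), ("l\\/i", "m"), ("lvl", "m"), ("|<", "k")]
# Single-character confusions: 0/o and l/i/| are folded into one class each.
SINGLE = str.maketrans({"0": "o", "|": "l", "i": "l"})


def canonical(text):
    """Fold confusable characters so misread and correct text compare equal."""
    text = text.lower()
    for wrong, right in MULTI:
        text = text.replace(wrong, right)
    return text.translate(SINGLE)


def repair_domain(address, known_domains):
    """Replace the domain with a known one if they match after folding."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        return address  # no "@" found, leave the text alone
    for known in known_domains:
        if canonical(domain) == canonical(known):
            return local + "@" + known
    return address  # no known domain matches, leave the text alone


if __name__ == "__main__":
    known = ["mail.example.com"]  # assumed reference list for illustration
    print(repair_domain("bob@l\\/lai|.examp|e.c0m", known))
```

This only touches the part after the @ sign. The names before it have no reference list to compare against, so they are left exactly as recognized.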

What version of the product are you using? On what operating system?
See above

Please provide any additional information below.
Hope this helps. I'm afraid I have no time to check whether this is an existing 
bug.
Sorry, I can't provide the original as it has personal info on it, but it was 
a pretty clean scan.

Original issue reported on code.google.com by vince.re...@gmail.com on 24 Apr 2010 at 5:40

GoogleCodeExporter commented 9 years ago

Known problem.

Original comment by theraysm...@gmail.com on 20 May 2010 at 1:05

Changed state: Started

GoogleCodeExporter commented 9 years ago

Original comment by theraysm...@gmail.com on 20 May 2010 at 1:12

Added labels: Priority-Medium, Type-Accuracy

GoogleCodeExporter commented 9 years ago

1. FreeOCR is an independent product, so please file the issue with its authors. We do 
not know which version of Tesseract it uses, or how. BTW: there were reports that it 
communicates with a strange server, so maybe it does more than OCR ;-) 
2. The OCR result can be affected by input image quality. Without the image 
we cannot help you (e.g. to test whether the issue is solved).

Original comment by zde...@gmail.com on 7 Dec 2013 at 11:51

Changed state: WontFix
