dlareklami / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

The output is different for the common part of input, when used with different input sets #1140

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run Tesseract on the input file "test02.tif" with default settings.

What is the expected output? What do you see instead?
Input file contains:
Hello How are you doing ful 1 2 65 56 765 432 5 67 45 67 8 9 12 4
1 2 65 56 765 432 5 67 45 67 8 9 12 4

Output obtained:
HeIIoHowarevou doinglul 1 2 65 56 765 432 5 67 45 67 8 9 12 4
1265567654325 67 45 6789124

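Please note that the numeral string is the same in both lines. In the first
line the numerals are recognized correctly, but the letters before them are
misrecognized ("Hello" becomes "HeIIo", "you" becomes "vou", "ful" becomes
"lul", and spaces between words are dropped). In the second line the digits
are correct, but most of the spaces between the numbers are lost. If I remove
the numeral string from the first line, the letters are recognized correctly.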
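
To make the two kinds of error easier to see, here is a minimal sketch that compares the expected and obtained text quoted above using Python's standard `difflib`. It does not call Tesseract; the strings are copied from this report, and the helper names `describe_diff` and `only_spaces_lost` are made up for illustration.

```python
import difflib

# Text copied verbatim from the report above.
expected = [
    "Hello How are you doing ful 1 2 65 56 765 432 5 67 45 67 8 9 12 4",
    "1 2 65 56 765 432 5 67 45 67 8 9 12 4",
]
obtained = [
    "HeIIoHowarevou doinglul 1 2 65 56 765 432 5 67 45 67 8 9 12 4",
    "1265567654325 67 45 6789124",
]


def describe_diff(want, got):
    """Return (op, expected_fragment, obtained_fragment) for each mismatch."""
    matcher = difflib.SequenceMatcher(None, want, got, autojunk=False)
    return [
        (op, want[i1:i2], got[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


def only_spaces_lost(want, got):
    """True when the strings differ, but only in their space characters."""
    return want != got and want.replace(" ", "") == got.replace(" ", "")


for line_no, (want, got) in enumerate(zip(expected, obtained), start=1):
    spacing_only = only_spaces_lost(want, got)
    print(f"line {line_no}: spacing-only difference = {spacing_only}")
    for op, want_part, got_part in describe_diff(want, got):
        print(f"  {op}: {want_part!r} -> {got_part!r}")
```

Run on the strings above, the check reports that line 2 differs from the expected text only in its spaces, while line 1 also contains substituted characters. The exact fragments listed by `describe_diff` depend on how `difflib` aligns the two strings, so they are best read as a rough guide.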

What version of the product are you using? On what operating system?
3.02

Please provide any additional information below.
Default config.

Original issue reported on code.google.com by ashish3168 on 9 Apr 2014 at 11:38
