DayBreakZhang / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Words concatenate #1472

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Read a JPG file that contains text

What is the expected output? What do you see instead?
When the text wraps onto a new line, the last word of one line and the first 
word of the next line are supposed to remain two separate words. Instead, the 
output file concatenates them into a single word.

For example:

Source reads,

This report will be publicly visible 
So, don't include passwords or
other confidential information.

Output file has,

This report will be publicly visibleSo, don't include passwords orother 
confidential information.
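
To make the difference concrete, here is a minimal sketch of the observed and expected behaviour. It assumes the recognized lines are joined when the output is written; the `join_lines` helper and the `lines` list are hypothetical and are not Tesseract's code or API.

```python
# Hypothetical illustration only; this is not Tesseract's code.
# The three strings are the source lines quoted in this report.
lines = [
    "This report will be publicly visible",
    "So, don't include passwords or",
    "other confidential information.",
]


def join_lines(text_lines, separator):
    """Join text lines with the given separator, trimming edge whitespace."""
    return separator.join(line.strip() for line in text_lines)


# Joining with an empty separator reproduces the reported problem:
# words at line boundaries run together.
observed = join_lines(lines, "")

# Joining with a space (or a newline) keeps boundary words separate.
expected = join_lines(lines, " ")

print(observed)
print(expected)
```

The point of the sketch is only that some separator has to survive at each line boundary, whether that is a space or the original newline.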

What version of the product are you using? On what operating system?

Version 3.02.02, Windows 7

Please provide any additional information below.

Original issue reported on code.google.com by rainy...@gmail.com on 8 May 2015 at 5:20

GoogleCodeExporter commented 9 years ago
Did you read the FAQ?

https://code.google.com/p/tesseract-ocr/wiki/FAQ#What_output_formats_can_Tesseract_produce?

Original comment by zde...@gmail.com on 8 May 2015 at 8:15