sonurakpinar / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Multiple languages fail with Arabic. #1220

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
This is using the latest language data and code from SVN.

'-l ara+eng' produces only Arabic output. '-l eng+ara' produces only English 
output.

I would guess this is related to Arabic using cube, but I haven't yet looked 
into why. Language combinations that don't use cube ('grc+eng' being the one I 
tested) work fine.

Attached is a test image, and output from ara+eng and eng+ara.
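
For anyone trying to reproduce this, here is a rough sketch of a check that runs both language orders and reports which scripts show up in the output. The image name test.png, the output base names, and the helper names are hypothetical placeholders, not taken from the attachment. It assumes the tesseract binary is on PATH and that it writes its result to the output base plus ".txt".

```python
import subprocess
import unicodedata


def build_command(image, output_base, languages):
    """Build a tesseract CLI call; multiple languages are joined with '+' for -l."""
    return ["tesseract", image, output_base, "-l", "+".join(languages)]


def scripts_present(text):
    """Return which of 'ARABIC' / 'LATIN' occur among the letters of text."""
    found = set()
    for ch in text:
        if ch.isalpha():
            # Unicode character names start with the script, e.g. 'ARABIC LETTER BEH'.
            name = unicodedata.name(ch, "")
            for script in ("ARABIC", "LATIN"):
                if name.startswith(script):
                    found.add(script)
    return found


def run_and_check(image, languages, output_base):
    """Run tesseract (assumed to be on PATH) and report the scripts in its output."""
    subprocess.run(build_command(image, output_base, languages), check=True)
    # Assumption: tesseract appends '.txt' to the output base.
    with open(output_base + ".txt", encoding="utf-8") as f:
        return scripts_present(f.read())
```

Calling run_and_check("test.png", ["ara", "eng"], "out_ara_eng") and run_and_check("test.png", ["eng", "ara"], "out_eng_ara") on an image with mixed text would be expected to return both scripts each time if multi-language recognition works. With the behaviour described above, each call returns only the script of the first language listed.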

Original issue reported on code.google.com by nick.wh...@durham.ac.uk on 27 May 2014 at 8:28

GoogleCodeExporter commented 9 years ago
Also, this should probably not be investigated until Ray uploads the new 
training data, which is expected soon, as the bug may no longer occur with 
that data.

Original comment by nick.wh...@durham.ac.uk on 27 May 2014 at 8:36

GoogleCodeExporter commented 9 years ago

Original comment by nick.wh...@durham.ac.uk on 27 May 2014 at 8:39

GoogleCodeExporter commented 9 years ago
Fixed by change 2f197cd6537b.

Original comment by theraysm...@gmail.com on 7 Oct 2014 at 4:02