jacklicn / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

hOCR output does not account for alternate orientations #553

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run tesseract with the -psm 1 hocr arguments on an image that contains vertical text

What is the expected output? What do you see instead?
The bounding boxes in the hOCR output are still in tesseract's internal rotated 
coordinate system, so they do not correspond to positions in the actual source 
image.  The box output, by contrast, reports the bounding information correctly.

What version of the product are you using? On what operating system?
Tesseract 3.01 on MacOS X 10.7

Please provide any additional information below.
It appears that the box.rotate() method is never called on the bounding 
boxes in the hOCR code before their values are printed in the output.  I'm 
attaching a diff which seems to fix the problem, although I'm not really a C++ 
developer, so it may contain a mistake or two.
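
To illustrate the kind of step that seems to be missing, here is a minimal, self-contained sketch of rotating an axis-aligned box by a quarter turn so that its coordinates are expressed in a different frame. It is not the attached diff and does not use Tesseract's own classes: `Box` and `RotateBox` are hypothetical names, and the sketch assumes rotation about the origin with no translation or y-axis flip, which real hOCR output would likely also need.

```cpp
#include <algorithm>

// Hypothetical axis-aligned box; NOT Tesseract's own box class.
struct Box {
  int left, bottom, right, top;
};

// Rotate a box about the origin by the unit vector (cos_t, sin_t).
// Assumption: only quarter turns are used, so each component is -1, 0, or 1
// and the rotated box stays axis-aligned.
Box RotateBox(const Box& b, int cos_t, int sin_t) {
  // Rotate two opposite corners of the box.
  int x1 = b.left * cos_t - b.bottom * sin_t;
  int y1 = b.left * sin_t + b.bottom * cos_t;
  int x2 = b.right * cos_t - b.top * sin_t;
  int y2 = b.right * sin_t + b.top * cos_t;
  // After rotation the corners may have swapped roles, so re-normalize
  // to keep left <= right and bottom <= top.
  return Box{std::min(x1, x2), std::min(y1, y2),
             std::max(x1, x2), std::max(y1, y2)};
}
```

Applying `RotateBox` with (0, 1) turns a 20x40 box into a 40x20 one, and applying it again with (0, -1) restores the original coordinates. If the hOCR writer prints boxes without the inverse of the rotation used internally, the values would stay in the rotated frame, which matches the symptom described here.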

Original issue reported on code.google.com by clindb...@gmail.com on 28 Sep 2011 at 8:44

Attachments:

GoogleCodeExporter commented 9 years ago
Fixed in 3.02.  Please verify it works for you as well.

Original comment by david.e...@gmail.com on 21 Feb 2012 at 7:11