ocropus-archive / DUP-ocropy

Python-based tools for document analysis and OCR
Apache License 2.0

Find the orientation of the text in the document image #303

Closed rremani closed 5 years ago

rremani commented 6 years ago

Hi, any idea how to find the orientation of the text in a document image?

ChillarAnand commented 6 years ago

Here is a simple function for skew estimation: https://github.com/tmbdev/ocropy/blob/d823ba4f4aaa6b85a19804e4569fe86a4be3f0d4/ocropus-nlbin#L71
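For readers who want to see the general idea without opening the link, below is a minimal, dependency-free sketch of one common way to estimate skew: a projection-profile search. It is not a copy of the linked function. The names `profile_score` and `estimate_skew`, the `(row, column)` pixel list, and the sign convention are all assumptions made for this illustration.

```python
import math

def profile_score(pixels, angle_deg):
    """Sharpness of the row histogram after rotating pixels by angle_deg.

    pixels is an iterable of (row, column) foreground coordinates.
    The score is the sum of squared row counts: it is largest when the
    pixels collapse onto as few rows as possible.
    """
    theta = math.radians(angle_deg)
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    counts = {}
    for y, x in pixels:
        row = round(y * cos_t - x * sin_t)  # row coordinate after rotation
        counts[row] = counts.get(row, 0) + 1
    return sum(n * n for n in counts.values())

def estimate_skew(pixels, angles):
    """Return the candidate angle (in degrees) with the sharpest row profile."""
    return max(angles, key=lambda a: profile_score(pixels, a))

# Usage: three synthetic text lines, each sloping by 3 degrees.
slope = math.tan(math.radians(3))
pixels = [(round(y0 + x * slope), x) for y0 in (20, 60, 100) for x in range(400)]
print(estimate_skew(pixels, range(-5, 6)))
```

A real implementation would typically work on the binarized page as an array and search a finer grid of angles; the brute-force search over candidates is the part this sketch is meant to show.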

zuphilip commented 6 years ago

The skew estimation can help with pages skewed by a few degrees. However, the text orientation is AFAIK always assumed to be left-to-right and top-to-bottom.

If you want to find the text orientation algorithmically, you can try rotating the image by 90, 180, and 270 degrees, mirroring it, and running OCR on all of these variants. It may then be possible to see which result is most promising and decide on the orientation from that. But there might be much better methods, perhaps depending on the scripts you are interested in.
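A rough sketch of that brute-force idea, assuming the image is a plain list of pixel rows and that you supply your own `score` callable (hypothetical here; it could, for example, run OCR on the variant and return a confidence value). None of these helper names come from ocropy.

```python
def rotate90(image):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def mirror(image):
    """Flip a list-of-rows image left to right."""
    return [row[::-1] for row in image]

def candidates(image):
    """Yield ((degrees_clockwise, mirrored), variant) for all 8 combinations.

    Mirroring, when used, is applied first and the rotation second.
    """
    for mirrored in (False, True):
        variant = mirror(image) if mirrored else [list(row) for row in image]
        for degrees in (0, 90, 180, 270):
            yield (degrees, mirrored), variant
            variant = rotate90(variant)

def best_orientation(image, score):
    """Return the (degrees_clockwise, mirrored) transform whose variant scores highest.

    score is a user-supplied callable (an assumption of this sketch),
    e.g. one that runs OCR on the variant and returns a confidence.
    """
    return max(candidates(image), key=lambda item: score(item[1]))[0]
```

The returned pair describes the transform to apply to the input to get the most promising variant, so `(270, False)` means "rotate the input 270 degrees clockwise, no mirroring". How well this works depends entirely on how good the scoring is at telling real text from garbage.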