Status: Open. robertknight opened this issue 2 years ago.
Tesseract's built-in orientation detection requires the library to be built with the legacy (non-LSTM) text recognition engine. Leptonica also has some built-in orientation detection functionality. So some options:
Work in progress at https://github.com/robertknight/tesseract-wasm/pull/34.
https://github.com/robertknight/tesseract-wasm/pull/34 adds a partial solution in the form of orientation detection. However, the algorithm is simplistic, so any application relying on it would probably need to ask the user to confirm actions that depend on the detected orientation.
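To illustrate the "confirm with the user" point, here is a minimal sketch of how an application might decide what to do with an orientation estimate. Everything here is hypothetical: the `OrientationEstimate` shape, the `decideOrientationAction` helper and the 0.9 threshold are assumptions for illustration only, not the API added in the PR.

```typescript
// Hypothetical shape of an orientation estimate. The field names are
// assumptions for this sketch, not confirmed against the library.
interface OrientationEstimate {
  rotation: number; // estimated clockwise rotation of the image in degrees
  confidence: number; // 0 (no confidence) to 1 (certain)
}

type OrientationAction =
  | { kind: "none" }
  | { kind: "auto-rotate"; degrees: number }
  | { kind: "ask-user"; suggestedDegrees: number };

// Decide whether to correct the image automatically or ask the user first.
// `autoThreshold` is an arbitrary illustrative cut-off.
function decideOrientationAction(
  estimate: OrientationEstimate,
  autoThreshold = 0.9,
): OrientationAction {
  // Normalize the detected rotation into the range [0, 360).
  const detected = ((estimate.rotation % 360) + 360) % 360;
  if (detected === 0) {
    return { kind: "none" };
  }

  // The correction is the clockwise rotation that undoes the detected one.
  const correction = (360 - detected) % 360;
  if (estimate.confidence >= autoThreshold) {
    return { kind: "auto-rotate", degrees: correction };
  }
  return { kind: "ask-user", suggestedDegrees: correction };
}
```

With a simplistic detector, most real inputs would likely land in the `ask-user` branch, which is why the UI cost matters.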
I posted a comment on Hacker News and someone responded with a test case where the word recognition works well, but the text is not output in the correct order, due to rotation of the image:
If you compare the text output of this image in the demo, vs a copy of this image rotated such that the text baselines are straight, you can see that the layout outputs are different.
Any updates on this?
No. Ensuring the input is correctly oriented is currently a problem that users of the library have to solve.
OCR completely fails if the image is rotated by 90, 180 or 270 degrees. Tesseract has built-in orientation detection, so this could be used to resolve that.
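For callers who already know the rotation (for example from user input), correcting the image before OCR is straightforward. The following is a rough sketch, not part of the library: `RgbaImage` and `rotateClockwise` are hypothetical names, and the image shape simply mirrors the `width` / `height` / `data` fields of a browser `ImageData`.

```typescript
// Minimal RGBA image, row-major, 4 bytes per pixel. Hypothetical type that
// mirrors the fields of a browser ImageData object.
interface RgbaImage {
  width: number;
  height: number;
  data: Uint8ClampedArray;
}

// Rotate an image clockwise by a multiple of 90 degrees, returning a new image.
function rotateClockwise(
  image: RgbaImage,
  degrees: 0 | 90 | 180 | 270,
): RgbaImage {
  const { width, height, data } = image;
  // Width and height swap for quarter turns.
  const swap = degrees === 90 || degrees === 270;
  const outWidth = swap ? height : width;
  const outHeight = swap ? width : height;
  const out = new Uint8ClampedArray(data.length);

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // Map the source pixel (x, y) to its destination (nx, ny).
      let nx = x;
      let ny = y;
      if (degrees === 90) {
        nx = height - 1 - y;
        ny = x;
      } else if (degrees === 180) {
        nx = width - 1 - x;
        ny = height - 1 - y;
      } else if (degrees === 270) {
        nx = y;
        ny = width - 1 - x;
      }
      const src = (y * width + x) * 4;
      const dst = (ny * outWidth + nx) * 4;
      out.set(data.subarray(src, src + 4), dst);
    }
  }
  return { width: outWidth, height: outHeight, data: out };
}
```

An image that is known to be rotated 90 degrees clockwise would be corrected with `rotateClockwise(image, 270)` before being passed to OCR. This only handles the right-angle cases mentioned above; small skew angles need a different approach.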