axa-group / Parsr

Transforms PDF, Documents and Images into Enriched Structured Data
Apache License 2.0
5.85k stars 310 forks

Image rotation detection #144

Closed poveden closed 4 years ago

poveden commented 5 years ago

Is your feature request related to a problem? Please describe.

Tesseract's OCR quality degrades on rotated or skewed images. Even on mildly rotated images, where Tesseract still recognizes the individual words correctly, the word order can come out wrong, especially for counter-clockwise rotations.

Describe the solution you'd like

Add rotation correction before feeding the image to Tesseract.
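
To make the request concrete, here is a minimal sketch of one common way to detect the rotation angle, a projection-profile search. It is not how Parsr implements it; the function names, the angle range, and the step size are all assumptions chosen for illustration, and it works on a plain list of dark-pixel coordinates rather than on a real image.

```python
import math
from collections import Counter


def estimate_skew_degrees(dark_pixels, max_angle=10.0, step=0.5):
    """Estimate the tilt of text lines from (x, y) coordinates of dark pixels.

    Tries each candidate angle, projects the pixels onto rows as if the page
    were rotated back by that angle, and keeps the angle whose row histogram
    is most concentrated (largest sum of squared counts).
    """
    points = list(dark_pixels)
    best_angle, best_score = 0.0, -1
    for i in range(int(round(2 * max_angle / step)) + 1):
        angle = -max_angle + i * step
        theta = math.radians(angle)
        sin_t, cos_t = math.sin(theta), math.cos(theta)
        # Row index each pixel would land on after undoing a tilt of `angle`.
        rows = Counter(round(y * cos_t - x * sin_t) for x, y in points)
        score = sum(count * count for count in rows.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def synthetic_text_lines(angle_degrees, line_count=3, length=200, spacing=20):
    """Hypothetical test data: thin straight 'text lines' tilted by an angle."""
    theta = math.radians(angle_degrees)
    return [
        (round(s * math.cos(theta)), round(s * math.sin(theta)) + k * spacing)
        for k in range(line_count)
        for s in range(length)
    ]


skew = estimate_skew_degrees(synthetic_text_lines(3.0))
print(f"estimated skew: {skew} degrees")
```

In a real pipeline the dark-pixel list would come from a binarized scan, and the image would then be rotated by the opposite of the estimated angle with an image library before being passed to Tesseract. The sign convention depends on the library's coordinate system, so it is worth checking on a known sample.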

Describe alternatives you've considered

The above link actually does a pretty good job at offering options. Even dewarping options are explored there.

Additionally, I've found an article about Flattening curved documents in images, but I don't know if the solution is already covered above.

royjohal commented 4 years ago

Solved in #222