zamazan4ik / PRLib

Pre-Recognition Library - library with algorithms for improving OCR quality.
MIT License
33 stars, 14 forks

Improve autocrop algorithm #12

Open zamazan4ik opened 6 years ago

zamazan4ik commented 6 years ago

Implement an autocrop algorithm based on Canny edge detection followed by Hough line detection. Also try using binarizeByLocalVariances instead of the Canny edge detector to produce the edge map.
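To make the idea concrete, here is a rough sketch of the Hough part only. It is not PRLib code and not a proposed final implementation. It assumes the edge map already exists as a 2D grid of 0/1 values (from Canny, binarizeByLocalVariances, or anything else), that the page is roughly deskewed, and that the page border yields the outermost strong lines. The names `hough_votes`, `crop_box`, `min_votes` and `tol` are made up for illustration.

```python
import math
from collections import Counter


def hough_votes(edge_mask, theta_steps=180):
    """Each edge pixel votes for every (rho, theta) line passing through it."""
    trig = [(math.cos(math.pi * t / theta_steps), math.sin(math.pi * t / theta_steps))
            for t in range(theta_steps)]
    votes = Counter()
    for y, row in enumerate(edge_mask):
        for x, is_edge in enumerate(row):
            if not is_edge:
                continue
            for t, (c, s) in enumerate(trig):
                # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
                votes[(round(x * c + y * s), t)] += 1
    return votes


def crop_box(edge_mask, min_votes, tol=1):
    """Return (left, top, right, bottom) from the outermost strong lines, or None.

    Hypothetical helper: only lines within `tol` degrees of vertical or
    horizontal are considered, so the input is assumed to be deskewed.
    """
    xs, ys = [], []
    # 180 theta bins means one bin per degree, so bin index == angle in degrees.
    for (rho, t), count in hough_votes(edge_mask, theta_steps=180).items():
        if count < min_votes:
            continue
        if t <= tol or t >= 180 - tol:
            xs.append(abs(rho))   # theta near 0/180 deg: vertical line at x ~ |rho|
        elif abs(t - 90) <= tol:
            ys.append(rho)        # theta near 90 deg: horizontal line at y ~ rho
    if len(set(xs)) < 2 or len(set(ys)) < 2:
        return None               # could not find two borders in each direction
    return min(xs), min(ys), max(xs), max(ys)
```

Because the edge map is just an input here, swapping Canny for binarizeByLocalVariances would not change this step. A real implementation would more likely use an optimized Hough transform and handle skewed pages by intersecting the detected lines instead of taking min/max positions.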

wrznr commented 4 years ago

Scantailor may also be of help here. In addition, you may want to have a look at: https://github.com/mjenckel/OCR-D-LAYoutERkennung/blob/master/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py. In my experience, their cropping works really well.