madmaze / pytesseract

A Python wrapper for Google Tesseract
Apache License 2.0
5.84k stars, 721 forks

Should We Implement PSM (page segmentation mode) Into Tesseract as an Enum Option #441

Closed (davidnaumann-bastian closed this issue 2 years ago)

davidnaumann-bastian commented 2 years ago

Instead of setting the page segmentation mode through the config string, should it be a parameter of the pytesseract functions, passed as some kind of Enum value?

Tesseract documentation for PSMs:

0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
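
To make the proposal concrete, here is a rough sketch of what such an Enum could look like. The class name `PSM`, its member names, and the helper `psm_config` are hypothetical and not part of pytesseract; only the numeric values come from the list above.

```python
from enum import IntEnum


class PSM(IntEnum):
    """Hypothetical enum of Tesseract page segmentation modes.

    Not part of pytesseract; the values mirror the PSM list above.
    """

    OSD_ONLY = 0
    AUTO_OSD = 1
    AUTO_ONLY = 2
    AUTO = 3  # Tesseract's default
    SINGLE_COLUMN = 4
    SINGLE_BLOCK_VERT_TEXT = 5
    SINGLE_BLOCK = 6
    SINGLE_LINE = 7
    SINGLE_WORD = 8
    CIRCLE_WORD = 9
    SINGLE_CHAR = 10
    SPARSE_TEXT = 11
    SPARSE_TEXT_OSD = 12
    RAW_LINE = 13


def psm_config(mode: PSM) -> str:
    """Build the config fragment that is passed via `config` today."""
    return f"--psm {int(mode)}"


print(psm_config(PSM.SINGLE_LINE))  # --psm 7
```

With pytesseract installed, the resulting string could be passed through the existing `config` argument, for example `pytesseract.image_to_string(image, config=psm_config(PSM.SINGLE_LINE))`.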

bozhodimitrov commented 2 years ago

I'm not really sure about that. The initial idea of pytesseract was to reuse the Tesseract config as is, which avoids the maintenance cost of updating any such mappings whenever the Tesseract codebase changes.

Yes, we could add such a feature, but in general it is better for users to create such mappings themselves. The Enum approach is pretty elegant, but it would change the behavior for a lot of current users (unless we go the deprecation route, which requires even more code to handle the old way).

Plus, the owner of this project seems to be busy with other things in life, so at the moment it is hard to even update the package.
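
For anyone who wants this on their own side in the meantime, a user-level helper along these lines might be enough. This is only a sketch: `with_psm` is a hypothetical name, not a pytesseract function, and the assumption is that the existing config string should be left untouched unless a mode is explicitly requested, so current callers see no change in behavior.

```python
import shlex
from typing import Optional


def with_psm(config: str = "", psm: Optional[int] = None) -> str:
    """Return `config` with a `--psm N` option appended (hypothetical helper).

    If `psm` is None the config is returned unchanged, so code that already
    puts `--psm` in its config string keeps working as before.
    """
    if psm is None:
        return config

    mode = int(psm)  # also accepts IntEnum members
    if not 0 <= mode <= 13:
        raise ValueError(f"PSM must be between 0 and 13, got {mode}")

    # Refuse to guess when the mode is given twice.
    # "-psm" is checked as well, assuming configs written for older Tesseract.
    tokens = shlex.split(config)
    if "--psm" in tokens or "-psm" in tokens:
        raise ValueError("PSM is already set in the config string")

    return f"{config} --psm {mode}".strip()


print(with_psm("--oem 1", 7))  # --oem 1 --psm 7
```

The result is an ordinary config string, so it can go straight into the existing `config` argument and the mapping stays in user code rather than in the package.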