manisandro / gImageReader

A Gtk/Qt front-end to tesseract-ocr.
GNU General Public License v3.0

Feature request: localized inverted colors. #660

Open ebaldino opened 7 months ago

ebaldino commented 7 months ago

On practically every page I need to OCR, I must place the selection boxes manually. White text over a low-contrast background can usually be recognized only if the page's colors are inverted (and then it works fine). But that means I must manually place all selection boxes over that kind of content, invert the colors, recognize, then invert the colors back and place and recognize the rest of the content. Then I have to manually cut and paste to put the text in the right order.

This problem could be fixed in one of two ways:

a. When a selection box is recognized and the result is gibberish, gImageReader could automatically invert the colors in that individual box and try again.
b. If (a) is not possible, allow the user to indicate that an individual selection box should be inverted for recognition.
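To make the two options concrete, here is a rough sketch of the per-box logic I have in mind. It is not gImageReader code (gImageReader is C++, and this is plain Python for brevity), and every name in it is hypothetical: `ocr` stands in for whatever call hands a cropped region to Tesseract, the image is simplified to rows of grayscale values, and `looks_like_gibberish` is a crude assumed heuristic that a real implementation would probably replace with the engine's own confidence score.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale rows, each pixel 0..255 (simplification)
Box = Tuple[int, int, int, int]  # x, y, width, height
Ocr = Callable[[Image], str]     # hypothetical stand-in for the Tesseract call


def crop(image: Image, box: Box) -> Image:
    """Cut the selection box out of the page."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def invert(region: Image) -> Image:
    """Invert the colors of one region only, leaving the page untouched."""
    return [[255 - p for p in row] for row in region]


def looks_like_gibberish(text: str, min_ratio: float = 0.6) -> bool:
    """Assumed heuristic: too few letters, digits, or spaces means gibberish."""
    stripped = text.strip()
    if not stripped:
        return True
    ok = sum(ch.isalnum() or ch.isspace() for ch in stripped)
    return ok / len(stripped) < min_ratio


def recognize_box(image: Image, box: Box, ocr: Ocr,
                  force_invert: bool = False) -> str:
    """Option (b): honor a per-box invert flag.
    Option (a): otherwise retry inverted when the first result is gibberish."""
    region = crop(image, box)
    if force_invert:
        return ocr(invert(region))
    text = ocr(region)
    if looks_like_gibberish(text):
        retry = ocr(invert(region))
        if not looks_like_gibberish(retry):
            return retry
    return text


def recognize_page(image: Image, boxes: List[Tuple[Box, bool]],
                   ocr: Ocr) -> List[str]:
    """Recognize all boxes in the order given, so no cut and paste is needed."""
    return [recognize_box(image, box, ocr, flag) for box, flag in boxes]


def fake_ocr(region: Image) -> str:
    """Toy recognizer for the demo: it only 'reads' bright regions."""
    pixels = [p for row in region for p in row]
    if not pixels:
        return ""
    mean = sum(pixels) / len(pixels)
    return "Hello world" if mean >= 128 else "@#%$ ^&*!"


if __name__ == "__main__":
    # Top half is dark (fails until inverted), bottom half is bright.
    page = [[20] * 4, [20] * 4, [230] * 4, [230] * 4]
    boxes = [((0, 0, 4, 2), False), ((0, 2, 4, 2), False)]
    print(recognize_page(page, boxes, fake_ocr))
```

The point of `recognize_page` is that inverted and non-inverted boxes are handled in a single pass, so the output keeps the order in which the boxes were placed.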

Cheers, and thanks for an awesome product!