danpla / dpscreenocr

Program to recognize text on screen
https://danpla.github.io/dpscreenocr/
zlib License

Negative (invert image) #10

Closed: DJuego closed this issue 4 years ago

DJuego commented 4 years ago

Congratulations on such a promising tool! Thanks for the effort!

It seems that, depending on the specific sample, dark text on a light background sometimes works better than light text on a dark background (or vice versa).

It would be useful to have an option (a checkbox or similar) to "invert" the image clip (make a negative) before sending it to Tesseract. Is that possible?

DJuego

danpla commented 4 years ago

I'd rather avoid adding new options to the interface, unless a feature is 100% useful.

Because of how Tesseract's algorithm works, small changes in an image may lead to dramatically different OCR results. It's so unpredictable that an option to invert the image would be nearly useless in practice. If the recognized text is 50% garbage, inverting the image is unlikely to make a big enough difference to justify toggling a checkbox and running OCR again, and you can't know in advance whether the result will improve or degrade.

In my experience, in most cases Tesseract is surprisingly good at guessing the best result, regardless of whether an image has black text on a white background or vice versa, and most of the errors are caused by unusual-looking fonts.

DJuego commented 4 years ago

Thank you for your detailed answer. I understand your arguments. And I have to admit that I find them quite reasonable. I'm satisfied!

DJuego

danpla commented 1 year ago

FYI. It turns out that Tesseract already does inversion under the hood if recognizing the original image doesn't give a good enough result.

There's even the invert_threshold option to control this behavior:

https://github.com/tesseract-ocr/tesseract/commit/96861b58aebd4cf5d0c3aa517ad98541db8a3f50
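
For anyone who still wants to experiment, here is a minimal sketch of the two ideas discussed in this thread: inverting a clip by hand, and passing `invert_threshold` to the `tesseract` command line through its `-c VAR=VALUE` option. The function names `invert_gray` and `tesseract_command`, the file names, and the threshold value are all hypothetical and only for illustration; this is not how dpscreenocr calls Tesseract. The exact semantics of the threshold (including which value, if any, turns the retry off) should be checked against the linked commit for your Tesseract version.

```python
import shlex


def invert_gray(pixels):
    """Return the negative of an 8-bit grayscale image.

    The image is assumed to be a list of rows, each row a list of
    integers in the range 0..255 (an illustrative representation).
    """
    return [[255 - p for p in row] for row in pixels]


def tesseract_command(image_path, output_base, invert_threshold):
    """Build (but do not run) a tesseract command line.

    The value is passed with "-c invert_threshold=...", relying on the
    option named in the comment above. No default is assumed here.
    """
    return [
        "tesseract",
        image_path,
        output_base,
        "-c",
        f"invert_threshold={invert_threshold}",
    ]


if __name__ == "__main__":
    # A 2x2 "image": dark pixels become light and vice versa.
    print(invert_gray([[0, 255], [10, 200]]))
    # Print the command instead of executing it.
    print(shlex.join(tesseract_command("clip.png", "out", 0.0)))
```

The command is only printed, so the sketch runs without Tesseract installed. To actually run OCR, the resulting list could be handed to `subprocess.run`.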