r1me / TTesseractOCR4

Object Pascal binding for tesseract-ocr - an optical character recognition engine
MIT License

Applying a threshold in which OCR'ing should or should not be attempted #18

Closed tedsmith closed 3 years ago

tedsmith commented 3 years ago

Hi

Is there a way to set a minimum confidence level, for example something like --min-conf 50?

I am using the library successfully now, and it works great for many "obvious" files. But some files seem to trick or fool .RecognizeAsText, which then raises an exception.

I'd like to be able to "analyse" the image file first, before attempting the actual OCR, and then run the OCR only if it seems reasonably likely that the image contains text. For example, attempt the OCR only if the confidence level is at least 50%.

Any thoughts?

tedsmith commented 3 years ago

Errors were caused by me, not the API.
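
For the thresholding idea raised in the question, here is a minimal sketch of the decision logic only. It is written in Python for brevity and does not call TTesseractOCR4 or any other OCR library. It assumes you already have per-word confidence values on a 0 to 100 scale, which Tesseract normally produces as part of recognition, so in practice this kind of gate would probably run after a recognition pass rather than before it. Whether and how the binding exposes those values is not confirmed in this thread. The function names, the 50.0 default, and the convention of treating negative values as "no word here" are all assumptions made for illustration.

```python
from typing import Iterable, Optional


def mean_confidence(word_confs: Iterable[float]) -> Optional[float]:
    """Return the average of the non-negative confidences, or None if there are none.

    Assumption: negative values mark entries that contain no recognised word
    and should not count towards the average.
    """
    valid = [c for c in word_confs if c >= 0]
    if not valid:
        return None
    return sum(valid) / len(valid)


def accept_ocr_result(word_confs: Iterable[float], min_conf: float = 50.0) -> bool:
    """Decide whether recognised text is trustworthy enough to keep.

    min_conf is a hypothetical threshold on a 0-100 scale (50 mirrors the
    "50% confidence" example in the question).
    """
    mean = mean_confidence(word_confs)
    # No usable confidences means no evidence of text, so reject.
    return mean is not None and mean >= min_conf


if __name__ == "__main__":
    # Made-up confidence values, purely for demonstration.
    print(accept_ocr_result([91, 84, 77]))  # high confidences
    print(accept_ocr_result([12, 30, -1]))  # low confidences
    print(accept_ocr_result([]))            # nothing recognised
```

Averaging is only one possible policy. Requiring a minimum number of words above the threshold, or using the median, may behave better on images that contain a few confident words among a lot of noise.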