Open MathieuCliche opened 7 years ago
You mean one confidence score for the OCR on the whole image? I'm not even sure whether Tesseract provides such a score.
Yeah, for the whole image, or per word. From what I read, it's possible to get it from the hOCR or TSV output. You can check it out here: https://github.com/tesseract-ocr/tesseract/wiki/Command-Line-Usage#tsv-output-currently-available-in-305-dev-in-master-branch-on-github
For example, the TSV output has a column "conf", which gives the confidence for each word.
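To illustrate reading that column, here is a minimal sketch. It assumes the TSV text is already available as a string. The sample rows are made up for illustration and are not real OCR output, and `word_confidences` is a hypothetical helper name, not part of Tesseract or PyOCR.

```python
import csv
import io

# Made-up sample in the Tesseract TSV layout (not real OCR output).
# Rows that do not describe a word (page, block, line...) carry conf = -1.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t30\t96\tHello\n"
    "5\t1\t1\t1\t1\t2\t109\t92\t80\t30\t87\tworld\n"
)


def word_confidences(tsv_text):
    """Return (text, conf) pairs for the rows that hold a recognized word."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # OCR text may contain stray quote characters
    )
    words = []
    for row in reader:
        conf = float(row["conf"])
        text = (row["text"] or "").strip()
        # Skip structural rows (conf = -1) and empty text cells.
        if conf < 0 or not text:
            continue
        words.append((text, conf))
    return words


if __name__ == "__main__":
    for text, conf in word_confidences(SAMPLE_TSV):
        print(text, conf)
```

Filtering on `conf < 0` is what separates word rows from the page, block, paragraph, and line rows in the same file.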
Ok, good to know.
For the words, I guess it can be added as an attribute to pyocr.builders.Box objects.
Regarding the whole page, with the current API, it's going to be a little more complicated...
For per-word confidence, you can thank @a-pagano: https://github.com/openpaperwork/pyocr/pull/86 :-)
Sorry, I meant to keep this ticket open regarding the confidence score for the whole page.
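Until a whole-page score exists, one possible workaround is to aggregate the per-word confidences yourself. The sketch below is only a heuristic, not a score computed by Tesseract or PyOCR. `WordBox` is a hypothetical stand-in for a per-word result, and the attribute names `content` and `confidence` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """Hypothetical stand-in for a per-word OCR result (names are assumed)."""
    content: str
    confidence: float


def page_confidence(boxes):
    """Character-weighted mean of per-word confidences, or None if no text.

    Weighting by word length keeps many short, noisy fragments from
    dominating the page-level figure.
    """
    total_weight = 0
    weighted_sum = 0.0
    for box in boxes:
        weight = len(box.content.strip())
        if weight == 0:
            continue  # ignore empty or whitespace-only words
        weighted_sum += box.confidence * weight
        total_weight += weight
    if total_weight == 0:
        return None
    return weighted_sum / total_weight


if __name__ == "__main__":
    sample = [WordBox("Hello", 90), WordBox("ok", 20)]
    print(page_confidence(sample))
```

A plain unweighted mean would work too; which one is more useful depends on how the score is meant to be used, for example to decide whether a page needs to be rescanned.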
@a-pagano's changes have been released in PyOCR 0.5.
Is it possible to get a confidence score for the predictions (not the orientation)?