openpaperwork / pyocr

A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab
https://gitlab.gnome.org/World/OpenPaperwork/pyocr

Could we get a confidence value for each word? #74

Closed by gbc8181 6 years ago

gbc8181 commented 6 years ago

Hi everyone,

I just noticed that the Tesseract API can return a confidence value for each word. Can pyocr also give us a per-word confidence value? If so, how?

Thanks so much!

jflesch commented 6 years ago

Are we talking about the command-line tool tesseract or the library libtesseract?

gbc8181 commented 6 years ago

I mean the Tesseract API. The APIExample page (https://github.com/tesseract-ocr/tesseract/wiki/APIExample) shows that it has this function.

jflesch commented 6 years ago

Right now it can't. pyocr.libtesseract could be updated to add support for that, I guess.

jflesch commented 6 years ago

Implemented by @a-pagano : https://github.com/openpaperwork/pyocr/pull/86 :-)
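
A minimal sketch of how the per-word confidence might be used, assuming that after the PR above each box returned through pyocr.builders.WordBoxBuilder carries a `confidence` attribute next to `content`. The `WordBox` class, `low_confidence_words`, `ocr_word_boxes` and the threshold of 60 are hypothetical names and values chosen for illustration, not part of pyocr. The pyocr call is kept inside a function so the filtering logic can be tried without Tesseract installed.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """Hypothetical stand-in for a pyocr word box (text plus confidence)."""
    content: str
    confidence: float


def low_confidence_words(boxes, threshold=60):
    """Return (word, confidence) pairs whose confidence is below threshold.

    Works on any objects exposing `content` and `confidence` attributes.
    """
    return [(b.content, b.confidence) for b in boxes if b.confidence < threshold]


def ocr_word_boxes(image_path, lang="eng"):
    """Run OCR through pyocr.libtesseract and return word boxes.

    Assumption: pyocr, Pillow and libtesseract are installed, and the
    returned boxes expose a `confidence` attribute.
    """
    from PIL import Image
    import pyocr.builders
    import pyocr.libtesseract

    return pyocr.libtesseract.image_to_string(
        Image.open(image_path),
        lang=lang,
        builder=pyocr.builders.WordBoxBuilder(),
    )


if __name__ == "__main__":
    # Stand-in data; replace with ocr_word_boxes("scan.png") on a real image.
    sample = [WordBox("Hello", 96), WordBox("w0rld", 41)]
    print(low_confidence_words(sample))
```

With a real scan, passing the result of `ocr_word_boxes(...)` to `low_confidence_words(...)` should list the words worth re-checking by hand, provided the boxes do expose `confidence` as assumed.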