arturaugusto / display_ocr

Real-time image preprocess and OCR.
GNU General Public License v2.0

Update to python 3? #4

Open PiOliver opened 8 years ago

PiOliver commented 8 years ago

Hey, is it possible for this to be updated to work in Python 3? I've had a play with getting it to work, but it seems that the wrapper for Tesseract's API is broken.

Thanks.

arturaugusto commented 8 years ago

Would be nice. Some time ago I did some tests but didn't have time to go further and get something that works. I had some problems with newer versions of OpenCV too. I will take a look when I have some time.

sphinxligustri commented 8 years ago

Hi, here are my thoughts on the matter.

Pyocr provides image_to_string, a method for extracting text from an image. In my case, I had to add a couple of lines to the trainer script (tesseract-trainer.py) in order to get access to the trained model:

    command7 = 'cp -f '+fontname+'.traineddata /usr/local/share/tessdata/'
    os.system(command7)
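
The same copy step can also be done without shelling out to `cp`. The following is only a sketch: the helper name `install_traineddata` and its parameters are hypothetical, and the default directory is simply the path quoted above, which may differ on other systems and usually needs write permission.

```python
import os
import shutil

# Assumption: this default is the path quoted in the comment above;
# your Tesseract install may keep its tessdata directory elsewhere.
DEFAULT_TESSDATA_DIR = "/usr/local/share/tessdata"


def install_traineddata(fontname, src_dir=".", tessdata_dir=DEFAULT_TESSDATA_DIR):
    """Copy <fontname>.traineddata into tessdata_dir, overwriting any old copy.

    Hypothetical helper, not part of tesseract-trainer.py.
    Returns the destination path.
    """
    filename = fontname + ".traineddata"
    src = os.path.join(src_dir, filename)
    if not os.path.isfile(src):
        raise FileNotFoundError("trained model not found: " + src)
    dest = os.path.join(tessdata_dir, filename)
    # copyfile overwrites an existing destination, like `cp -f`.
    shutil.copyfile(src, dest)
    return dest
```

Unlike `os.system`, this raises an exception when the trained model is missing or the destination is not writable, so a failed copy does not go unnoticed.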

After that, my implementation is something like the following:

    from pyocr import pyocr
    from pyocr import builders
    from PIL import Image
    import numpy as np
    ...
    tool = pyocr.get_available_tools()[0]
    ### help(tool.image_to_string)
    ### image_to_string(image, lang=None, builder=None)
    ...
    ### tool.get_available_languages()
    ### ['eng', 'letsgodigital']
    lang = 'letsgodigital'  # Hardcoding language because I'm being lazy
    ...
    ### roi: subsection of the original image.
    txt = tool.image_to_string(Image.fromarray(roi), lang=lang, builder=builders.TextBuilder())
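
For anyone who would rather not hardcode the language, here is a minimal sketch of a fallback. The helper `pick_language` is hypothetical, and it assumes `tool.get_available_languages()` returns a list of strings as in the commented output above.

```python
def pick_language(available, preferred="letsgodigital", fallback="eng"):
    """Return `preferred` if it is installed, otherwise `fallback`.

    Hypothetical helper: `available` is expected to be the list returned
    by tool.get_available_languages(), e.g. ['eng', 'letsgodigital'].
    """
    if preferred in available:
        return preferred
    if fallback in available:
        return fallback
    raise ValueError(
        "neither {!r} nor {!r} found in {!r}".format(preferred, fallback, list(available))
    )
```

It would be used as `lang = pick_language(tool.get_available_languages())` before the call to `image_to_string`. If the trained model was never copied into tessdata, this falls back to 'eng' rather than passing a language Tesseract cannot find.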

My system: Ubuntu 14.04, Python 3.4.3

    $ pip3.4 freeze | grep -E "ocr|tess"
    pyocr==0.4.0
    pytesseract==0.1.6
    tess==0.2
    tesseract==0.1.3
    tesseract-ocr==0.0.1

Edit: Added PIL and numpy references.

regards Morten S

arturaugusto commented 8 years ago

@sphinxligustri, good to know there is a simple solution. Could you share the entire code?

sphinxligustri commented 8 years ago

Here you go, code.py.txt

It's the bare minimum to get some results from my multimeter.

Some notes: I found the resize option handy for large images, but didn't attach it to a slider.
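
One way to pick a resize target for large images is to cap the longer side and keep the aspect ratio. This is a sketch under assumptions, not what the attached script does: the helper `scaled_size` and the 800-pixel default are made up for illustration.

```python
def scaled_size(width, height, max_side=800):
    """Return (new_width, new_height) with the longer side capped at max_side.

    Hypothetical helper; the aspect ratio is preserved and images that
    already fit are returned unchanged.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Never let a dimension collapse to zero on very thin images.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

With OpenCV, the resulting tuple could be passed as the `dsize` argument of `cv2.resize(img, (new_width, new_height))`, which expects width first and height second.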

The image path is hard coded in at the moment. p_img = "./images/myImg.jpg"
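
The hard-coded path could be replaced with a command-line argument. A minimal sketch, assuming the script only needs one image path; the function name `parse_args` and the argument name `image` are hypothetical, and the default is the path quoted above.

```python
import argparse


def parse_args(argv=None):
    """Parse the image path from the command line (hypothetical interface)."""
    parser = argparse.ArgumentParser(description="Run OCR on a single image.")
    parser.add_argument(
        "image",
        nargs="?",  # optional, so the old hard-coded path still works as a default
        default="./images/myImg.jpg",
        help="path to the image to read (default: %(default)s)",
    )
    return parser.parse_args(argv)
```

The script would then use `p_img = parse_args().image`, so running it without arguments behaves as before.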

btw: good job with the training set :+1:

arturaugusto commented 8 years ago

Thank you @sphinxligustri, I will test it and put it in the repository as an alternative Python 3 compatible script, so others can benefit from it.

byrann commented 8 years ago

Hey, great job there! I really like what you guys are doing. :+1: I am using Python 3.4.3 under Ubuntu too, and the new display script is still not working: the rectangle doesn't show up.
Any ideas? Thanks! @sphinxligustri @arturaugusto

Redwood38 commented 5 years ago

Late to the party, but I managed to get it to work on a Raspberry Pi (not the Zero, which is ARMv6 and doesn't support the "opencv-python" package). I annotated my changes to the code above from @sphinx.

code.py.txt

Cheers