openpaperwork / pyocr

A Python wrapper for Tesseract and Cuneiform -- Moved to Gnome's Gitlab
https://gitlab.gnome.org/World/OpenPaperwork/pyocr

AttributeError: 'module' object has no attribute 'get_available_tools' #14

Closed: asennoussi closed this issue 10 years ago

asennoussi commented 10 years ago

Hello guys! I've created a test file in a separate folder. My code:

from PIL import Image
import sys
import pyocr
import pyocr.builders

tools = pyocr.get_available_tools()
if len(tools) == 0:
    print("No OCR tool found")
    sys.exit(1)
tool = tools[0]
print("Will use tool '%s'" % (tool.get_name()))
# Ex: Will use tool 'tesseract'

langs = tool.get_available_languages()
print("Available languages: %s" % ", ".join(langs))
lang = langs[0]
print("Will use lang '%s'" % (lang))
# Ex: Will use lang 'fra'

txt = tool.image_to_string(Image.open('http://www.domain.com/fr/i/3518721/phone'),
                           lang=lang,
                           builder=pyocr.builders.TextBuilder())
word_boxes = tool.image_to_string(Image.open('http://www.domain.com/fr/i/3518721/phone'),
                                  lang=lang,
                                  builder=pyocr.builders.WordBoxBuilder())
line_and_word_boxes = tool.image_to_string(
        Image.open('test.png'), lang=lang,
        builder=pyocr.builders.LineBoxBuilder())

And I get this error message:

Traceback (most recent call last):
  File "./test.py", line 6, in <module>
    tools = pyocr.get_available_tools()
AttributeError: 'module' object has no attribute 'get_available_tools'

Any idea?
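A side note on the script above, unrelated to the AttributeError: Pillow's Image.open() takes a file path or a file object and does not download URLs itself, so the two calls that pass an http:// address would fail once the import problem is solved. A minimal sketch of one way around that, assuming Python 3 and the standard library only (the helper name fetch_image_bytes is made up for this example):

```python
import io
import urllib.request


def fetch_image_bytes(url, timeout=10):
    """Download `url` and return its content as an in-memory binary file.

    Hypothetical helper for illustration; it is not part of Pyocr or Pillow.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return io.BytesIO(response.read())
```

The returned buffer can then be handed to Pillow, for example Image.open(fetch_image_bytes(url)), because Image.open() accepts file objects as well as paths.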

jflesch commented 10 years ago

1) Do you use Python 3 or Python 2?
2) Is your installation of Pyocr up to date? (print(pyocr.VERSION) should display "(0, 2, 2)".)

asennoussi commented 10 years ago

I'm using Python 2.7.3, and I have just installed pyocr 0.2.2.

jflesch commented 10 years ago

Hm, weird. Works for me.

1) How did you install Pyocr exactly?
2) Which OS / GNU/Linux distribution do you use?

asennoussi commented 10 years ago

When I run print(pyocr.VERSION), it says: AttributeError: 'module' object has no attribute 'VERSION'. I'm using Debian.

asennoussi commented 10 years ago

I will try to reinstall it and get back to you.

jflesch commented 10 years ago

You can also try:

from pyocr import pyocr

But you shouldn't have to import it this way.
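For anyone hitting the same error: when both get_available_tools and VERSION are missing, Python has often imported something other than the expected Pyocr package, for example a stale install or a local file or folder that happens to be named pyocr. A minimal sketch for checking this, assuming Python 3 (the helper name inspect_module is made up for this example):

```python
import importlib


def inspect_module(name, expected_attrs):
    """Report where `name` was imported from and which expected attributes it lacks.

    Hypothetical diagnostic helper; it is not part of Pyocr.
    """
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),
        "missing": [attr for attr in expected_attrs if not hasattr(module, attr)],
    }


if __name__ == "__main__":
    try:
        report = inspect_module("pyocr", ["get_available_tools", "VERSION"])
    except ImportError:
        print("pyocr is not importable from this interpreter")
    else:
        print(report)
```

If the reported file path points somewhere unexpected (the test folder, or an old site-packages directory), removing or renaming that copy and reinstalling is a reasonable next step.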

asennoussi commented 10 years ago

Reinstalling made it work correctly! Recognition of numbers is really poor, though: it reads a 6 as a 5. Can I improve the results?

asennoussi commented 10 years ago

My example: [attached image "test"]

jflesch commented 10 years ago

Recognition quality depends on a lot of things, but not on Pyocr itself, so I won't be able to help you much with that. Maybe you can try training Tesseract on the kind of text you're trying to make it read. I'm just not sure how you can easily integrate your training with Pyocr.

Good luck.
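A different and cheaper thing to try before training, when the input contains only numbers: Tesseract ships a stock config file named digits that restricts recognition to digits and a few punctuation characters. This is not the training approach suggested above, and it is not guaranteed to fix a 6/5 confusion. The sketch below only builds the command line for calling Tesseract directly, outside Pyocr; the helper name and the file names are made up for this example:

```python
def build_tesseract_digits_command(image_path, output_base, lang="eng"):
    """Return the argv list for a Tesseract run using the stock 'digits' config.

    Hypothetical helper for illustration. Tesseract writes the recognized
    text to `output_base` + ".txt".
    """
    return ["tesseract", image_path, output_base, "-l", lang, "digits"]
```

It could be run with subprocess.run(build_tesseract_digits_command("phone.png", "phone"), check=True), which would write the result to phone.txt, assuming tesseract is on the PATH and phone.png exists.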