sirfz / tesserocr

A Python wrapper for the tesseract-ocr API
MIT License
1.99k stars 254 forks

RuntimeError: Error reading image #51

Closed thiagofmam closed 7 years ago

thiagofmam commented 7 years ago

I checked out this project and ran test_api.py.

All tests named test_image* fail:

tesserocr/tests/test_api.py", line 70, in test_image_file
    self._api.SetImageFile(self._image_file)
  File "tesserocr.pyx", line 1545, in tesserocr.PyTessBaseAPI.SetImageFile (tesserocr.cpp:13568)
    raise RuntimeError('Error reading image')
RuntimeError: Error reading image
sirfz commented 7 years ago

Which versions of tesserocr and tesseract are you using? How did you run the tests? Please show the complete output.

You can run the complete test suite with python setup.py test.

thiagofmam commented 7 years ago

tesseract 3.05.00 and tesseract 4.00.00alpha

thiagofmam commented 7 years ago

Running python3.6 setup.py test, I got:

In file included from tesserocr.cpp:449:
In file included from /usr/local/include/tesseract/genericvector.h:27:
In file included from /usr/local/include/tesseract/tesscallback.h:22:
/usr/local/include/tesseract/host.h:30:10: fatal error: 'cinttypes' file not found
#include <cinttypes>    // PRId32, ...
         ^
1 error generated.
sirfz commented 7 years ago

I can't really help you with partial logs, but the first hint is that tesserocr is failing to compile. Might it be your gcc version?

jbdeboer commented 7 years ago

I just worked through this same issue. The error is due to leptonica's pixReadMem function returning null. In my case, it was because libpng was not installed correctly.

You can verify your setup using the tesseract_version() function. With my setup fixed, it now returns:

Tesseract version: tesseract 4.00.00alpha
 leptonica-1.74.1
  libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.51 : libtiff 4.0.3 : zlib 1.2.8 : libwebp 0.4.3

Previously, it did not list libpng.
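The check described above can be automated. The following is a minimal sketch, assuming the version string has the same shape as the output shown in this comment ("name version" pairs separated by " : "). The helper names image_libraries and missing_libraries, and the default list of required libraries, are hypothetical and not part of tesserocr.

```python
import re

# Matches entries such as "libpng 1.2.51", "zlib 1.2.8" or
# "libjpeg-turbo 1.3.0" inside the version string.
_LIB_PATTERN = re.compile(r"\b(lib[\w-]+|zlib)\s+([\w.]+)")


def image_libraries(version_text):
    """Return {library name: version} found in a tesseract_version() string."""
    return dict(_LIB_PATTERN.findall(version_text))


def missing_libraries(version_text, required=("libpng", "libjpeg", "libtiff")):
    """Return the required libraries that the version string does not list.

    The default `required` tuple is an assumption; adjust it to the image
    formats you actually need to read.
    """
    found = image_libraries(version_text)
    return [name for name in required if name not in found]


# Sample text copied from the output quoted in this thread.
SAMPLE = (
    "tesseract 4.00.00alpha\n"
    " leptonica-1.74.1\n"
    "  libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.51 : libtiff 4.0.3"
    " : zlib 1.2.8 : libwebp 0.4.3"
)

# With tesserocr installed, pass tesserocr.tesseract_version() instead.
print(missing_libraries(SAMPLE))
```

A non-empty result (for example, one that contains libpng) would point to the same kind of broken build described here, where leptonica cannot decode that format.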

sirfz commented 7 years ago

Good catch @jbdeboer, I completely missed it in the initial error message, should've paid more attention. Thanks.

sankha90 commented 4 years ago

I am trying to set an image using tesserocr with this code:

with PyTessBaseAPI() as api:
    for file in filename:
        api.Init(lang = 'eng')
        api.SetImageFile(file)

        print (api.AllWordConfidences())

        arr = list(api.AllWordConfidences())
        sumarr = sum(arr) / float(len(arr))

The error I am getting :

Traceback (most recent call last):
  File "", line 4, in
    api.SetImageFile(file)
  File "tesserocr.pyx", line 1597, in tesserocr._tesserocr.PyTessBaseAPI.SetImageFile
RuntimeError: Error reading image

And my version:

print(tesserocr.tesseract_version())
print(tesserocr.get_languages())

tesseract 4.0.0
 leptonica-1.76.0 (Jan 8 2019, 13:34:23) [MSC v.1900 LIB Release x64]
  libgif 5.1.4 : libjpeg 9b : libpng 1.6.35 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
('C:\Users\subhr\anaconda3\envs\ocr\/tessdata/', ['afr', 'amh', 'ara', 'asm', 'aze', 'aze_cyrl', 'bel', 'ben', 'bod', 'bos', 'bre', 'bul', 'cat', 'ceb', 'ces', 'chi_sim', 'chi_sim_vert', 'chi_tra', 'chi_tra_vert', 'chr', 'cos', 'cym', 'dan', 'deu', 'div', 'dzo', 'ell', 'eng', 'enm', 'epo', 'equ', 'est', 'eus', 'fao', 'fas', 'fil', 'fin', 'fra', 'frk', 'frm', 'fry', 'gla', 'gle', 'glg', 'grc', 'guj', 'hat', 'heb', 'hin', 'hrv', 'hun', 'hye', 'iku', 'ind', 'isl', 'ita', 'ita_old', 'jav', 'jpn', 'jpn_vert', 'kan', 'kat', 'kat_old', 'kaz', 'khm', 'kir', 'kmr', 'kor', 'lao', 'lat', 'lav', 'lit', 'ltz', 'mal', 'mar', 'mkd', 'mlt', 'mon', 'mri', 'msa', 'mya', 'nep', 'nld', 'nor', 'oci', 'ori', 'osd', 'pan', 'pol', 'por', 'pus', 'que', 'ron', 'rus', 'san', 'script/Arabic', 'script/Armenian', 'script/Bengali', 'script/Canadian_Aboriginal', 'script/Cherokee', 'script/Cyrillic', 'script/Devanagari', 'script/Ethiopic', 'script/Fraktur', 'script/Georgian', 'script/Greek', 'script/Gujarati', 'script/Gurmukhi', 'script/HanS', 'script/HanS_vert', 'script/HanT', 'script/HanT_vert', 'script/Hangul', 'script/Hangul_vert', 'script/Hebrew', 'script/Japanese', 'script/Japanese_vert', 'script/Kannada', 'script/Khmer', 'script/Lao', 'script/Latin', 'script/Malayalam', 'script/Myanmar', 'script/Oriya', 'script/Sinhala', 'script/Syriac', 'script/Tamil', 'script/Telugu', 'script/Thaana', 'script/Thai', 'script/Tibetan', 'script/Vietnamese', 'sin', 'slk', 'slv', 'snd', 'spa', 'spa_old', 'sqi', 'srp', 'srp_latn', 'sun', 'swa', 'swe', 'syr', 'tam', 'tat', 'tel', 'tgk', 'tha', 'tir', 'ton', 'tur', 'uig', 'ukr', 'urd', 'uzb', 'uzb_cyrl', 'vie', 'yid', 'yor'])

kba commented 4 years ago

@sankha90 please don't necropost. #234