sirfz / tesserocr

A Python wrapper for the tesseract-ocr API
MIT License

Failure to build with leptonica 1.83 #314

Open risicle opened 1 year ago

risicle commented 1 year ago

Leptonica 1.83 moved a number of struct definitions into "private" headers, notably Pix and Box et al.

This results in a build failure:

tesserocr.cpp: In function ‘PyObject* __pyx_f_9tesserocr__pix_to_image(Pix*)’:
tesserocr.cpp:6685:26: error: invalid use of incomplete type ‘struct Pix’
 6685 |   __pyx_t_1 = __pyx_v_pix->informat;
      |                          ^~

Fixing this requires including the header leptonica/pix_internal.h. This can be done in tesseract.pxd with a supplemental cdef extern from statement, as detailed in https://cython.readthedocs.io/en/latest/src/userguide/external_C_code.html#referencing-c-header-files:

cdef extern from "leptonica/allheaders.h" nogil:
    pass

cdef extern from "leptonica/pix_internal.h" nogil:
    struct Pix:
        # existing member declarations continue here unchanged
I haven't opened a PR with this change because it would presumably need to be applied conditionally, based on a Leptonica version test, but the fix above does seem to work.
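
As a rough illustration of such a version test, here is a minimal Python sketch of the check a build script might run. The function names and the `PRIVATE_HEADERS_SINCE` constant are hypothetical and not part of tesserocr. The version string is assumed to come from something like `pkg-config --modversion lept` or Leptonica's own version string, and the 1.83 threshold is taken from the report above.

```python
import re

# First Leptonica release that moved struct definitions such as Pix and Box
# into pix_internal.h, according to the report above.
PRIVATE_HEADERS_SINCE = (1, 83)


def parse_leptonica_version(text):
    """Extract (major, minor) from strings like '1.83.0' or 'leptonica-1.83.1'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % (text,))
    return int(match.group(1)), int(match.group(2))


def needs_pix_internal(text):
    """Return True when the build should also include leptonica/pix_internal.h."""
    # Tuple comparison orders versions by major first, then minor.
    return parse_leptonica_version(text) >= PRIVATE_HEADERS_SINCE
```

Comparing `(major, minor)` tuples avoids the usual string-comparison pitfall, where `"1.9"` would sort after `"1.83"`.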

sirfz commented 1 year ago

There is already a version check on the tesseract version: it is passed as a compile-time environment variable, and Cython macro checks on it yield different C++ code. You should be able to apply the same concept to leptonica version-specific changes.
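
To make the idea concrete, the sketch below packs a Leptonica version into a single integer that could be handed to Cython as a compile-time value (for instance through the `compile_time_env` argument of `cythonize`). It is an assumption-laden illustration, not the project's actual setup code: the helper names and the `LEPTONICA_VERSION` key are hypothetical, and the packing scheme is just one simple choice.

```python
def pack_version(version):
    """Pack 'major.minor.patch' into one integer, one byte per component.

    Assumes each component is below 256, so '1.83.0' becomes 0x015300.
    """
    parts = [int(p) for p in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # tolerate short strings such as '1.83'
    major, minor, patch = parts
    return (major << 16) | (minor << 8) | patch


def leptonica_compile_time_env(version):
    """Build a dict of compile-time values (hypothetical key name)."""
    return {"LEPTONICA_VERSION": pack_version(version)}
```

With a value like this available at compile time, the .pxd could in principle compare it against `0x015300` to decide which header to declare the structs from, mirroring the existing tesseract version checks.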