sirfz / tesserocr

A Python wrapper for the tesseract-ocr API
MIT License

Python3.11 support #298

Closed by SmartManoj 1 year ago

SmartManoj commented 2 years ago
Defaulting to user installation because normal site-packages is not writeable
Collecting tesserocr
  Using cached tesserocr-2.5.2.tar.gz (57 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: tesserocr
  Building wheel for tesserocr (setup.py): started
  Building wheel for tesserocr (setup.py): finished with status 'error'
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [126 lines of output]
      Supporting tesseract v5.1.0-17-g6814
      Tesseract major version 5
      Configs from pkg-config: {'library_dirs': [], 'include_dirs': ['/usr/include', '/usr/include'], 'libraries': ['tesseract', 'archive', 'curl', 'lept'], 'compile_time_env': {'TESSERACT_MAJOR_VERSION': 5, 'TESSERACT_VERSION': 83951616}}
      running bdist_wheel
      running build
      running build_ext
      Detected compiler: unix
      building 'tesserocr' extension
      creating build
      creating build/temp.linux-x86_64-3.11
      x86_64-linux-gnu-gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include -I/usr/include -I/usr/include/python3.11 -I/usr/local/include/python3.11 -c tesserocr.cpp -o build/temp.linux-x86_64-3.11/tesserocr.o -std=c++11 -DUSE_STD_NAMESPACE
      tesserocr.cpp: In function ‘PyObject* __pyx_pf_9tesserocr_13PyTessBaseAPI_34GetAvailableLanguages(__pyx_obj_9tesserocr_PyTessBaseAPI*)’:
      tesserocr.cpp:17031:35: warning: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare]
      17031 |     for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {
            |                         ~~~~~~~~~~^~~~~~~~~~~
      tesserocr.cpp: In function ‘PyObject* __pyx_pf_9tesserocr_12get_languages(PyObject*, PyObject*)’:
      tesserocr.cpp:28146:35: warning: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare]
      28146 |     for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) {
            |                         ~~~~~~~~~~^~~~~~~~~~~
      tesserocr.cpp: In function ‘int __Pyx_PyBytes_Equals(PyObject*, PyObject*, int)’:
      tesserocr.cpp:38610:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38610 |             hash1 = ((PyBytesObject*)s1)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38610:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38610 |             hash1 = ((PyBytesObject*)s1)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38610:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38610 |             hash1 = ((PyBytesObject*)s1)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38611:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38611 |             hash2 = ((PyBytesObject*)s2)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38611:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38611 |             hash2 = ((PyBytesObject*)s2)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38611:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38611 |             hash2 = ((PyBytesObject*)s2)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp: In function ‘void __Pyx_AddTraceback(const char*, int, int, const char*)’:
      tesserocr.cpp:487:62: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
        487 |   #define __Pyx_PyFrame_SetLineNumber(frame, lineno)  (frame)->f_lineno = (lineno)
            |                                                              ^~
      tesserocr.cpp:39234:5: note: in expansion of macro ‘__Pyx_PyFrame_SetLineNumber’
      39234 |     __Pyx_PyFrame_SetLineNumber(py_frame, py_line);
            |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      tesserocr.cpp: In function ‘PyObject* __Pyx_Coroutine_SendEx(__pyx_CoroutineObject*, PyObject*, int)’:
      tesserocr.cpp:41212:14: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
      41212 |             f->f_back = PyThreadState_GetFrame(tstate);
            |              ^~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      In file included from /usr/include/python3.11/Python.h:44,
                       from tesserocr.cpp:41:
      tesserocr.cpp: In function ‘void __Pyx_Coroutine_ResetFrameBackpointer(__Pyx_ExcInfoStruct*)’:
      tesserocr.cpp:41249:19: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
      41249 |         Py_CLEAR(f->f_back);
            |                   ^~
      /usr/include/python3.11/object.h:107:41: note: in definition of macro ‘_PyObject_CAST’
        107 | #define _PyObject_CAST(op) ((PyObject*)(op))
            |                                         ^~
      tesserocr.cpp:41249:9: note: in expansion of macro ‘Py_CLEAR’
      41249 |         Py_CLEAR(f->f_back);
            |         ^~~~~~~~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      In file included from /usr/include/python3.11/Python.h:44,
                       from tesserocr.cpp:41:
      tesserocr.cpp:41249:19: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
      41249 |         Py_CLEAR(f->f_back);
            |                   ^~
      /usr/include/python3.11/object.h:566:14: note: in definition of macro ‘Py_CLEAR’
        566 |             (op) = NULL;                        \
            |              ^~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for tesserocr
  Running setup.py clean for tesserocr
Failed to build tesserocr
Installing collected packages: tesserocr
  Running setup.py install for tesserocr: started
  Running setup.py install for tesserocr: finished with status 'error'
  error: subprocess-exited-with-error

  × Running setup.py install for tesserocr did not run successfully.
  │ exit code: 1
  ╰─> [126 lines of output]
      Supporting tesseract v5.1.0-17-g6814
      Tesseract major version 5
      Configs from pkg-config: {'library_dirs': [], 'include_dirs': ['/usr/include', '/usr/include'], 'libraries': ['tesseract', 'archive', 'curl', 'lept'], 'compile_time_env': {'TESSERACT_MAJOR_VERSION': 5, 'TESSERACT_VERSION': 83951616}}
      running install
      running build
      running build_ext
      Detected compiler: unix
      building 'tesserocr' extension
      creating build
      creating build/temp.linux-x86_64-3.11
      x86_64-linux-gnu-gcc -pthread -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include -I/usr/include -I/usr/include/python3.11 -I/usr/local/include/python3.11 -c tesserocr.cpp -o build/temp.linux-x86_64-3.11/tesserocr.o -std=c++11 -DUSE_STD_NAMESPACE
      tesserocr.cpp: In function ‘PyObject* __pyx_pf_9tesserocr_13PyTessBaseAPI_34GetAvailableLanguages(__pyx_obj_9tesserocr_PyTessBaseAPI*)’:
      tesserocr.cpp:17031:35: warning: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare]
      17031 |     for (__pyx_t_4 = 0; __pyx_t_4 < __pyx_t_3; __pyx_t_4+=1) {
            |                         ~~~~~~~~~~^~~~~~~~~~~
      tesserocr.cpp: In function ‘PyObject* __pyx_pf_9tesserocr_12get_languages(PyObject*, PyObject*)’:
      tesserocr.cpp:28146:35: warning: comparison of integer expressions of different signedness: ‘int’ and ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘long unsigned int’} [-Wsign-compare]
      28146 |     for (__pyx_t_5 = 0; __pyx_t_5 < __pyx_t_4; __pyx_t_5+=1) {
            |                         ~~~~~~~~~~^~~~~~~~~~~
      tesserocr.cpp: In function ‘int __Pyx_PyBytes_Equals(PyObject*, PyObject*, int)’:
      tesserocr.cpp:38610:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38610 |             hash1 = ((PyBytesObject*)s1)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38610:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38610 |             hash1 = ((PyBytesObject*)s1)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38610:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38610 |             hash1 = ((PyBytesObject*)s1)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38611:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38611 |             hash2 = ((PyBytesObject*)s2)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38611:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38611 |             hash2 = ((PyBytesObject*)s2)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp:38611:43: warning: ‘PyBytesObject::ob_shash’ is deprecated [-Wdeprecated-declarations]
      38611 |             hash2 = ((PyBytesObject*)s2)->ob_shash;
            |                                           ^~~~~~~~
      In file included from /usr/include/python3.11/bytesobject.h:62,
                       from /usr/include/python3.11/Python.h:50,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/cpython/bytesobject.h:7:35: note: declared here
          7 |     Py_DEPRECATED(3.11) Py_hash_t ob_shash;
            |                                   ^~~~~~~~
      tesserocr.cpp: In function ‘void __Pyx_AddTraceback(const char*, int, int, const char*)’:
      tesserocr.cpp:487:62: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
        487 |   #define __Pyx_PyFrame_SetLineNumber(frame, lineno)  (frame)->f_lineno = (lineno)
            |                                                              ^~
      tesserocr.cpp:39234:5: note: in expansion of macro ‘__Pyx_PyFrame_SetLineNumber’
      39234 |     __Pyx_PyFrame_SetLineNumber(py_frame, py_line);
            |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      tesserocr.cpp: In function ‘PyObject* __Pyx_Coroutine_SendEx(__pyx_CoroutineObject*, PyObject*, int)’:
      tesserocr.cpp:41212:14: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
      41212 |             f->f_back = PyThreadState_GetFrame(tstate);
            |              ^~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      In file included from /usr/include/python3.11/Python.h:44,
                       from tesserocr.cpp:41:
      tesserocr.cpp: In function ‘void __Pyx_Coroutine_ResetFrameBackpointer(__Pyx_ExcInfoStruct*)’:
      tesserocr.cpp:41249:19: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
      41249 |         Py_CLEAR(f->f_back);
            |                   ^~
      /usr/include/python3.11/object.h:107:41: note: in definition of macro ‘_PyObject_CAST’
        107 | #define _PyObject_CAST(op) ((PyObject*)(op))
            |                                         ^~
      tesserocr.cpp:41249:9: note: in expansion of macro ‘Py_CLEAR’
      41249 |         Py_CLEAR(f->f_back);
            |         ^~~~~~~~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      In file included from /usr/include/python3.11/Python.h:44,
                       from tesserocr.cpp:41:
      tesserocr.cpp:41249:19: error: invalid use of incomplete type ‘PyFrameObject’ {aka ‘struct _frame’}
      41249 |         Py_CLEAR(f->f_back);
            |                   ^~
      /usr/include/python3.11/object.h:566:14: note: in definition of macro ‘Py_CLEAR’
        566 |             (op) = NULL;                        \
            |              ^~
      In file included from /usr/include/python3.11/Python.h:42,
                       from tesserocr.cpp:41:
      /usr/include/python3.11/pytypedefs.h:22:16: note: forward declaration of ‘PyFrameObject’ {aka ‘struct _frame’}
         22 | typedef struct _frame PyFrameObject;
            |                ^~~~~~
      error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> tesserocr

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.
zdenop commented 2 years ago

https://docs.python.org/3.11/whatsnew/3.11.html:

The PyFrameObject structure members have been removed from the public C API.

While the documentation notes that the PyFrameObject fields are subject to change at any time, they have been stable for a long time and were used in several popular extensions.

In Python 3.11, the frame struct was reorganized to allow performance optimizations. Some fields were removed entirely, as they were details of the old implementation.

PyFrameObject fields:
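
To make the change concrete, here is a minimal sketch (not taken from the linked page) that reads frame data through accessor functions rather than struct members, which is the direction the 3.11 notes point extensions toward. It calls PyFrame_GetLineNumber() and PyFrame_GetBack() through ctypes purely for illustration. It assumes CPython 3.9 or newer, where PyFrame_GetBack() exists, and the helper name inspect_current_frame is made up for this example.

```python
import ctypes
import sys

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
api = ctypes.pythonapi

# int PyFrame_GetLineNumber(PyFrameObject *frame)
api.PyFrame_GetLineNumber.argtypes = [ctypes.py_object]
api.PyFrame_GetLineNumber.restype = ctypes.c_int

# PyFrameObject *PyFrame_GetBack(PyFrameObject *frame)  (CPython 3.9+)
api.PyFrame_GetBack.argtypes = [ctypes.py_object]
api.PyFrame_GetBack.restype = ctypes.py_object


def inspect_current_frame():
    """Hypothetical helper: query this function's own frame via accessors."""
    frame = sys._getframe()
    # Accessor call instead of reading frame->f_lineno in C.
    line = api.PyFrame_GetLineNumber(frame)
    # Accessor call instead of reading frame->f_back in C.
    caller = api.PyFrame_GetBack(frame)
    return line, caller, frame


line, caller, frame = inspect_current_frame()
print("line number via accessor:", line)
print("caller matches frame.f_back:", caller is frame.f_back)
```

The compile errors in the log above come from generated code that writes to f_lineno and f_back directly on the struct, which no longer compiles once the struct definition is hidden.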

sirfz commented 2 years ago

We'll have to wait for Cython to support Python 3.11, since the C++ code is generated by Cython.
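
One way to see which Cython release produced a shipped tesserocr.cpp is to read the banner comment Cython writes at the top of its output. A minimal sketch, assuming the file starts with the usual "/* Generated by Cython X.Y.Z */" line; the function name and the sample strings below are illustrative only.

```python
import re

# Cython writes a banner such as "/* Generated by Cython 0.29.21 */"
# as the first line of every C/C++ file it generates.
_BANNER = re.compile(r"/\*\s*Generated by Cython\s+([0-9][^\s*]*)\s*\*/")


def cython_version_of(source_text):
    """Return the Cython version named in the banner, or None if absent."""
    # Only the first line is inspected; the banner is not expected elsewhere.
    lines = source_text.splitlines()
    first_line = lines[0] if lines else ""
    match = _BANNER.search(first_line)
    return match.group(1) if match else None


# Illustrative inputs, not the real contents of tesserocr.cpp.
sample = "/* Generated by Cython 0.29.21 */\n\n#define PY_SSIZE_T_CLEAN\n"
print(cython_version_of(sample))
print(cython_version_of("#include <Python.h>\n"))
```

To check a real file, pass it the text of the generated source, for example the tesserocr.cpp from an unpacked sdist.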

miguel-lorenzo commented 1 year ago

Hello @sirfz,

I think Cython now supports Python 3.11.

Are you planning to support it in tesserocr as well?

sirfz commented 1 year ago

Did you try building and it failed?

fgsalomon commented 1 year ago

> Did you try building and it failed?

The build doesn't fail and tesserocr works as expected. I'd say that this issue can be closed.
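
For anyone who wants to repeat that check, a minimal sketch of a post-install smoke test follows. It assumes tesserocr was installed into the interpreter running the script; check_tesserocr is a hypothetical helper name, while tesserocr.tesseract_version() is the module's own version function.

```python
import sys


def check_tesserocr():
    """Return a one-line status: does tesserocr import on this Python?"""
    py = "{}.{}".format(*sys.version_info[:2])
    try:
        # Importing loads the compiled extension, so a broken build fails here.
        import tesserocr
    except ImportError as exc:
        return "Python {}: tesserocr not importable ({})".format(py, exc)
    # tesseract_version() reports the linked Tesseract library version.
    return "Python {}: tesserocr OK, {}".format(py, tesserocr.tesseract_version())


print(check_tesserocr())
```

Running it under python3.11 confirms both that the extension compiled and that it links against the installed Tesseract library.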

miguel-lorenzo commented 1 year ago

yep!