dbousque / batch_jaro_winkler

Fast batch jaro winkler distance implementation in C99 with Ruby, OCaml and Python bindings.
MIT License

Windows 10 compatibility issues #1

Open nkaenzig opened 3 years ago

nkaenzig commented 3 years ago

On Ubuntu Linux the module works perfectly. However, when I try to install it on a Windows 10 machine with Anaconda (Python 3.8.5), I get a C compiler error:


cbatch_jaro_winkler.c:219:12: error: enumerator value for '__pyx_check_sizeof_voidp' is not an integer constant
         enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };

I installed the GCC compiler suite on windows using MinGW, following these instructions: https://cython.readthedocs.io/en/latest/src/tutorial/appendix.html

Do you have an idea how we could achieve compatibility with Windows?

Here is the full output I currently get from pip install:


(testenv) C:\Users\nkaen>pip install batch-jaro-winkler
Collecting batch-jaro-winkler
  Using cached batch_jaro_winkler-0.1.0.tar.gz (84 kB)
Building wheels for collected packages: batch-jaro-winkler
  Building wheel for batch-jaro-winkler (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: 'C:\Users\nkaen\Anaconda3\envs\testenv\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\nkaen\\AppData\\Local\\Temp\\pip-install-d5lfqbth\\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\\setup.py'"'"'; __file__='"'"'C:\\Users\\nkaen\\AppData\\Local\\Temp\\pip-install-d5lfqbth\\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\nkaen\AppData\Local\Temp\pip-wheel-ed4u3b2i'
       cwd: C:\Users\nkaen\AppData\Local\Temp\pip-install-d5lfqbth\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\
  Complete output (38 lines):
  running bdist_wheel
  running build
  running build_ext
  building 'batch_jaro_winkler' extension
  creating build
  creating build\temp.win-amd64-3.8
  creating build\temp.win-amd64-3.8\Release
  creating build\temp.win-amd64-3.8\Release\ext
  C:\MinGW\bin\gcc.exe -mdll -O -Wall -DMS_WIN64 -IC:\Users\nkaen\Anaconda3\envs\testenv\include -IC:\Users\nkaen\Anaconda3\envs\testenv\include -c cbatch_jaro_winkler.c -o build\temp.win-amd64-3.8\Release\cbatch_jaro_winkler.o
  In file included from C:\Users\nkaen\Anaconda3\envs\testenv\include/Python.h:85:0,
                   from cbatch_jaro_winkler.c:22:
  C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:123:59: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
   PyAPI_FUNC(int) _PyTime_FromTimeval(_PyTime_t *tp, struct timeval *tv);
                                                             ^~~~~~~
  C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:130:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
       struct timeval *tv,
              ^~~~~~~
  C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:135:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
       struct timeval *tv,
              ^~~~~~~
  cbatch_jaro_winkler.c:219:41: warning: division by zero [-Wdiv-by-zero]
       enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
                                           ^
  cbatch_jaro_winkler.c:219:12: error: enumerator value for '__pyx_check_sizeof_voidp' is not an integer constant
       enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
              ^~~~~~~~~~~~~~~~~~~~~~~~
  cbatch_jaro_winkler.c: In function '__Pyx_modinit_type_init_code':
  cbatch_jaro_winkler.c:5233:3: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
     __pyx_type_18batch_jaro_winkler_RuntimeModel.tp_print = 0;
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  In file included from C:\Users\nkaen\Anaconda3\envs\testenv\include/object.h:746:0,
                   from C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:6,
                   from C:\Users\nkaen\Anaconda3\envs\testenv\include/Python.h:85,
                   from cbatch_jaro_winkler.c:22:
  C:\Users\nkaen\Anaconda3\envs\testenv\include/cpython/object.h:260:30: note: declared here
       Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
                                ^~~~~~~~
  error: command 'C:\\MinGW\\bin\\gcc.exe' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for batch-jaro-winkler
  Running setup.py clean for batch-jaro-winkler
Failed to build batch-jaro-winkler
Installing collected packages: batch-jaro-winkler
    Running setup.py install for batch-jaro-winkler ... error
    ERROR: Command errored out with exit status 1:
     command: 'C:\Users\nkaen\Anaconda3\envs\testenv\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\nkaen\\AppData\\Local\\Temp\\pip-install-d5lfqbth\\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\\setup.py'"'"'; __file__='"'"'C:\\Users\\nkaen\\AppData\\Local\\Temp\\pip-install-d5lfqbth\\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\nkaen\AppData\Local\Temp\pip-record-jc27xax2\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\nkaen\Anaconda3\envs\testenv\Include\batch-jaro-winkler'
         cwd: C:\Users\nkaen\AppData\Local\Temp\pip-install-d5lfqbth\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\
    Complete output (38 lines):
    running install
    running build
    running build_ext
    building 'batch_jaro_winkler' extension
    creating build
    creating build\temp.win-amd64-3.8
    creating build\temp.win-amd64-3.8\Release
    creating build\temp.win-amd64-3.8\Release\ext
    C:\MinGW\bin\gcc.exe -mdll -O -Wall -DMS_WIN64 -IC:\Users\nkaen\Anaconda3\envs\testenv\include -IC:\Users\nkaen\Anaconda3\envs\testenv\include -c cbatch_jaro_winkler.c -o build\temp.win-amd64-3.8\Release\cbatch_jaro_winkler.o
    In file included from C:\Users\nkaen\Anaconda3\envs\testenv\include/Python.h:85:0,
                     from cbatch_jaro_winkler.c:22:
    C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:123:59: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
     PyAPI_FUNC(int) _PyTime_FromTimeval(_PyTime_t *tp, struct timeval *tv);
                                                               ^~~~~~~
    C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:130:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
         struct timeval *tv,
                ^~~~~~~
    C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:135:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
         struct timeval *tv,
                ^~~~~~~
    cbatch_jaro_winkler.c:219:41: warning: division by zero [-Wdiv-by-zero]
         enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
                                             ^
    cbatch_jaro_winkler.c:219:12: error: enumerator value for '__pyx_check_sizeof_voidp' is not an integer constant
         enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
                ^~~~~~~~~~~~~~~~~~~~~~~~
    cbatch_jaro_winkler.c: In function '__Pyx_modinit_type_init_code':
    cbatch_jaro_winkler.c:5233:3: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
       __pyx_type_18batch_jaro_winkler_RuntimeModel.tp_print = 0;
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In file included from C:\Users\nkaen\Anaconda3\envs\testenv\include/object.h:746:0,
                     from C:\Users\nkaen\Anaconda3\envs\testenv\include/pytime.h:6,
                     from C:\Users\nkaen\Anaconda3\envs\testenv\include/Python.h:85,
                     from cbatch_jaro_winkler.c:22:
    C:\Users\nkaen\Anaconda3\envs\testenv\include/cpython/object.h:260:30: note: declared here
         Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
                                  ^~~~~~~~
    error: command 'C:\\MinGW\\bin\\gcc.exe' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'C:\Users\nkaen\Anaconda3\envs\testenv\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\nkaen\\AppData\\Local\\Temp\\pip-install-d5lfqbth\\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\\setup.py'"'"'; __file__='"'"'C:\\Users\\nkaen\\AppData\\Local\\Temp\\pip-install-d5lfqbth\\batch-jaro-winkler_d3702fa170ec43b7bb7f25c5153232f3\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\nkaen\AppData\Local\Temp\pip-record-jc27xax2\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\nkaen\Anaconda3\envs\testenv\Include\batch-jaro-winkler' Check the logs for full command output.

dbousque commented 3 years ago

Hi @nkaenzig , thanks for the report!

Can you try running python setup.py install from the python folder (use the Python interpreter you want to install the library for, so maybe python3)? I expect the issue to happen again. Then can you tell me whether this builds correctly: pip install -r dev-requirements.py, then python cython_build.py build_ext --inplace, and then python setup.py install again.

My reasoning is that the C file generated by Cython on my local machine might only work on Unix-like systems. I didn't want to include Cython as a dependency, but we might have to.

nkaenzig commented 3 years ago

Hi @dbousque, thanks for the quick response.

Running python setup.py install indeed results in the same error.

Running pip install -r dev-requirements.py didn't work at first; apparently the listed Cython version (0.29.6) wasn't compatible with my current Python version/environment. I have now created a new environment with Python 3.7.9, where I could install Cython==0.29.6. But when running python cython_build.py build_ext --inplace I still get more or less the same error:

(testenv1) C:\Users\nkaen\workspace\new\batch_jaro_winkler\python>python cython_build.py build_ext --inplace
running build_ext
building 'batch_jaro_winkler' extension
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
C:\MinGW\bin\gcc.exe -mdll -O -Wall -DMS_WIN64 -IC:\Users\nkaen\Anaconda3\envs\testenv1\include -IC:\Users\nkaen\Anaconda3\envs\testenv1\include -c cbatch_jaro_winkler.c -o build\temp.win-amd64-3.7\Release\cbatch_jaro_winkler.o
In file included from C:\Users\nkaen\Anaconda3\envs\testenv1\include/Python.h:87:0,
                 from cbatch_jaro_winkler.c:22:
C:\Users\nkaen\Anaconda3\envs\testenv1\include/pytime.h:123:59: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
 PyAPI_FUNC(int) _PyTime_FromTimeval(_PyTime_t *tp, struct timeval *tv);
                                                           ^~~~~~~
C:\Users\nkaen\Anaconda3\envs\testenv1\include/pytime.h:130:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
     struct timeval *tv,
            ^~~~~~~
C:\Users\nkaen\Anaconda3\envs\testenv1\include/pytime.h:135:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
     struct timeval *tv,
            ^~~~~~~
cbatch_jaro_winkler.c:219:41: warning: division by zero [-Wdiv-by-zero]
     enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
                                         ^
cbatch_jaro_winkler.c:219:12: error: enumerator value for '__pyx_check_sizeof_voidp' is not an integer constant
     enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
            ^~~~~~~~~~~~~~~~~~~~~~~~
cbatch_jaro_winkler.c:610:36: fatal error: ext/batch_jaro_winkler.h: No such file or directory
 #include "ext/batch_jaro_winkler.h"
                                    ^
compilation terminated.
error: command 'C:\\MinGW\\bin\\gcc.exe' failed with exit status 1

Let me know how I can help to resolve this issue. I have never worked with Cython before.

dbousque commented 3 years ago

Ok, apparently this is an issue with Cython/MinGW interop: https://github.com/cython/cython/issues/2670#issuecomment-432212671.

Try running python cython_build.py build_ext --inplace -DMS_WIN64 (with -DMS_WIN64) then. If that solves it, I guess you will get a new cbatch_jaro_winkler.c file that is different from the original one. It would be really nice if you could make a new cython_mingw_build.py file that is similar to cython_build.py but has a different target name (cbatch_jaro_winkler.pyx -> cbatch_jaro_winkler_mingw.pyx maybe), and add a check in setup.py to use either cbatch_jaro_winkler.c or cbatch_jaro_winkler_mingw.c depending on whether MinGW is used. I would have a hard time trying it out myself; I could try to set up an environment similar to yours, but that may be painful.
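The setup.py check suggested above could be sketched roughly as follows. This is only an illustration, not code from the repository: pick_c_source is a hypothetical helper, cbatch_jaro_winkler_mingw.c is the file name proposed in this comment and does not exist yet, and "mingw32" is the name distutils uses for its MinGW compiler class.

```python
# Hypothetical helper for setup.py; not part of the batch_jaro_winkler repository.
def pick_c_source(compiler_type):
    """Return the pre-generated C file to compile for a distutils compiler type."""
    # distutils identifies the MinGW toolchain with the compiler type "mingw32".
    if compiler_type == "mingw32":
        # Assumed name of a C file generated separately for MinGW.
        return "cbatch_jaro_winkler_mingw.c"
    # Every other compiler ("unix", "msvc", ...) keeps the original file.
    return "cbatch_jaro_winkler.c"


print(pick_c_source("mingw32"))
print(pick_c_source("unix"))
```

In a real setup.py the compiler type is only known once the build starts, so one option (again an assumption, not tested against this project) would be a custom build_ext command that reads self.compiler.compiler_type and sets the extension's sources before compiling.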

If everything is ok, running python setup.py install should work correctly. Regarding the missing ext/batch_jaro_winkler.h header, I forgot to tell you to copy the lib folder from the root of the git project and paste it as ext in the python folder.

Let me know if I can help.

nkaenzig commented 3 years ago

Just tried out python cython_build.py build_ext --inplace -DMS_WIN64, but still getting the error :(

I've been googling the error as well and, like you, the only suggestion I could find was the -DMS_WIN64 option. Do you have an idea of something else we could try?

(testenv1) C:\Users\nkaen\workspace\new\batch_jaro_winkler\python>python cython_build.py build_ext --inplace -DMS_WIN64
running build_ext
building 'batch_jaro_winkler' extension
C:\MinGW\bin\gcc.exe -mdll -O -Wall -DMS_WIN64 -DMS_WIN64=1 -I. -IC:\Users\nkaen\Anaconda3\envs\testenv1\include -IC:\Users\nkaen\Anaconda3\envs\testenv1\include -c cbatch_jaro_winkler.c -o build\temp.win-amd64-3.7\Release\cbatch_jaro_winkler.o
In file included from C:\Users\nkaen\Anaconda3\envs\testenv1\include/Python.h:87:0,
                 from cbatch_jaro_winkler.c:22:
C:\Users\nkaen\Anaconda3\envs\testenv1\include/pytime.h:123:59: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
 PyAPI_FUNC(int) _PyTime_FromTimeval(_PyTime_t *tp, struct timeval *tv);
                                                           ^~~~~~~
C:\Users\nkaen\Anaconda3\envs\testenv1\include/pytime.h:130:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
     struct timeval *tv,
            ^~~~~~~
C:\Users\nkaen\Anaconda3\envs\testenv1\include/pytime.h:135:12: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
     struct timeval *tv,
            ^~~~~~~
cbatch_jaro_winkler.c:219:41: warning: division by zero [-Wdiv-by-zero]
     enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
                                         ^
cbatch_jaro_winkler.c:219:12: error: enumerator value for '__pyx_check_sizeof_voidp' is not an integer constant
     enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
            ^~~~~~~~~~~~~~~~~~~~~~~~
error: command 'C:\\MinGW\\bin\\gcc.exe' failed with exit status 1

nkaenzig commented 3 years ago

Just found this Stack Overflow post, which might be relevant: https://stackoverflow.com/questions/55995160/unable-to-convert-cython-generated-c-language-code-to-executable-file

dbousque commented 3 years ago

Yep, you could try python cython_build.py build_ext --inplace -DSIZEOF_VOID_P=8 -DMS_WIN64. It would make sense for that to solve the issue, given that the error complains that __pyx_check_sizeof_voidp is not an integer constant.

nkaenzig commented 3 years ago

I already tried both options, -DSIZEOF_VOID_P=8 and -DSIZEOF_VOID_P=4; neither works :(
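For context, the failing line is a compile-time check: 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) only divides by zero when the SIZEOF_VOID_P macro disagrees with the pointer size of the compiler actually being used. One possible cause, which is an assumption and not something confirmed in this thread, is a gcc that targets 32-bit being paired with a 64-bit Python. A minimal sketch to see what the interpreter side expects (expected_macros is a hypothetical helper, not part of the project):

```python
import struct
import sys


def interpreter_pointer_bytes():
    """Size in bytes of a C pointer (void *) in the running interpreter."""
    return struct.calcsize("P")


def expected_macros():
    """Macros consistent with this interpreter (illustrative only)."""
    size = interpreter_pointer_bytes()
    macros = [("SIZEOF_VOID_P", str(size))]
    # MS_WIN64 only makes sense for a 64-bit interpreter on Windows.
    if sys.platform == "win32" and size == 8:
        macros.append(("MS_WIN64", "1"))
    return macros


print(f"interpreter pointer size: {interpreter_pointer_bytes()} bytes")
print(f"macros that would match it: {expected_macros()}")
```

The compiler side can be checked by hand with gcc -dumpmachine, which prints the target the compiler builds for. If the two sides disagree, defining macros on the command line cannot change what sizeof(void*) evaluates to, so the check would keep failing.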

dbousque commented 3 years ago

Ah, I just saw that it looks like python cython_build.py is not trying to recreate the cbatch_jaro_winkler.c file, you probably need to remove this file. If that doesn't do it, maybe the command line arguments to python cython_build.py are not passed when doing the extension compilation. In cython_build.py, maybe try to replace:

ext_modules=cythonize([Extension('batch_jaro_winkler', ['cbatch_jaro_winkler.pyx'])], language_level=python_version)

with:

ext_modules=cythonize([Extension('batch_jaro_winkler', ['cbatch_jaro_winkler.pyx'], define_macros=[('SIZEOF_VOID_P', '8'), ('MS_WIN64', '1')])], language_level=python_version)

Additionally, try to clean the environment before/after each try, like we do in local_build.sh and cython_build.sh (mainly by removing files and folders). Try to get a build flow as similar to local_build.sh as possible, maybe also deleting the C file each time.
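Since local_build.sh is a shell script and may not run as-is in a Windows prompt, the cleanup could be approximated in Python. This is a rough sketch under assumptions: clean_build_artifacts is a hypothetical helper, and the list of things to delete (the build folder, the generated C file, compiled extension files) is a guess at what the shell scripts remove, not a copy of them.

```python
import shutil
from pathlib import Path


def clean_build_artifacts(project_dir, generated_c="cbatch_jaro_winkler.c"):
    """Delete leftovers from a previous build and return the names removed."""
    project_dir = Path(project_dir)
    removed = []

    # Intermediate object files live in the build folder.
    build_dir = project_dir / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        removed.append(build_dir.name)

    # Deleting the generated C file means Cython has to produce it again.
    c_file = project_dir / generated_c
    if c_file.is_file():
        c_file.unlink()
        removed.append(c_file.name)

    # Compiled extensions left behind by build_ext --inplace.
    for pattern in ("batch_jaro_winkler*.pyd", "batch_jaro_winkler*.so"):
        for compiled in project_dir.glob(pattern):
            compiled.unlink()
            removed.append(compiled.name)

    return removed
```

It would be called from the python folder, for example clean_build_artifacts("."), before each python cython_build.py build_ext --inplace attempt.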

nkaenzig commented 3 years ago

Just tried that out, unfortunately still getting that same error

dbousque commented 3 years ago

I tried to debug for MinGW on my Windows machine but it's a mess. The default Windows C compiler fails for another reason (can't find the size of void * 🤷 ). It looks like building native extensions (maybe Cython extensions in particular) is a mess on Windows: https://wiki.python.org/moin/WindowsCompilers#GCC_-, and I don't think there would be a clean way to support most exotic combinations of Python versions / C compilers (MSVC/mingw32/gcc) / Development environments (like MinGW, virtual envs etc.).

If that works for you, you could go with "Ubuntu for Windows". Or if you really need to get this library working on Windows, things may work well if you go for a standard combination of Python version + MSVC version + standard terminal and PATH setup etc. (whatever that may be) like mentioned in the link from the official CPython website.

nkaenzig commented 3 years ago

I see, thanks a lot for looking into it. For now I will run it in WSL.

dbousque commented 3 years ago

Thanks for the report and investigation efforts, have a nice day!

gcleaves commented 3 months ago

Hi. I have a similar problem on Mac OS.