Krisztian Szucs / @kszucs: @efiop How did you install 0.14.1? Via conda or via pip? Seems like via pip.
Ruslan Kuprieiev / @efiop: @kszucs Pip that comes with vanilla official Python for windows.
Krisztian Szucs / @kszucs: I suggest using 0.14 or conda until this gets resolved.
Ruslan Kuprieiev / @efiop: @kszucs Yep, did that already :) Thanks!
Krisztian Szucs / @kszucs: I suspect that the missing DSO is zlib, which is no longer bundled with the wheels (https://github.com/apache/arrow/pull/4886/files#diff-8cf6167d58ce775a08acafcfe6f40966L388) and is linked dynamically instead of with the intended static linkage (https://github.com/apache/arrow/pull/4886/files#diff-647dde013daa22cab04c2707a6f611e5R57).
Krisztian Szucs / @kszucs: We can also rebuild the Windows wheels, but I don't know how to *force* static zlib linkage. cc @pitrou
Antoine Pitrou / @pitrou: I don't know either. On manylinux we hack around this by removing the zlib.so.
Krisztian Szucs / @kszucs: I'm trying to rebuild the windows wheels with bundled zlib.dll https://github.com/ursa-labs/crossbow/branches/all?query=build-669
Krisztian Szucs / @kszucs: The produced wheels are going to be available at the following links:
Wes McKinney / @wesm: This is definitely sad. Do we need to remove the wheels from PyPI? I don't think we should do a 0.14.2 release to fix this
Krisztian Szucs / @kszucs: Agree with the removal of 0.14.1 windows wheels. I don't have access to do that though.
Wes McKinney / @wesm: You're listed as a maintainer on https://pypi.org/project/pyarrow/, you should be able to remove them in the web UI if you are logged in
Wes McKinney / @wesm: Please also remove the 0.14.0 Windows wheels
Krisztian Szucs / @kszucs: Why should we remove the 0.14.0 wheels?
Wes McKinney / @wesm: Surely they have the exact same problem, unless a patch was cherry-picked that altered the behavior?
Krisztian Szucs / @kszucs: The patch was cherry picked, so this issue doesn't affect the 0.14 wheels.
Wes McKinney / @wesm: I see. I'm surprised that https://github.com/apache/arrow/commit/befd7dfe18ec8d362c0472092d48edbb8df9c3b8 caused the libraries to have a dependency on zlib.dll. In theory -DZLIB_SOURCE=BUNDLED should result in a statically-linked version
Krisztian Szucs / @kszucs: Thought the same. Apparently CMake picks up the dynamic library if it can be located, regardless of zlib_SOURCE. We'd need a way to force static linkage.
Antoine Pitrou / @pitrou: Right. CMake does not have a way of saying "prefer the static lib if present, but fall back on the dynamic lib otherwise".
Antoine Pitrou / @pitrou: I confirm that the 3.6 wheel seems to load zlib.dll from the PyArrow install.
Krisztian Szucs / @kszucs: But we don't really have any way to publish it, right?
Antoine Pitrou / @pitrou: I don't think so?
Krisztian Szucs / @kszucs: I'm afraid the same issue affects the OSX wheels:
libarrow.14.dylib:
@rpath/libarrow.14.dylib (compatibility version 14.0.0, current version 14.1.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.8)
@rpath/libarrow_boost_system.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow_boost_filesystem.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow_boost_regex.dylib (compatibility version 0.0.0, current version 0.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 307.5.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.50.2)
although they will work in most cases, because the same problem was present in the previous wheels; we linked the same way in 0.14.0:
libarrow.14.dylib:
@rpath/libarrow.14.dylib (compatibility version 14.0.0, current version 14.0.0)
/usr/local/opt/openssl/lib/libcrypto.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/local/opt/openssl/lib/libssl.1.0.0.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.8)
@rpath/libarrow_boost_system.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow_boost_filesystem.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow_boost_regex.dylib (compatibility version 0.0.0, current version 0.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 307.5.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.50.2)
This problem should have been caught automatically by https://github.com/apache/arrow/blob/master/dev/tasks/python-wheels/travis.osx.yml#L77, but Travis swallowed the following errors:
$ sudo find /usr -name libz.* -delete
find: -delete: unlink(/usr/lib/libz.1.1.3.dylib): Operation not permitted
find: -delete: unlink(/usr/lib/libz.1.2.5.dylib): Operation not permitted
find: -delete: unlink(/usr/lib/libz.1.2.8.dylib): Operation not permitted
find: -delete: unlink(/usr/lib/libz.1.dylib): Operation not permitted
find: -delete: unlink(/usr/lib/libz.dylib): Operation not permitted
These errors are only visible after loading the whole raw log: https://api.travis-ci.org/v3/job/559681560/log.txt
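The linkage check described in this comment could be automated rather than read by eye. The following is a minimal sketch, assuming the text of an `otool -L` listing is already available as a string; the function name `external_dependencies` and the list of tolerated system libraries are illustrative assumptions, not part of the Arrow build scripts.

```python
# Hypothetical allow-list: system libraries that are expected on every macOS install.
SYSTEM_PREFIXES = ("/usr/lib/libc++", "/usr/lib/libSystem")


def external_dependencies(otool_output, allowed_prefixes=SYSTEM_PREFIXES):
    """Return linked libraries that are neither @rpath-relative nor allow-listed."""
    flagged = []
    lines = otool_output.strip().splitlines()
    for line in lines[1:]:  # the first line names the inspected library itself
        line = line.strip()
        if not line:
            continue
        # Drop the trailing "(compatibility version ..., current version ...)" part.
        path = line.split(" (", 1)[0]
        if path.startswith("@rpath/") or path.startswith(allowed_prefixes):
            continue
        flagged.append(path)
    return flagged


# Listing for the 0.14.1 wheel, copied from the comment above.
SAMPLE = """libarrow.14.dylib:
@rpath/libarrow.14.dylib (compatibility version 14.0.0, current version 14.1.0)
/usr/lib/libz.1.dylib (compatibility version 1.0.0, current version 1.2.8)
@rpath/libarrow_boost_system.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow_boost_filesystem.dylib (compatibility version 0.0.0, current version 0.0.0)
@rpath/libarrow_boost_regex.dylib (compatibility version 0.0.0, current version 0.0.0)
/usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 307.5.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.50.2)
"""

flagged = external_dependencies(SAMPLE)
print(flagged)  # ['/usr/lib/libz.1.dylib']
```

A build step could fail when the returned list is non-empty, which would not depend on the `find -delete` workaround succeeding.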
Krisztian Szucs / @kszucs: This is extremely annoying. I can revert the Windows and OSX parts of https://github.com/apache/arrow/pull/4886 to bundle the zlib DSO.
Antoine Pitrou / @pitrou: You mean the zlib isn't always available on macOS?
Krisztian Szucs / @kszucs: Since OSX Mojave it is not shipped by default, though it might just be the headers.
Kazuaki Ishizaki / @kiszk: I cannot reproduce this issue in my Windows 10 environment using two Pythons (conda and python) with this whl. Am I missing something needed to reproduce this failure?
$ wget https://www.python.org/ftp/python/3.7.4/python-3.7.4-embed-amd64.zip
$ unzip python-3.7.4-embed-amd64.zip
$ cd python-3.7.4-embed-amd64
$ wget https://bootstrap.pypa.io/get-pip.py
$ python get-pip.py
$ wget pyarrow-0.14.1-cp37-cp37m-win_amd64.whl
$ python -m pip install pyarrow-0.14.1-cp37-cp37m-win_amd64.whl
...
Successfully installed numpy-1.17.1 pyarrow-0.14.1 six-1.12.0
$ python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> print (pyarrow.cpu_count())
4
>>>
$ activate arrow-dev
$ wget pyarrow-0.14.1-cp37-cp37m-win_amd64.whl
$ pip install pyarrow-0.14.1-cp37-cp37m-win_amd64.whl
...
Installing collected packages: pyarrow
Successfully installed pyarrow-0.14.1
>python
Python 3.7.3 | packaged by conda-forge | (default, Jul 1 2019, 22:01:29) [MSC v.1900 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> print (pyarrow.cpu_count())
4
>>>
Antoine Pitrou / @pitrou: This kind of issue depends on which DLLs are already installed on your system. So if the wheel is missing e.g. some compression libraries (such as zstd or brotli) but you have them on your system already, the wheel will work fine for you. This is also what makes it more difficult to ensure that Windows wheels are correctly generated...
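One way to make such a check independent of what happens to be installed on the test machine is to compare the wheel's own contents against a list of DLLs it is expected to ship. The following is a minimal sketch; the `REQUIRED_DLLS` list and the function name `missing_from_bundle` are assumptions made for illustration, and the real list would have to come from inspecting the extension modules.

```python
import tempfile
from pathlib import Path

# Hypothetical list of DLLs the extension modules need at import time.
REQUIRED_DLLS = ["arrow.dll", "zlib.dll", "msvcp140.dll"]


def missing_from_bundle(package_dir, required_dlls=REQUIRED_DLLS):
    """Return the required DLL names that are not shipped inside package_dir."""
    # Windows file names are case-insensitive, so compare in lower case.
    shipped = {p.name.lower() for p in Path(package_dir).iterdir() if p.is_file()}
    return [name for name in required_dlls if name.lower() not in shipped]


# Demonstration on a throwaway directory standing in for site-packages/pyarrow.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("arrow.dll", "zlib.dll"):
        (Path(tmp) / name).touch()
    missing = missing_from_bundle(tmp)

print(missing)  # ['msvcp140.dll']
```

Because the check only looks inside the package directory, it gives the same answer on a developer machine with many DLLs installed as on a fresh system.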
Kazuaki Ishizaki / @kiszk: I see. Thank you for your quick response. It looks more complex.
Have we already identified which libraries are missing when this failure occurs?
Kazuaki Ishizaki / @kiszk: I believe I have identified how to fix this issue.
Installing the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019 from https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads avoids this error.
I think that this problem does not occur with conda; it occurs only with pip.
The following are my validation steps. I would appreciate it if someone could double-check them.
// Install Windows10 enterprise (no additional application is installed)
> mkdir c:\pyarrow
> cd c:\pyarrow
> bitsadmin /TRANSFER htmlget https://www.python.org/ftp/python/3.7.4/python-3.7.4-embed-amd64.zip c:\pyarrow\python-3.7.4-embed-amd64.zip
extract all python-3.7.4-embed-amd64.zip to c:\pyarrow\python-3.7.4-embed-amd64 from Explorer
> cd python-3.7.4-embed-amd64
notepad python37._pth
...
#import site <=== remove # in this line
> type python37._pth
python37.zip
.
# Uncomment to run site.main() automatically
import site
> bitsadmin /TRANSFER htmlget https://bootstrap.pypa.io/get-pip.py c:\pyarrow\python-3.7.4-embed-amd64\get-pip.py
> python get-pip.py
...
Successfully installed pip-19.2.3 setuptools-41.2.0 wheel-0.33.6
> python -m pip install pyarrow
Collecting pyarrow
Downloading https://files.pythonhosted.org/packages/97/7c/0ea4554d64c6ed3d6d4f8da492df287d2496adbab2b35c01433cf1344521/pyarrow-0.14.0-cp37-cp37m-win_amd64.whl (17.4MB)
...
Collecting numpy>=1.14 (from pyarrow)
Downloading https://files.pythonhosted.org/packages/cb/41/05fbf6944b098eb9d53e8a29a9dbfa20a7448f3254fb71499746a29a1b2d/numpy-1.17.1-cp37-cp37m-win_amd64.whl (12.8MB)
...
Collecting six>=1.0.0 (from pyarrow)
Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Installing collected packages: numpy, six, pyarrow
WARNING: The script f2py.exe is installed in 'C:\pyarrow\python-3.7.4-embed-amd64\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: The script plasma_store.exe is installed in 'C:\pyarrow\python-3.7.4-embed-amd64\Scripts' which is not on PATH.
Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed numpy-1.17.1 pyarrow-0.14.0 six-1.12.0
> python -c "import pyarrow"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\pyarrow\python-3.7.4-embed-amd64\lib\site-packages\pyarrow\__init__.py", line 49, in <module>
from pyarrow.lib import cpu_count, set_cpu_count
ImportError: DLL load failed: The specified module could not be found.
> python -m pip freeze
numpy==1.17.1
pyarrow==0.14.0
six==1.12.0
> dir Lib\site-packages\pyarrow
Volume in drive C is OS
Volume Serial Number is 1234-5678
Directory of C:\pyarrow\python-3.7.4-embed-amd64\Lib\site-packages\pyarrow
08/31/2019 05:42 AM <DIR> .
08/31/2019 05:42 AM <DIR> ..
08/31/2019 05:42 AM 47,658 array.pxi
08/31/2019 05:42 AM 5,748,736 arrow.dll
08/31/2019 05:42 AM 1,653,120 arrow.lib
08/31/2019 05:42 AM 1,795,072 arrow_flight.dll
08/31/2019 05:42 AM 121,062 arrow_flight.lib
08/31/2019 05:42 AM 910,848 arrow_python.dll
08/31/2019 05:42 AM 119,994 arrow_python.lib
08/31/2019 05:42 AM 869 benchmark.pxi
08/31/2019 05:42 AM 895 benchmark.py
08/31/2019 05:42 AM 2,774 builder.pxi
08/31/2019 05:42 AM 81,920 cares.dll
08/31/2019 05:42 AM 3,691 compat.py
08/31/2019 05:42 AM 911 csv.py
08/31/2019 05:42 AM 1,126 cuda.py
08/31/2019 05:42 AM 3,161 error.pxi
08/31/2019 05:42 AM 4,026 feather.pxi
08/31/2019 05:42 AM 7,291 feather.py
08/31/2019 05:42 AM 12,472 filesystem.py
08/31/2019 05:42 AM 1,286 flight.py
08/31/2019 05:42 AM 186,880 gandiva.cp37-win_amd64.pyd
08/31/2019 05:42 AM 791,664 gandiva.cpp
08/31/2019 05:42 AM 22,094,848 gandiva.dll
08/31/2019 05:42 AM 305,626 gandiva.lib
08/31/2019 05:42 AM 16,553 gandiva.pyx
08/31/2019 05:42 AM 7,032 hdfs.py
08/31/2019 05:42 AM <DIR> include
08/31/2019 05:42 AM <DIR> includes
08/31/2019 05:42 AM 13,995 io-hdfs.pxi
08/31/2019 05:42 AM 48,879 io.pxi
08/31/2019 05:42 AM 15,981 ipc.pxi
08/31/2019 05:42 AM 6,178 ipc.py
08/31/2019 05:42 AM 897 json.py
08/31/2019 05:42 AM 8,623 jvm.py
08/31/2019 05:42 AM 1,553,408 lib.cp37-win_amd64.pyd
08/31/2019 05:42 AM 6,756,155 lib.cpp
08/31/2019 05:42 AM 10,652 lib.pxd
08/31/2019 05:42 AM 3,570 lib.pyx
08/31/2019 05:42 AM 3,243,008 libcrypto-1_1-x64.dll
08/31/2019 05:42 AM 2,613,248 libprotobuf.dll
08/31/2019 05:42 AM 650,240 libssl-1_1-x64.dll
08/31/2019 05:42 AM 13,435 lib_api.h
08/31/2019 05:42 AM 4,724 memory.pxi
08/31/2019 05:42 AM 4,912 orc.py
08/31/2019 05:42 AM 5,789 pandas-shim.pxi
08/31/2019 05:42 AM 33,456 pandas_compat.py
08/31/2019 05:42 AM 1,789,952 parquet.dll
08/31/2019 05:42 AM 346,864 parquet.lib
08/31/2019 05:42 AM 52,331 parquet.py
08/31/2019 05:42 AM 5,780 plasma.py
08/31/2019 05:42 AM 8,778 public-api.pxi
08/31/2019 05:42 AM 23,060 scalar.pxi
08/31/2019 05:42 AM 15,427 serialization.pxi
08/31/2019 05:42 AM 12,588 serialization.py
08/31/2019 05:42 AM 46,760 table.pxi
08/31/2019 05:42 AM <DIR> tensorflow
08/31/2019 05:42 AM <DIR> tests
08/31/2019 05:42 AM 48,149 types.pxi
08/31/2019 05:42 AM 6,609 types.py
08/31/2019 05:42 AM 3,549 util.py
08/31/2019 05:42 AM 89,600 zlib.dll
08/31/2019 05:42 AM 106,496 _csv.cp37-win_amd64.pyd
08/31/2019 05:42 AM 493,978 _csv.cpp
08/31/2019 05:42 AM 14,861 _csv.pyx
08/31/2019 05:42 AM 1,934 _cuda.pxd
08/31/2019 05:42 AM 34,567 _cuda.pyx
08/31/2019 05:42 AM 346,112 _flight.cp37-win_amd64.pyd
08/31/2019 05:42 AM 1,518,124 _flight.cpp
08/31/2019 05:42 AM 46,504 _flight.pyx
08/31/2019 05:42 AM 121 _generated_version.py
08/31/2019 05:42 AM 52,736 _json.cp37-win_amd64.pyd
08/31/2019 05:42 AM 311,759 _json.cpp
08/31/2019 05:42 AM 6,413 _json.pyx
08/31/2019 05:42 AM 2,156 _orc.pxd
08/31/2019 05:42 AM 3,670 _orc.pyx
08/31/2019 05:42 AM 281,600 _parquet.cp37-win_amd64.pyd
08/31/2019 05:42 AM 1,352,623 _parquet.cpp
08/31/2019 05:42 AM 17,061 _parquet.pxd
08/31/2019 05:42 AM 44,057 _parquet.pyx
08/31/2019 05:42 AM 27,524 _plasma.pyx
08/31/2019 05:42 AM 1,749 __init__.pxd
08/31/2019 05:42 AM 10,564 __init__.py
08/31/2019 05:42 AM <DIR> __pycache__
77 File(s) 56,030,721 bytes
7 Dir(s) 22,981,132,288 bytes free
// The following two steps install Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019
> bitsadmin /TRANSFER htmlget https://aka.ms/vs/16/release/vc_redist.x64.exe c:\pyarrow\vc_redist.x64.exe
> ..\vc_redist.x64.exe
> python -c "import pyarrow"
> python -c "import pyarrow ; print (pyarrow.cpu_count())"
4
> python -m pip freeze
numpy==1.17.1
pyarrow==0.14.0
six==1.12.0
> dir Lib\site-packages\pyarrow
Volume in drive C is OS
Volume Serial Number is 1234-5678
Directory of C:\pyarrow\python-3.7.4-embed-amd64\Lib\site-packages\pyarrow
08/31/2019 05:42 AM <DIR> .
08/31/2019 05:42 AM <DIR> ..
08/31/2019 05:42 AM 47,658 array.pxi
08/31/2019 05:42 AM 5,748,736 arrow.dll
08/31/2019 05:42 AM 1,653,120 arrow.lib
08/31/2019 05:42 AM 1,795,072 arrow_flight.dll
08/31/2019 05:42 AM 121,062 arrow_flight.lib
08/31/2019 05:42 AM 910,848 arrow_python.dll
08/31/2019 05:42 AM 119,994 arrow_python.lib
08/31/2019 05:42 AM 869 benchmark.pxi
08/31/2019 05:42 AM 895 benchmark.py
08/31/2019 05:42 AM 2,774 builder.pxi
08/31/2019 05:42 AM 81,920 cares.dll
08/31/2019 05:42 AM 3,691 compat.py
08/31/2019 05:42 AM 911 csv.py
08/31/2019 05:42 AM 1,126 cuda.py
08/31/2019 05:42 AM 3,161 error.pxi
08/31/2019 05:42 AM 4,026 feather.pxi
08/31/2019 05:42 AM 7,291 feather.py
08/31/2019 05:42 AM 12,472 filesystem.py
08/31/2019 05:42 AM 1,286 flight.py
08/31/2019 05:42 AM 186,880 gandiva.cp37-win_amd64.pyd
08/31/2019 05:42 AM 791,664 gandiva.cpp
08/31/2019 05:42 AM 22,094,848 gandiva.dll
08/31/2019 05:42 AM 305,626 gandiva.lib
08/31/2019 05:42 AM 16,553 gandiva.pyx
08/31/2019 05:42 AM 7,032 hdfs.py
08/31/2019 05:42 AM <DIR> include
08/31/2019 05:42 AM <DIR> includes
08/31/2019 05:42 AM 13,995 io-hdfs.pxi
08/31/2019 05:42 AM 48,879 io.pxi
08/31/2019 05:42 AM 15,981 ipc.pxi
08/31/2019 05:42 AM 6,178 ipc.py
08/31/2019 05:42 AM 897 json.py
08/31/2019 05:42 AM 8,623 jvm.py
08/31/2019 05:42 AM 1,553,408 lib.cp37-win_amd64.pyd
08/31/2019 05:42 AM 6,756,155 lib.cpp
08/31/2019 05:42 AM 10,652 lib.pxd
08/31/2019 05:42 AM 3,570 lib.pyx
08/31/2019 05:42 AM 3,243,008 libcrypto-1_1-x64.dll
08/31/2019 05:42 AM 2,613,248 libprotobuf.dll
08/31/2019 05:42 AM 650,240 libssl-1_1-x64.dll
08/31/2019 05:42 AM 13,435 lib_api.h
08/31/2019 05:42 AM 4,724 memory.pxi
08/31/2019 05:42 AM 4,912 orc.py
08/31/2019 05:42 AM 5,789 pandas-shim.pxi
08/31/2019 05:42 AM 33,456 pandas_compat.py
08/31/2019 05:42 AM 1,789,952 parquet.dll
08/31/2019 05:42 AM 346,864 parquet.lib
08/31/2019 05:42 AM 52,331 parquet.py
08/31/2019 05:42 AM 5,780 plasma.py
08/31/2019 05:42 AM 8,778 public-api.pxi
08/31/2019 05:42 AM 23,060 scalar.pxi
08/31/2019 05:42 AM 15,427 serialization.pxi
08/31/2019 05:42 AM 12,588 serialization.py
08/31/2019 05:42 AM 46,760 table.pxi
08/31/2019 05:42 AM <DIR> tensorflow
08/31/2019 05:42 AM <DIR> tests
08/31/2019 05:42 AM 48,149 types.pxi
08/31/2019 05:42 AM 6,609 types.py
08/31/2019 05:42 AM 3,549 util.py
08/31/2019 05:42 AM 89,600 zlib.dll
08/31/2019 05:42 AM 106,496 _csv.cp37-win_amd64.pyd
08/31/2019 05:42 AM 493,978 _csv.cpp
08/31/2019 05:42 AM 14,861 _csv.pyx
08/31/2019 05:42 AM 1,934 _cuda.pxd
08/31/2019 05:42 AM 34,567 _cuda.pyx
08/31/2019 05:42 AM 346,112 _flight.cp37-win_amd64.pyd
08/31/2019 05:42 AM 1,518,124 _flight.cpp
08/31/2019 05:42 AM 46,504 _flight.pyx
08/31/2019 05:42 AM 121 _generated_version.py
08/31/2019 05:42 AM 52,736 _json.cp37-win_amd64.pyd
08/31/2019 05:42 AM 311,759 _json.cpp
08/31/2019 05:42 AM 6,413 _json.pyx
08/31/2019 05:42 AM 2,156 _orc.pxd
08/31/2019 05:42 AM 3,670 _orc.pyx
08/31/2019 05:42 AM 281,600 _parquet.cp37-win_amd64.pyd
08/31/2019 05:42 AM 1,352,623 _parquet.cpp
08/31/2019 05:42 AM 17,061 _parquet.pxd
08/31/2019 05:42 AM 44,057 _parquet.pyx
08/31/2019 05:42 AM 27,524 _plasma.pyx
08/31/2019 05:42 AM 1,749 __init__.pxd
08/31/2019 05:42 AM 10,564 __init__.py
08/31/2019 05:42 AM <DIR> __pycache__
77 File(s) 56,030,721 bytes
7 Dir(s) 22,936,629,248 bytes free
>
Kazuaki Ishizaki / @kiszk: If I need to test other configurations, I can test them.
Antoine Pitrou / @pitrou: @kiszk How do you know this would fix the issue, if you didn't manage to reproduce it before?
Kazuaki Ishizaki / @kiszk: At first, I could not reproduce this issue on my Windows 10 notebook, where I have installed multiple applications.
Then I prepared a fresh Windows 10 instance with no additional software installed, ran pyarrow on it, and found that I could reproduce the issue.
After I installed some pip modules and utility software on that instance, the issue no longer occurred.
Next, I created another Windows 10 instance to investigate which step avoids the issue, and checked step by step again. As a result, I found that installing the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019 avoids this issue.
Is this an answer to your question?
Antoine Pitrou / @pitrou:
Is this an answer to your question?
Of course, thank you. So it seems the fix (if someone wants to fix the issue and produce reliable Windows wheels for PyArrow) should be to bundle the CRT DLLs with the wheel. It used to be simple as there were just two such DLLs (msvcrt and msvcp). Nowadays, there are several of them.
Alternatively, just mention that people have to install the redistributables separately. They're useful for a ton of other software anyway.
Kazuaki Ishizaki / @kiszk: I agree with you. There are two possible solutions:
1) Bundle the CRT DLLs (msv*dll) with the wheel
2) Ask users to install the redistributables, by stating this in the documentation
I think that Miniconda3 takes the first solution, since I can find msvcp140*.dll under multiple directories in the miniconda directory.
If we take the first solution, I will be able to identify which DLLs are required.
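Bundling the CRT DLLs with the wheel could amount to copying the runtime files into the package directory before the wheel is built. The following is a minimal sketch of that copy step only; the function name `bundle_dlls` and the file name patterns are assumptions (only `msvcp140*.dll` appears in this thread), and it does not reflect how the Arrow packaging scripts are organized.

```python
import fnmatch
import shutil
import tempfile
from pathlib import Path

# Assumed patterns; the exact set of required runtime DLLs would need to be verified.
CRT_PATTERNS = ("msvcp140*.dll", "vcruntime140*.dll")


def bundle_dlls(runtime_dir, package_dir, patterns=CRT_PATTERNS):
    """Copy runtime DLLs matching patterns into package_dir; return copied names."""
    copied = []
    for src in sorted(Path(runtime_dir).iterdir()):
        name = src.name.lower()  # match case-insensitively, as Windows does
        if src.is_file() and any(fnmatch.fnmatchcase(name, pat) for pat in patterns):
            shutil.copy2(src, Path(package_dir) / src.name)
            copied.append(src.name)
    return copied


# Demonstration with temporary directories standing in for the redistributable
# folder and for site-packages/pyarrow.
with tempfile.TemporaryDirectory() as src_dir, tempfile.TemporaryDirectory() as dst_dir:
    for name in ("msvcp140.dll", "msvcp140_1.dll", "notes.txt"):
        (Path(src_dir) / name).touch()
    copied = bundle_dlls(src_dir, dst_dir)
    bundled = sorted(p.name for p in Path(dst_dir).iterdir())

print(copied)  # ['msvcp140.dll', 'msvcp140_1.dll']
```

Placing the DLLs next to the `.pyd` files relies on Windows searching the directory of the loading module, which is how the wheel's other bundled DLLs such as `zlib.dll` are found.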
Wes McKinney / @wesm: I don't know that installing the redistributable runtime is an acceptable solution. Isn't the CRT distributed with Python 3.7 on Windows? I am curious how other Python wheels (e.g. PyTorch, TensorFlow) address this issue. It might be worth asking them.
Potentially there is a problem with our build environment that is introducing a dependency on a version of the CRT newer than the one distributed with Python from python.org
Antoine Pitrou / @pitrou:
Isn't the CRT distributed with Python 3.7 on Windows?
I don't know. Since the issue popped up on Windows 10 I suppose something is missing. I see two possible explanations:
libstdc++
).
Wes McKinney / @wesm: I see, it's plausible then that something is missing. I took a brief look at the TensorFlow wheels and it looks like they are statically linking everything (including probably the CRT) in a single pyd file
Antoine Pitrou / @pitrou: I don't know. I thought we were bundling the CRT with Numba or llvmlite wheels but I see that's not the case and I cannot find the trace of it in past versions. So perhaps I was mistaken.
The most annoying problem here is probably the obscure error message. If only Microsoft invested a tiny bit of their revenue to improve quality of life for end users and third-party developers...
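Short of a better message from the operating system, a package could translate the obscure error itself. The following is a minimal sketch, assuming the failure surfaces as an `ImportError` whose text contains "DLL load failed" as in the traceback earlier in this thread; the helper names `explain_import_error` and `import_with_hint` are hypothetical and are not part of PyArrow.

```python
import importlib

HINT = (
    "A DLL required by the extension module could not be loaded. On Windows "
    "this often means the Visual C++ Redistributable is not installed."
)


def explain_import_error(exc):
    """Return the error text, extended with a hint for DLL load failures."""
    message = str(exc)
    if "DLL load failed" in message:
        return f"{message}\nHint: {HINT}"
    return message


def import_with_hint(module_name):
    """Import a module, re-raising DLL load failures with a clearer message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(explain_import_error(exc)) from exc


example = explain_import_error(
    ImportError("DLL load failed: The specified module could not be found.")
)
print(example)
```

The hint cannot name the missing DLL, since Windows does not report it, but it points users at the most likely cause identified in this thread.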
Antoine Pitrou / @pitrou: Historical note: apparently I originally encountered the issue with Conda packages, not Python wheels. That wouldn't happen today anymore :) . See https://github.com/ContinuumIO/anaconda-issues/issues/202
Wes McKinney / @wesm: I see, it makes sense then to bundle the DLLs.
Kazuaki Ishizaki / @kiszk: As far as I know, the CRT is not distributed with Python 3.7 on Windows.
> cd \python-3.7.4-embed-amd64
> \cygwin64\bin\find . -name "msv*"
./Lib/site-packages/numpy/distutils/msvc9compiler.py
./Lib/site-packages/numpy/distutils/msvccompiler.py
./Lib/site-packages/numpy/distutils/__pycache__/msvc9compiler.cpython-37.pyc
./Lib/site-packages/numpy/distutils/__pycache__/msvccompiler.cpython-37.pyc
./Lib/site-packages/setuptools/msvc.py
./Lib/site-packages/setuptools/__pycache__/msvc.cpython-37.pyc
> cd \ProgramData\Miniconda3
> \cygwin64\bin\find . -name "msv*"
./Lib/distutils/msvc9compiler.py
./Lib/distutils/msvccompiler.py
./Lib/distutils/__pycache__/msvc9compiler.cpython-37.pyc
./Lib/distutils/__pycache__/msvccompiler.cpython-37.pyc
./Lib/site-packages/setuptools/msvc.py
./Lib/site-packages/setuptools/__pycache__/msvc.cpython-37.pyc
./Library/bin/msvcp140.dll
./Library/bin/msvcp140_1.dll
./Library/bin/msvcp140_2.dll
./msvcp140.dll
./msvcp140_1.dll
./msvcp140_2.dll
./pkgs/python-3.7.3-h8c8aaf0_1/Lib/distutils/msvc9compiler.py
./pkgs/python-3.7.3-h8c8aaf0_1/Lib/distutils/msvccompiler.py
./pkgs/python-3.7.3-h8c8aaf0_1/Lib/distutils/__pycache__/msvc9compiler.cpython-37.pyc
./pkgs/python-3.7.3-h8c8aaf0_1/Lib/distutils/__pycache__/msvccompiler.cpython-37.pyc
./pkgs/setuptools-41.0.1-py37_0/Lib/site-packages/setuptools/msvc.py
./pkgs/setuptools-41.0.1-py37_0/Lib/site-packages/setuptools/__pycache__/msvc.cpython-37.pyc
./pkgs/vs2015_runtime-14.15.26706-h3a45250_4/Library/bin/msvcp140.dll
./pkgs/vs2015_runtime-14.15.26706-h3a45250_4/Library/bin/msvcp140_1.dll
./pkgs/vs2015_runtime-14.15.26706-h3a45250_4/Library/bin/msvcp140_2.dll
./pkgs/vs2015_runtime-14.15.26706-h3a45250_4/msvcp140.dll
./pkgs/vs2015_runtime-14.15.26706-h3a45250_4/msvcp140_1.dll
./pkgs/vs2015_runtime-14.15.26706-h3a45250_4/msvcp140_2.dll
In addition, I remember that conda automatically installs the vs2015_runtime package too, although I am not sure which package has a dependency on vs2015_runtime.
> conda install pyarrow -c conda-forge
...
The following NEW packages will be INSTALLED:
arrow-cpp conda-forge/win-64::arrow-cpp-0.14.1-py37h1b0c03e_0
boost-cpp conda-forge/win-64::boost-cpp-1.70.0-h6a4c333_2
brotli conda-forge/win-64::brotli-1.0.7-he025d50_1000
c-ares conda-forge/win-64::c-ares-1.15.0-h2fa13f4_1001
ca-certificates conda-forge/win-64::ca-certificates-2019.6.16-hecc5488_0
certifi conda-forge/win-64::certifi-2019.6.16-py37_1
double-conversion conda-forge/win-64::double-conversion-3.1.5-h6538335_1
gflags conda-forge/win-64::gflags-2.2.2-he025d50_1001
glog conda-forge/win-64::glog-0.4.0-he025d50_1
grpc-cpp conda-forge/win-64::grpc-cpp-1.23.0-h4d7d3fa_0
intel-openmp pkgs/main/win-64::intel-openmp-2019.4-245
libblas conda-forge/win-64::libblas-3.8.0-12_mkl
libcblas conda-forge/win-64::libcblas-3.8.0-12_mkl
liblapack conda-forge/win-64::liblapack-3.8.0-12_mkl
libprotobuf conda-forge/win-64::libprotobuf-3.8.0-h1a1b453_0
lz4-c conda-forge/win-64::lz4-c-1.8.3-he025d50_1001
mkl pkgs/main/win-64::mkl-2019.4-245
numpy conda-forge/win-64::numpy-1.17.1-py37hc71023c_0
openssl conda-forge/win-64::openssl-1.1.1c-hfa6e2cd_0
pandas conda-forge/win-64::pandas-0.25.1-py37he350917_0
parquet-cpp conda-forge/noarch::parquet-cpp-1.5.1-2
pip conda-forge/win-64::pip-19.2.3-py37_0
pyarrow conda-forge/win-64::pyarrow-0.14.1-py37h803c963_0
python conda-forge/win-64::python-3.7.3-h510b542_1
python-dateutil conda-forge/noarch::python-dateutil-2.8.0-py_0
pytz conda-forge/noarch::pytz-2019.2-py_0
re2 conda-forge/win-64::re2-2019.08.01-vc14h6538335_0
setuptools conda-forge/win-64::setuptools-41.2.0-py37_0
six conda-forge/win-64::six-1.12.0-py37_1000
snappy conda-forge/win-64::snappy-1.1.7-h6538335_1002
sqlite conda-forge/win-64::sqlite-3.29.0-hfa6e2cd_1
thrift-cpp conda-forge/win-64::thrift-cpp-0.12.0-hd042d19_1004
uriparser conda-forge/win-64::uriparser-0.9.3-he025d50_1
vc pkgs/main/win-64::vc-14.1-h0510ff6_4
vs2015_runtime pkgs/main/win-64::vs2015_runtime-14.15.26706-h3a45250_4
wheel conda-forge/win-64::wheel-0.33.6-py37_0
wincertstore conda-forge/win-64::wincertstore-0.2-py37_1002
xz conda-forge/win-64::xz-5.2.4-h2fa13f4_1001
zlib conda-forge/win-64::zlib-1.2.11-h2fa13f4_1005
zstd conda-forge/win-64::zstd-1.4.0-hd8a0e53_0
...
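The `find . -name "msv*"` search used above needs Cygwin and also matches unrelated `.py` files. The following is a minimal Python sketch of the same search restricted to DLLs; the function name `find_runtime_dlls` is an assumption, and the demonstration builds a small throwaway tree rather than scanning a real install.

```python
import tempfile
from pathlib import Path


def find_runtime_dlls(root, prefix="msv"):
    """Recursively list DLLs under root whose file name starts with prefix."""
    hits = []
    for path in Path(root).rglob("*"):
        name = path.name.lower()  # compare case-insensitively, as Windows does
        if path.is_file() and name.startswith(prefix) and name.endswith(".dll"):
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)


# Demonstration on a tiny tree that mimics part of the Miniconda layout above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Library" / "bin").mkdir(parents=True)
    (root / "Lib" / "distutils").mkdir(parents=True)
    (root / "Library" / "bin" / "msvcp140.dll").touch()
    (root / "Lib" / "distutils" / "msvccompiler.py").touch()
    (root / "msvcp140.dll").touch()
    found = find_runtime_dlls(tmp)

print(found)  # ['Library/bin/msvcp140.dll', 'msvcp140.dll']
```

Run against the embeddable Python directory, an empty result would match the observation that no CRT DLLs are shipped there.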
Kazuaki Ishizaki / @kiszk: I think that numba (precisely, llvmlite) has the same problem regarding the CRT DLLs.
A DLL load error occurs before executing vc_redist.x64.exe. The error disappears after executing vc_redist.x64.exe.
> cd \numba\python-3.7.4-embed-amd64
> python -m pip install numba
Collecting numba
...
Installing collected packages: llvmlite, numba
Successfully installed llvmlite-0.29.0 numba-0.45.1
> \cygwin64\bin\find . -name "msv*"
./Lib/site-packages/numpy/distutils/msvc9compiler.py
./Lib/site-packages/numpy/distutils/msvccompiler.py
./Lib/site-packages/numpy/distutils/__pycache__/msvc9compiler.cpython-37.pyc
./Lib/site-packages/numpy/distutils/__pycache__/msvccompiler.cpython-37.pyc
./Lib/site-packages/setuptools/msvc.py
./Lib/site-packages/setuptools/__pycache__/msvc.cpython-37.pyc
> python -c "import numpy"
> \cygwin64\bin\find . -name "msv*"
./Lib/site-packages/numpy/distutils/msvc9compiler.py
./Lib/site-packages/numpy/distutils/msvccompiler.py
./Lib/site-packages/numpy/distutils/__pycache__/msvc9compiler.cpython-37.pyc
./Lib/site-packages/numpy/distutils/__pycache__/msvccompiler.cpython-37.pyc
./Lib/site-packages/setuptools/msvc.py
./Lib/site-packages/setuptools/__pycache__/msvc.cpython-37.pyc
> type Lib\site-packages\numpy\LICENSE.txt
...
Name: Microsoft Visual C++ Runtime Files
Files: extra-dll\msvcp140.dll
License: MSVC
https://www.visualstudio.com/license-terms/distributable-code-microsoft-visual-studio-2015-rc-microsoft-visual-studio-2015-sdk-rc-includes-utilities-buildserver-files/#visual-c-runtime
Subject to the License Terms for the software, you may copy and
distribute with your program any of the files within the followng
folder and its subfolders except as noted below. You may not modify
these files.
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist
You may not distribute the contents of the following folders:
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\debug_nonredist
C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\redist\onecore\debug_nonredist
Subject to the License Terms for the software, you may copy and
distribute the following files with your program in your program's
application local folder or by deploying them into the Global
Assembly Cache (GAC):
VC\atlmfc\lib\mfcmifc80.dll
VC\atlmfc\lib\amd64\mfcmifc80.dll
Name: Microsoft Visual C++ Runtime Files
Files: extra-dll\msvc*90.dll, extra-dll\Microsoft.VC90.CRT.manifest
License: MSVC
For your convenience, we have provided the following folders for
use when redistributing VC++ runtime files. Subject to the license
terms for the software, you may redistribute the folder
(unmodified) in the application local folder as a sub-folder with
no change to the folder name. You may also redistribute all the
files (*.dll and *.manifest) within a folder, listed below the
folder for your convenience, as an entire set.
\VC\redist\x86\Microsoft.VC90.ATL\
atl90.dll
Microsoft.VC90.ATL.manifest
\VC\redist\ia64\Microsoft.VC90.ATL\
atl90.dll
Microsoft.VC90.ATL.manifest
\VC\redist\amd64\Microsoft.VC90.ATL\
atl90.dll
Microsoft.VC90.ATL.manifest
\VC\redist\x86\Microsoft.VC90.CRT\
msvcm90.dll
msvcp90.dll
msvcr90.dll
Microsoft.VC90.CRT.manifest
\VC\redist\ia64\Microsoft.VC90.CRT\
msvcm90.dll
msvcp90.dll
msvcr90.dll
Microsoft.VC90.CRT.manifest
...
> python -c "import numba"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\numba\__init__.py", line 15, in <module>
from . import config, errors, _runtests as runtests, types
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\numba\config.py", line 18, in <module>
import llvmlite.binding as ll
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\llvmlite\binding\__init__.py", line 6, in <module>
from .dylib import *
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\llvmlite\binding\dylib.py", line 4, in <module>
from . import ffi
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\llvmlite\binding\ffi.py", line 154, in <module>
raise OSError("Could not load shared object file: {}".format(_lib_name))
OSError: Could not load shared object file: llvmlite.dll
> python -c "import llvmlite.binding as ll"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\llvmlite\binding\__init__.py", line 6, in <module>
from .dylib import *
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\llvmlite\binding\dylib.py", line 4, in <module>
from . import ffi
File "C:\numba\python-3.7.4-embed-amd64\lib\site-packages\llvmlite\binding\ffi.py", line 154, in <module>
raise OSError("Could not load shared object file: {}".format(_lib_name))
OSError: Could not load shared object file: llvmlite.dll
> bitsadmin /TRANSFER htmlget https://aka.ms/vs/16/release/vc_redist.x64.exe c:\numba\vc_redist.x64.exe
> ../vc_redist.x64.exe
> python -c "import llvmlite.binding as ll"
> python -c "import numba"
>
Wes McKinney / @wesm: OK, let's resolve this issue by adding documentation about installing Visual C++ Redistributable?
Kazuaki Ishizaki / @kiszk: IIUC, there is already a paragraph here that suggests installing the Visual C++ Redistributable. Do we need to update the link and add an example of the failure?
If you encounter any importing issues of the pip wheels on Windows, you may need to install the Visual C++ Redistributable for Visual Studio 2015.
Finally (beyond 0.15?), is bundling the DLLs the best solution? Since I am still studying how to package Windows wheels for the release, it will take me some time to find a way to bundle the DLLs. If there is documentation for this, it would help me.
Wes McKinney / @wesm: In the short term we need to add documentation here
https://github.com/apache/arrow/blob/master/python/README.md
I'm submitting a PR
Wes McKinney / @wesm: I submitted a documentation PR and removed this from the release milestone for now
Kazuaki Ishizaki / @kiszk: I see. I will investigate how to bundle DLLs beyond 0.15.
Krisztian Szucs / @kszucs: This should be resolved by https://github.com/apache/arrow/pull/5404 and we can confirm that it works during the 0.15 release verification.
When installing pyarrow 0.14.1 on windows 10 x64 with python 3.7, you get:
On 0.14.0 everything works fine.
Reporter: Ruslan Kuprieiev / @efiop Assignee: Krisztian Szucs / @kszucs
Note: This issue was originally created as ARROW-6015. Please see the migration documentation for further details.