apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

Import mxnet for mxnet-cu92 fails #17887

Open ghost opened 4 years ago

ghost commented 4 years ago

Description

CUDA 9.2 is installed, and mxnet-cu92 was installed with pip, but import mxnet fails. When I install mxnet or mxnet-mkl instead, the import works. I am using Python 3.8.

Error Message

Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\.env38\lib\site-packages\mxnet\__init__.py", line 24, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "C:\.env38\lib\site-packages\mxnet\context.py", line 24, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "C:\.env38\lib\site-packages\mxnet\base.py", line 213, in <module>
    _LIB = _load_lib()
  File "C:\.env38\lib\site-packages\mxnet\base.py", line 204, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "C:\Users\arnau\AppData\Local\Programs\Python\Python38\lib\ctypes\__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\.env38\lib\site-packages\mxnet\libmxnet.dll' (or one of its dependencies). Try using the full path with constructor syntax.

To Reproduce

  1. Windows 10.
  2. Installed Python 3.8
  3. Installed mkl 2020
  4. Installed CUDA 9.2. nvcc reports: NVIDIA (R) Cuda compiler driver, Copyright (c) 2005-2018 NVIDIA Corporation, Built on Tue_Jun_12_23:08:12_Central_Daylight_Time_2018, Cuda compilation tools, release 9.2, V9.2.148
  5. Installed mxnet: pip install mxnet-cu92

What have you tried to solve it?

  1. Other combinations: CUDA 10.1 with mxnet-cu101, CUDA 9.2 with mxnet-cu92mkl...
  2. Adding the CUDA /bin directory to PATH explicitly

Environment

----------Python Info----------
Version      : 3.8.2
Compiler     : MSC v.1916 64 bit (AMD64)
Build        : ('tags/v3.8.2:7b3ab59', 'Feb 25 2020 23:03:10')
Arch         : ('64bit', 'WindowsPE')
------------Pip Info-----------
Version      : 20.0.2
Directory    : C:\.env38\lib\site-packages\pip
----------MXNet Info-----------
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform     : Windows-10-10.0.18362-SP0
system       : Windows
node         : MSI
release      : 10
version      : 10.0.18362
----------Hardware Info----------
machine      : AMD64
processor    : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
Name
Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0230 sec, LOAD: 0.7210 sec.
Timing for GluonNLP GitHub: https://github.com/dmlc/gluon-nlp, DNS: 0.0000 sec, LOAD: 0.5110 sec.
Timing for GluonNLP: http://gluon-nlp.mxnet.io, DNS: 0.0602 sec, LOAD: 0.4753 sec.
Timing for D2L: http://d2l.ai, DNS: 0.0200 sec, LOAD: 0.1508 sec.
Timing for D2L (zh-cn): http://zh.d2l.ai, DNS: 0.0229 sec, LOAD: 0.2308 sec.
Timing for FashionMNIST: https://repo.mxnet.io/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0668 sec, LOAD: 0.4193 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0212 sec, LOAD: 0.4935 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0192 sec, LOAD: 0.1854 sec.

Thanks a lot for your help.

yajiedesign commented 4 years ago

I'm not sure which version you installed. If it is the latest version, use Depends to check that mxnet_xx.dll.xx matches your GPU's compute capability.

ghost commented 4 years ago

I have a GTX 1070. It actually used to work; I re-installed and now I am stuck. However, when I run Depends, there are in fact warnings. Strangely, if I explicitly load the CUDA DLLs first, I can import mxnet:

from ctypes import cdll

# Load each CUDA dependency by full path, then libmxnet.dll itself.
# Raw strings keep the backslashes literal ("\N" in a normal string is a syntax error).
mydll = cdll.LoadLibrary(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\cudnn64_7.dll")
mydll = cdll.LoadLibrary(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\cublas64_92.dll")
mydll = cdll.LoadLibrary(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\cufft64_92.dll")
mydll = cdll.LoadLibrary(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\cusolver64_92.dll")
mydll = cdll.LoadLibrary(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\curand64_92.dll")
mydll = cdll.LoadLibrary(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin\nvrtc64_92.dll")

mydll = cdll.LoadLibrary(r"C:\.env38\lib\site-packages\mxnet\libmxnet.dll")

CUDA being in my environment path variable, I don't get why the DLLs are not found. Could it be because I use a Virtual Env? Does not make sense, right?
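
One possible explanation, offered as a sketch rather than a confirmed diagnosis for this issue: starting with Python 3.8 on Windows, PATH is no longer searched when resolving the dependencies of DLLs loaded through ctypes or of extension modules, and `os.add_dll_directory()` is the documented way to add a search directory. The helper name `register_dll_dirs` and the `CUDA_BIN` constant below are hypothetical; the path is the default CUDA 9.2 location used elsewhere in this thread.

```python
import os

def register_dll_dirs(dirs):
    """Register each existing directory for DLL dependency lookup.

    Returns the directories that were registered. os.add_dll_directory only
    exists on Windows with Python 3.8+, so elsewhere nothing is registered.
    """
    add_dir = getattr(os, "add_dll_directory", None)
    registered = []
    for directory in dirs:
        # add_dll_directory raises for missing directories, so check first.
        if add_dir is not None and os.path.isdir(directory):
            add_dir(os.path.abspath(directory))
            registered.append(directory)
    return registered

# Assumption: default CUDA 9.2 install location, as seen in this thread.
CUDA_BIN = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin"

if __name__ == "__main__":
    print(register_dll_dirs([CUDA_BIN]))
    # "import mxnet" would follow here, after the directory is registered.
```

If this is the cause, calling the helper before `import mxnet` should make copying DLLs unnecessary; whether it does so for a given MXNet build is untested here.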

yajiedesign commented 4 years ago

A virtual env should not cause problems. You can print the environment variables inside the virtual env to check.
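
A minimal sketch of that check, assuming the goal is to see which CUDA-related PATH entries the interpreter inside the virtual env actually sees. The function name `cuda_path_entries` and the `"cuda"` keyword filter are illustrative choices, not anything from MXNet.

```python
import os

def cuda_path_entries(path_value, keyword="cuda"):
    """Return (entry, exists) pairs for PATH entries containing `keyword`."""
    entries = [e for e in path_value.split(os.pathsep) if e]
    return [
        (entry, os.path.isdir(entry))
        for entry in entries
        if keyword.lower() in entry.lower()
    ]

if __name__ == "__main__":
    for entry, exists in cuda_path_entries(os.environ.get("PATH", "")):
        print(("ok       " if exists else "missing  ") + entry)
```

Running it from the activated environment shows whether the CUDA directories are present in PATH and exist on disk.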

ghost commented 4 years ago

OK, I think I understand one thing from running Depends on libmxnet.dll. libmxnet.dll is looking for CUFFT64_92.DLL, CUBLA... etc. in the local site-packages/mxnet/ directory, instead of the CUDA /bin directory. I still don't get it, as the CUDA /bin path is set in %path%. If I copy the DLLs (CUBLAS64_92, CUFFT64_92, CUSOLVER64_92, CURAND64_92, NVRTC64_92), the import of mxnet succeeds and a little nd array test on the GPU works. Any idea what may be happening? Thank you.
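
As a rough way to confirm which of those DLLs are absent from the package directory, one could list them programmatically. This is only a sketch: `missing_dlls` is a hypothetical helper, and the DLL names and package path are the ones mentioned in this thread.

```python
import os

def missing_dlls(names, directory):
    """Return the names that have no matching file in `directory`.

    The comparison is case-insensitive, mirroring Windows file lookup.
    """
    present = {entry.lower() for entry in os.listdir(directory)}
    return [name for name in names if name.lower() not in present]

# DLL names taken from the comment above (CUDA 9.2 naming).
CUDA_DLLS = [
    "cublas64_92.dll",
    "cufft64_92.dll",
    "cusolver64_92.dll",
    "curand64_92.dll",
    "nvrtc64_92.dll",
]

if __name__ == "__main__":
    package_dir = r"C:\.env38\lib\site-packages\mxnet"  # path from this thread
    if os.path.isdir(package_dir):
        print(missing_dlls(CUDA_DLLS, package_dir))
```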

ghost commented 4 years ago

I keep going around in circles on an issue with PATH in Windows, and something really bugs me. The path to the CUDA DLLs is there, but the DLL does not load by name. If I copy the exact path shown in the Python environment's PATH, the DLL loads...

(.env38) C:\.env38\Scripts>python
Python 3.8.2 (tags/v3.8.2:7b3ab59, Feb 25 2020, 23:03:10) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ
environ({'ALLUSERSPROFILE': 'C:\\ProgramData', 'APPDATA': 'C:\\Users\\arnau\\AppData\\Roaming', 'CAMLIBS': 'C:\\Program Files\\darktable\\lib\\libgphoto2\\2.5.23', [...] 'PATH': 'C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin;C:\\.env38\\Scripts;[...]})
>>> from ctypes import*
>>> mydll = cdll.LoadLibrary("cudnn64_7.dll")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\arnau\AppData\Local\Programs\Python\Python38\lib\ctypes\__init__.py", line 451, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\arnau\AppData\Local\Programs\Python\Python38\lib\ctypes\__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'cudnn64_7.dll' (or one of its dependencies). Try using the full path with constructor syntax.
>>> mydll = cdll.LoadLibrary("C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin\\cudnn64_7.dll")
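
To separate "the file is not on PATH" from "the loader is not consulting PATH", one could search PATH directly, independent of ctypes. A small sketch; `find_on_path` is a hypothetical helper name.

```python
import os

def find_on_path(filename, path_value=None):
    """Return every PATH directory that contains a file named `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits

if __name__ == "__main__":
    # A non-empty result while LoadLibrary("cudnn64_7.dll") still fails would
    # suggest the loader is simply not using PATH for this lookup.
    print(find_on_path("cudnn64_7.dll"))
```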

Xenos24R commented 4 years ago

I hit the same error. Have you solved it?

ghost commented 4 years ago

Hi Xenos24R, no, I have not really found a perfect solution. I use a virtual env, so I wrote a little .bat script that basically copies the DLLs to the Python package directory:

:begin
@echo off
@echo copy cuda libs
call copy /y "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin\\cublas64_92.dll" "C:\\.env38\\Lib\\site-packages\\mxnet"
call copy /y "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin\\curand64_92.dll" "C:\\.env38\\Lib\\site-packages\\mxnet"
call copy /y "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin\\cufft64_92.dll" "C:\\.env38\\Lib\\site-packages\\mxnet"
call copy /y "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin\\cusolver64_92.dll" "C:\\.env38\\Lib\\site-packages\\mxnet"
call copy /y "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v9.2\\bin\\nvrtc64_92.dll" "C:\\.env38\\Lib\\site-packages\\mxnet"
@echo activate python env
call C:\.env38\Scripts\activate.bat

Not ideal, but functional! AL
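
For anyone who prefers to do the same copy step from Python, the script above could be sketched roughly as follows. `copy_dlls` is a hypothetical helper; the DLL list and both directories are the ones from the .bat script, and files that are not found are skipped rather than treated as errors.

```python
import os
import shutil

def copy_dlls(names, src_dir, dst_dir):
    """Copy each named file from src_dir to dst_dir; return the names copied."""
    copied = []
    for name in names:
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            shutil.copy2(src, os.path.join(dst_dir, name))  # overwrites, like copy /y
            copied.append(name)
    return copied

if __name__ == "__main__":
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin"
    package_dir = r"C:\.env38\Lib\site-packages\mxnet"
    names = ["cublas64_92.dll", "curand64_92.dll", "cufft64_92.dll",
             "cusolver64_92.dll", "nvrtc64_92.dll"]
    if os.path.isdir(cuda_bin) and os.path.isdir(package_dir):
        print(copy_dlls(names, cuda_bin, package_dir))
```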

Xenos24R commented 4 years ago

Hi alinagithub, thank you for your reply. I think the problem may have been caused by a file path error in some configuration files, due to having both the GPU and CPU versions of MXNet installed, and I solved the problem by creating a virtual environment.

howff commented 4 years ago

The same problem exists when I install mxnet_cu102-1.6.0 into Python38 on Windows via the wheel mxnet_cu102-1.6.0-py2.py3-none-win_amd64.whl (if you can't get this from pip install, you can get it from dist.mxnet.io/python).

It gives the error "Could not find module C:\python38\lib\site-packages\mxnet\libmxnet.dll".

Manually copying the DLLs from the Program Files\NVIDIA GPU Computing Toolkit\CUDA directory solves it.

nikky4D commented 4 years ago

@howff I am having the same issue. Can you tell me what your setup is on Windows? Also, what pip install command did you use to install mxnet for CUDA 10.2?

howff commented 4 years ago

@nikky4D I installed Python into c:\Python38 and installed CUDA 10.2 from NVIDIA. Then I downloaded the latest nightly version of mxnet from pip, which actually turned out to be mxnet-1.6.0 as listed here: https://dist.mxnet.io/python. The URL is https://repo.mxnet.io/dist/python/cu102/mxnet_cu102-1.6.0-py2.py3-none-win_amd64.whl. Download that and pip install it. If you get an error saying it cannot load libmxnet.dll when you try to use it, that is usually caused by a missing dependency. I didn't try too hard to find a proper solution, but I proved that it can be made to work by manually copying the CUDA .dll files from Program Files into Python38 site-packages\mxnet (I don't have the VM running, so I can't tell you the exact paths).
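
Since the error message does not say which dependency is missing, one rough way to narrow it down is to try loading each candidate library by full path and record the failures. This is a sketch only; `try_load` is a hypothetical helper, and which libraries to probe is left to the reader.

```python
import ctypes

def try_load(paths):
    """Try to load each library by full path.

    Returns a dict mapping each path to None on success, or to the error
    text when loading fails.
    """
    results = {}
    for path in paths:
        try:
            ctypes.CDLL(path)
            results[path] = None
        except OSError as exc:  # FileNotFoundError is a subclass of OSError
            results[path] = str(exc)
    return results

if __name__ == "__main__":
    # Path taken from the error message earlier in this thread.
    for path, error in try_load([r"C:\python38\lib\site-packages\mxnet\libmxnet.dll"]).items():
        print(path, "->", "loaded" if error is None else error)
```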

nikky4D commented 4 years ago

@howff Thank you for the info. I'll try the manual copy and hopefully that will work.

Update: It worked. I copied all the .dll files in NVIDIA COMPUTING TOOLKIT/v10.2/bin to env/lib/site-packages/mxne. I did not know which ones would be needed, so I copied all of them.

lleen12 commented 2 years ago

Hi alinagithub, I have the same problem, but I can't solve it with your answer. Can you help me? Thank you very much! Here is my traceback:

Traceback (most recent call last):
  File "D:\PyCharm_Project\ML_data.py", line 23, in <module>
    from mxnet import np
  File "D:\Program Files (x86)\python3.9.4\lib\site-packages\mxnet\__init__.py", line 25, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "D:\Program Files (x86)\python3.9.4\lib\site-packages\mxnet\context.py", line 23, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "D:\Program Files (x86)\python3.9.4\lib\site-packages\mxnet\base.py", line 351, in <module>
    _LIB = _load_lib()
  File "D:\Program Files (x86)\python3.9.4\lib\site-packages\mxnet\base.py", line 342, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "D:\Program Files (x86)\python3.9.4\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'D:\Program Files (x86)\python3.9.4\lib\site-packages\mxnet\libmxnet.dll' (or one of its dependencies). Try using the full path with constructor syntax.

Process finished with exit code 1

@alinagithub