aws / sagemaker-mxnet-inference-toolkit

Toolkit for allowing inference and serving with MXNet in SageMaker. Dockerfiles used for building SageMaker MXNet Containers are at https://github.com/aws/deep-learning-containers.
Apache License 2.0

libcuda.so.1 not found #112

Closed by muhyun 4 years ago

muhyun commented 4 years ago

$ cat VERSION
1.3.3.dev0

I built an image using docker/1.6.0/py3/Dockerfile.gpu, ran it, and then logged into the container. Importing mxnet inside the container fails because libcuda.so.1 cannot be found:

root@89baff5fc5bd:/# pip list|grep mxnet
aws-mxnet-cu101mkl        1.6.0
keras-mxnet               2.2.4.1
mxnet-model-server        1.0.8
sagemaker-mxnet-inference 1.3.2
root@89baff5fc5bd:/# python
Python 3.6.8 (default, Apr  2 2020, 09:19:59)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import mxnet as mx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/mxnet/__init__.py", line 24, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "/usr/local/lib/python3.6/site-packages/mxnet/context.py", line 24, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "/usr/local/lib/python3.6/site-packages/mxnet/base.py", line 214, in <module>
    _LIB = _load_lib()
  File "/usr/local/lib/python3.6/site-packages/mxnet/base.py", line 205, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "/usr/local/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcuda.so.1: cannot open shared object file: No such file or directory
>>>

The version

muhyun commented 4 years ago

Starting the container with the --gpus all option (docker run --gpus all) resolves this.
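
For anyone hitting the same error: libcuda.so.1 belongs to the host's NVIDIA driver and is normally exposed to the container at start time rather than baked into the image, which would explain why a container started without GPU access cannot load it. Below is a minimal sketch of a check to run inside the container before importing mxnet. The helper name cuda_driver_available is hypothetical and not part of the toolkit; it only attempts the same ctypes load that fails in the traceback above.

```python
import ctypes


def cuda_driver_available(lib_name="libcuda.so.1"):
    """Return True if the dynamic loader can open the CUDA driver library.

    Hypothetical helper for illustration only; it is not part of
    sagemaker-mxnet-inference or mxnet.
    """
    try:
        # Same mechanism mxnet relies on when loading its shared library:
        # a missing dependency surfaces as OSError from ctypes.
        ctypes.CDLL(lib_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    if cuda_driver_available():
        print("libcuda.so.1 is loadable; GPU driver libraries are visible.")
    else:
        print("libcuda.so.1 is not loadable; the container was probably "
              "started without GPU access (for example, without --gpus all).")
```

If the check reports that the library is not loadable, restarting the container with docker run --gpus all is the first thing to try, as noted above.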