NVIDIA / DALI

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
https://docs.nvidia.com/deeplearning/dali/user-guide/docs/index.html
Apache License 2.0

Dependency issues of libdali.so and backend_impl.so #1024

Open ruiyuanlu opened 5 years ago

ruiyuanlu commented 5 years ago

Hi, I ran into a strange issue after compiling DALI from scratch on Ubuntu 18.04 with Anaconda Python 3.6.

I created a {build_root} for the build, and no error occurred during compilation. But as soon as I rename or delete the "{build_root}/dali/python/nvidia/dali/libdali.so" file, I get the following error when executing the import:

python -c "import nvidia.dali"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/room/anaconda3/lib/python3.6/site-packages/nvidia/dali/__init__.py", line 17, in <module>
    from . import ops
  File "/home/room/anaconda3/lib/python3.6/site-packages/nvidia/dali/ops.py", line 19, in <module>
    from nvidia.dali import backend as b
  File "/home/room/anaconda3/lib/python3.6/site-packages/nvidia/dali/backend.py", line 15, in <module>
    from nvidia.dali.backend_impl import *
ImportError: libdali.so: cannot open shared object file: No such file or directory

The Python site-packages directory has the correct libdali.so file, but it seems that "backend_impl.cpython-36m-x86_64-linux-gnu.so" only finds "libdali.so" in {build_root}, not in the Python site-packages directory. I guess there might be a dependency issue here?
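One way to check where the extension module looks for libdali.so is to read its ELF dynamic section. The following is a minimal sketch, assuming `readelf` from binutils is installed; the helper names `parse_dynamic_section` and `inspect_library` are hypothetical and not part of DALI.

```python
import re
import subprocess

# Matches lines printed by `readelf -d`, for example:
#  0x0000000000000001 (NEEDED)   Shared library: [libdali.so]
#  0x000000000000001d (RUNPATH)  Library runpath: [/some/dir]
_DYN_ENTRY = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+.*?\[(.*)\]")


def parse_dynamic_section(readelf_output):
    """Collect NEEDED, RPATH and RUNPATH values from `readelf -d` text."""
    entries = {"NEEDED": [], "RPATH": [], "RUNPATH": []}
    for line in readelf_output.splitlines():
        match = _DYN_ENTRY.search(line)
        if not match:
            continue
        tag, value = match.groups()
        if tag == "NEEDED":
            entries[tag].append(value)
        else:
            # RPATH/RUNPATH hold a colon-separated list of directories.
            entries[tag].extend(p for p in value.split(":") if p)
    return entries


def inspect_library(path):
    """Run `readelf -d` on a shared object and parse the result."""
    result = subprocess.run(
        ["readelf", "-d", path],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_dynamic_section(result.stdout)
```

Calling `inspect_library` on the installed backend_impl .so should list libdali.so under NEEDED; if the RPATH or RUNPATH entries point into {build_root}, that would explain why the copy in site-packages is not picked up.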

Any tips?

JanuszL commented 5 years ago

Hi, yes, the backend will have a path set to the libdali.so located in your build directory. When we create a standalone wheel package we use bundle-wheel.sh, which patches the library RPATH. I think you should go through this script and check how you need to adjust the build process in your case (we do a couple of additional things there, like adding a hash value to the library name to avoid collisions and adding dependent libraries to the actual wheel).
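As a rough illustration of what patching the RPATH means, the sketch below builds patchelf command lines that point a library's search path at its own directory via `$ORIGIN` (which the dynamic linker expands to the directory containing the loaded object). This is an assumption about one reasonable approach, not a copy of what bundle-wheel.sh does; the function names are hypothetical.

```python
import subprocess


def rpath_commands(library, rpath="$ORIGIN"):
    """Return patchelf command lines that show, set, and re-show the RPATH."""
    return [
        ["patchelf", "--print-rpath", library],
        ["patchelf", "--set-rpath", rpath, library],
        ["patchelf", "--print-rpath", library],
    ]


def patch_rpath(library, rpath="$ORIGIN", dry_run=True):
    """Print each command and run it unless dry_run is True."""
    for cmd in rpath_commands(library, rpath):
        print(" ".join(cmd))
        if not dry_run:
            # No shell is involved, so "$ORIGIN" reaches patchelf literally.
            subprocess.run(cmd, check=True)
```

With `dry_run=True` (the default) nothing is modified, which makes it safe to review the commands before applying them to the backend_impl .so inside site-packages.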

ruiyuanlu commented 5 years ago

@JanuszL Thanks for your reply. Maybe this could be added to the README or the installation guide? I think installing from a standalone wheel might be the preferred way for many users, but it makes the install process more complicated.

More specifically, three extra steps are needed:

  1. The prefix might need to be set manually ("/usr", "/usr/local", etc.).

  2. patchelf, which bundle-wheel.sh requires, needs to be pre-installed.

  3. The wheel that bundle-wheel.sh operates on must be built manually before calling bundle-wheel.sh.
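The three steps above could be checked up front with a small preflight script. This is only a sketch: the `preflight` helper and its arguments are hypothetical, and how bundle-wheel.sh actually receives the prefix or the wheel path is not covered here.

```python
import glob
import os
import shutil


def preflight(prefix, wheel_dir, which=shutil.which):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Step 1: the dependency prefix (e.g. /usr or /usr/local) must exist.
    if not os.path.isdir(prefix):
        problems.append("prefix directory does not exist: " + prefix)
    # Step 2: patchelf must be available on PATH.
    if which("patchelf") is None:
        problems.append("patchelf not found on PATH")
    # Step 3: a wheel must already have been built.
    if not glob.glob(os.path.join(wheel_dir, "*.whl")):
        problems.append("no .whl file found in " + wheel_dir)
    return problems
```

For example, `preflight("/usr/local", "dist")` returns the list of unmet prerequisites, so the bundling step can be skipped with a clear message instead of failing halfway through.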

JanuszL commented 5 years ago

@ruiyuanlu It is a good idea. I just hesitate to provide that many details in the documentation, as the recommended way to build DALI is to use the Docker image. Basically, bundle-wheel.sh assumes that libraries are at a certain location so they can be bundled into the wheel later. The Ubuntu patchelf doesn't work properly (the one from manylinux is fine), and there are probably other caveats that I have missed. Also, most of the time the libraries on the system are dynamic, while DALI relies on them being static (if the CMake find_package functions return a dynamic library, the wheel will not bundle it and the wheel won't work). So I would say that the Docker way is the one tested and recommended by us. Still, I would love to see any external contribution improving the build process for environments other than the Docker one we have prepared (there are a lot of different library and tool configurations and dependencies that are hard to support).
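The static-versus-dynamic caveat can be made concrete with a small check: given the NEEDED names of a library (for instance as printed by `readelf -d`) and the file names shipped in the wheel, report anything that is neither bundled nor expected on the host. This is a sketch under assumptions; the `unbundled` helper is hypothetical and the list of host-provided libraries is a guess that would need adjusting.

```python
import fnmatch

# Assumption: libraries expected to come from the host system, not the wheel.
SYSTEM_PATTERNS = (
    "libc.so*", "libm.so*", "libdl.so*", "libpthread.so*",
    "librt.so*", "libgcc_s.so*", "libstdc++.so*", "ld-linux*.so*",
)


def unbundled(needed, bundled, system_patterns=SYSTEM_PATTERNS):
    """Return NEEDED names that are neither in the wheel nor host-provided."""
    bundled = set(bundled)
    missing = []
    for name in needed:
        if name in bundled:
            continue
        if any(fnmatch.fnmatch(name, pat) for pat in system_patterns):
            continue
        missing.append(name)
    return missing
```

A non-empty result suggests that some dependency was linked dynamically and left out of the wheel, which is the failure mode described above for builds done outside the Docker image.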