dusty-nv / jetson-containers

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
MIT License

Missing CUDA shared object file #327

Open kaarmu opened 9 months ago

kaarmu commented 9 months ago

Hi,

I have a ZED Box (Orin NX 16GB, r35.3) and I'm trying to run ./build.sh ros:noetic-ros-base zed torchvision, but the build fails.

I have found that the build fails whenever the zed and torch packages are combined. I run into the following problems.

./build.sh zed torch gives me an error similar to #325 (I also tried changing ONNX_VERSION, without success). However, if I instead run ./build.sh torch zed, then I get:

testing PyTorch...
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 168, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvToolsExt.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 3, in <module>
    import torch
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 228, in <module>
    _load_global_deps()
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 189, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/usr/local/lib/python3.8/dist-packages/torch/__init__.py", line 154, in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path ['/test', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.8/dist-packages']

I also got this error a few weeks ago but didn't have time to look into it then.
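To narrow down which CUDA libraries the dynamic loader cannot resolve inside the container, a small probe along the lines below may help. It is only a sketch: it repeats the ctypes.CDLL call that appears in the traceback, but without importing torch. The helper name probe_libraries is hypothetical, and the libcublas.so.11 soname is an assumption for this JetPack release; only libnvToolsExt.so.1 is taken from the error message.

```python
import ctypes


def probe_libraries(names):
    """Try to load each shared library by name.

    Returns a dict mapping each name to None if it loaded,
    or to the loader's error message if it did not.
    """
    results = {}
    for name in names:
        try:
            # Same call that fails in torch's _load_global_deps in the traceback.
            ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    # libnvToolsExt.so.1 is from the traceback; libcublas.so.11 is an assumed soname.
    for name, err in probe_libraries(["libnvToolsExt.so.1", "libcublas.so.11"]).items():
        status = "OK" if err is None else f"MISSING ({err})"
        print(f"{name}: {status}")
```

Running this in the image produced by ./build.sh torch zed and again in an image built from torch alone should show whether the libraries disappear once the zed package is added, or whether only the library search path changes.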

You can find the full logs here: