Closed JTShuai closed 1 month ago
@JTShuai can you confirm you can use CUDA in another independent container like l4t-jetpack or l4t-pytorch, and without all the extra docker run flags you added like --privileged, etc.?
Hi, I tried docker run --runtime nvidia -it --rm --network=host dustynv/l4t-pytorch:r35.4.1
and got the error:
docker: Error response from daemon: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: error adding seccomp filter rule for syscall clone3: permission denied: unknown.
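The "error adding seccomp filter rule for syscall clone3: permission denied" failure is commonly seen when the host's libseccomp predates the clone3 syscall definition, so the runtime cannot build the filter. As a rough diagnostic, you can compare the installed version against a minimum; note that both the diagnosis and the 2.4.2 cutoff are assumptions here, not something confirmed in this thread, and `version_lt` is a made-up helper:

```shell
# version_lt A B: true if version A sorts strictly before version B
# (uses GNU sort -V natural version ordering)
version_lt() {
    [ "$1" != "$2" ] && [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# On the Jetson itself you would feed in the real installed version, e.g.:
#   installed=$(dpkg-query -W -f='${Version}' libseccomp2 | cut -d- -f1)
# Here 2.4.2 is an assumed minimum for clone3 support:
if version_lt "2.3.1" "2.4.2"; then
    echo "libseccomp too old for clone3"
fi
```

If the installed version is older, upgrading libseccomp2 (or, as below, pinning Docker itself back) is the usual way out.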
@dusty-nv I just noticed the comments:
Container images are compatible with other minor versions of JetPack/L4T:
• L4T R32.7 containers can run on other versions of L4T R32.7 (JetPack 4.6+)
• L4T R35.x containers can run on other versions of L4T R35.x (JetPack 5.1+)
So I tried docker run --runtime nvidia -it --rm --network=host dustynv/l4t-pytorch:r32.7.1, but got the same error.
Hi @JTShuai - had you recently done an apt upgrade? With that "adding seccomp filter rule for syscall" error, it sounds like the same problem as this one:
https://forums.developer.nvidia.com/t/docker-containers-wont-run-after-recent-apt-get-upgrade/194369
Thanks for your help! I tried the following commands you wrote in #108
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install nvidia-docker2=2.8.0-1
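After reinstalling nvidia-docker2, it may be worth confirming that the nvidia runtime is actually registered before retrying the container. A minimal sketch; the /etc/docker/daemon.json path is the JetPack default, and `has_runtime` is a made-up helper doing a crude grep rather than a real JSON parse:

```shell
# has_runtime FILE NAME: succeed if NAME appears as a quoted token in FILE.
# Crude string match on the daemon config, not a proper JSON parser.
has_runtime() {
    grep -q "\"$2\"" "$1"
}

# On the Jetson (default daemon config path assumed):
#   has_runtime /etc/docker/daemon.json nvidia && echo "nvidia runtime configured"
```

`docker info` would give a more authoritative answer, but this works without the daemon running.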
Now I can enter the container with the command docker run --runtime nvidia -it --rm --network=host dustynv/l4t-pytorch:r32.7.1, but I got a new error with PyTorch:
root@tx2-4:/# python3
Python 3.6.9 (default, Mar 10 2023, 16:46:00)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 196, in <module>
_load_global_deps()
File "/usr/local/lib/python3.6/dist-packages/torch/__init__.py", line 149, in _load_global_deps
ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory
>>>
@JTShuai on JetPack 4, CUDA/cuDNN/TensorRT are mounted into the containers from the host device when --runtime nvidia is used. You should have that libcurand.so.10 under /usr/local/cuda/lib64. If you keep having problems with this, I might recommend reflashing your SD card given all the issues you have had with docker. Then try again after a fresh re-install, without doing the apt upgrade.
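Since on JetPack 4 those CUDA libraries are mounted in from the host, a quick first check is whether the library exists on the host at all. A small sketch; `find_lib` is a made-up helper and the CUDA path is the JetPack default:

```shell
# find_lib DIR PREFIX: succeed if any file matching PREFIX* exists under DIR
find_lib() {
    ls "$1"/"$2"* >/dev/null 2>&1
}

# On the Jetson host (default CUDA install path assumed):
#   find_lib /usr/local/cuda/lib64 libcurand.so && echo present || echo missing
```

If the file is present on the host but still missing inside the container, the problem is on the container-runtime side rather than the CUDA install.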
Hi, I manually downgraded Docker to docker.io=20.10.7-0ubuntu1~18.04.2 and containerd=1.5.2-0ubuntu1~18.04.3, and I checked that libcurand.so.10 is under /usr/local/cuda/lib64.
Still getting the same error, so I will try reflashing the SD card.
Problem solved after reflashing the TX2.