ethz-asl / nvblox_ros1

ROS1 wrappers for GPU-accelerated volumetric mapping with nvblox.
https://developer.nvidia.com/isaac-ros-gems
Apache License 2.0
58 stars, 15 forks

Version cuda error in docker #15

will-44 closed 5 months ago

will-44 commented 5 months ago

Hello!

I'm running into a CUDA-related issue when using your package inside a Docker container. The error message is:

CUDA error = 35 at /root/nvblox_ws/src/nvblox_ros1/nvblox/nvblox/include/nvblox/core/internal/impl/unified_ptr_impl.h:48 'cudaMallocHost(&cuda_ptr, sizeof(T))'. Error string: CUDA driver version is insufficient for CUDA runtime version.

The CUDA version inside my Docker container is 11.8:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

The CUDA version on my laptop (the host) matches: it is also 11.8, with NVIDIA driver 520. I'm running Ubuntu 20.04 on an x86 architecture.

Do you have any idea what causes this? Thank you very much for your help!

ctampier commented 5 months ago

+1 on this. The error is thrown when running the example launch file (roslaunch nvblox_ros nvblox_ros_panopt.launch rviz:=true) inside Docker.

I also got a warning earlier, the first time I ran the container right after building it: WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available. Use the NVIDIA Container Toolkit to start this container with GPU support; see https://docs.nvidia.com/datacenter/cloud-native/ .

From this I understand that the container doesn't have access to the NVIDIA driver. The driver is installed on the host machine, by the way, and nvidia-smi works just fine there.
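One way to confirm this diagnosis is to run the same probe on the host and then inside the container. The sketch below is a hypothetical helper (gpu_visible is not part of nvblox_ros1). It assumes nvidia-smi is a suitable probe and accepts another command name as its first argument.

```shell
# Hypothetical helper: report whether the NVIDIA driver is reachable
# from the current environment (host or container).
# $1: probe command to run (assumed default: nvidia-smi)
gpu_visible() {
  if command -v "${1:-nvidia-smi}" >/dev/null 2>&1 \
     && "${1:-nvidia-smi}" >/dev/null 2>&1; then
    echo "GPU driver visible"
  else
    echo "GPU driver NOT visible"
  fi
}

# Usage (not executed here): call gpu_visible on the host,
# then again in a shell inside the container, and compare.
```

If the host reports the driver as visible and the container does not, the container was most likely started without GPU access, which matches the warning above.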

ctampier commented 5 months ago

As stated in the warning message, installing and configuring the NVIDIA Container Toolkit fixes the issue. You also need to add the options --runtime=nvidia --gpus all to the docker run command at the end of the run_docker.sh file.
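The steps above can be sketched roughly as follows. This is a minimal sketch, not the repository's own setup script: it assumes an Ubuntu host where NVIDIA's apt repository for the Container Toolkit is already configured, and the function names (run, setup_nvidia_toolkit) and the DRY_RUN switch are hypothetical. The exact contents of run_docker.sh are not shown in this thread, so the final docker run line appears only as a comment with a placeholder image name.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Install the toolkit, register the nvidia runtime with Docker,
# and restart the daemon so the change takes effect.
setup_nvidia_toolkit() {
  run sudo apt-get install -y nvidia-container-toolkit
  run sudo nvidia-ctk runtime configure --runtime=docker
  run sudo systemctl restart docker
}

# Afterwards, a container started with GPU access should see the driver
# (placeholder image name, not executed here):
#   docker run --rm --gpus all <image> nvidia-smi
```

Running DRY_RUN=1 setup_nvidia_toolkit prints the three commands without executing them, which is a cheap way to review them before touching the host.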

will-44 commented 5 months ago

Thank you for your help! It works now! (I just had to remove --runtime=nvidia, otherwise I got this error: docker: Error response from daemon: unknown or invalid runtime name: nvidia.)
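The "unknown or invalid runtime name" error typically means that no runtime called nvidia is registered with the Docker daemon, in which case --gpus all on its own is usually enough. A minimal sketch of choosing the flags accordingly is shown below. The function name gpu_flags is hypothetical, and it assumes the registered runtimes are passed in as the JSON text printed by docker info.

```shell
# Hypothetical helper: choose GPU flags for docker run.
# $1: JSON text listing Docker's registered runtimes
gpu_flags() {
  case "$1" in
    *'"nvidia"'*) echo "--runtime=nvidia --gpus all" ;;  # nvidia runtime registered
    *)            echo "--gpus all" ;;                    # fall back to --gpus only
  esac
}

# Usage on a real host (placeholder image name, not executed here):
#   flags="$(gpu_flags "$(docker info --format '{{json .Runtimes}}')")"
#   docker run --rm $flags <image> nvidia-smi
```

Matching on the quoted key keeps the check simple. A stricter version could parse the JSON properly instead of pattern matching.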