JuliaGPU / CUDA.jl

CUDA programming in Julia.
https://juliagpu.org/cuda/

CUDA.jl won't install/run on Jetson Orin NX #2435

Closed oschulz closed 2 days ago

oschulz commented 2 days ago

I just tried CUDA.jl on an NVIDIA Jetson Orin NX, but installing CUDA.jl results in errors and CUDA.jl doesn't work. This is with a fresh, "empty" Julia v1.10.4 install:

I'm using a .julia/environments/v1.10/LocalPreferences.toml with

[CUDA_Runtime_jll]
local = "true"
version = "local"

but `add CUDA` still downloads the 1.6 GB CUDA runtime artifact. So while the Jetson system comes with /usr/local/cuda-11.4/ preinstalled, I don't think CUDA.jl tries to use it, despite the LocalPreferences.toml.

Installing CUDA.jl results in lots of errors like

****NvRmMemInit failed**** error type: 196626

*** NvRmMemInit failed NvRmMemConstructor
NvRmMemInitNvmap failed with Permission denied
549: Memory Manager Not supported

****NvRmMemInit failed**** error type: 196626

*** NvRmMemInit failed NvRmMemConstructor
NvRmMemInitNvmap failed with Permission denied
549: Memory Manager Not supported

and using CUDA afterwards results in the same kind of errors. I've used Julia with CUDA successfully on a Jetson TX2, a Jetson Nano, and a Jetson Xavier NX in the past; it always worked out of the box.

julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (aarch64-linux-gnu)
  CPU: 4 × ARMv8 Processor rev 1 (v8l)
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, generic)
Threads: 1 default, 0 interactive, 1 GC (on 4 virtual cores)
maleadt commented 2 days ago

I'm using a .julia/environments/v1.10/LocalPreferences.toml with

[CUDA_Runtime_jll]
local = "true"
version = "local"

That only works if CUDA_Runtime_jll is part of your environment, which is what the set_runtime_version! call does (adding it to the [extras] section).
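A minimal sketch of that recommended flow, letting CUDA.jl write the preference and register CUDA_Runtime_jll in the environment in one step instead of hand-editing LocalPreferences.toml (the `local_toolkit` keyword is the CUDA.jl 5.x spelling; older versions took a version string instead):

```julia
using CUDA

# Writes LocalPreferences.toml *and* adds CUDA_Runtime_jll to the
# active environment's [extras] section, so the preference takes effect.
CUDA.set_runtime_version!(local_toolkit=true)

# Restart Julia afterwards for the new runtime preference to be picked up.
```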


You didn't mention which version of CUDA.jl you're using. The latest version should support Tegra devices out of the box, i.e., without the need for a local toolkit.


Installing CUDA.jl results in lots of errors like

****NvRmMemInit failed**** error type: 196626

Errors like this indicate that your user doesn't have sufficient permissions. At the very least, you need to be in the video group.
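A quick way to check group membership from a shell (the `video` group name comes from the comment above; the `usermod` fix is a standard Linux command and requires logging out and back in to take effect):

```shell
# Check whether the current user is in the "video" group, which the
# Tegra memory manager (nvmap) requires for GPU access.
if id -nG | grep -qw video; then
    echo "user is in the video group"
else
    echo "not in video group; try: sudo usermod -aG video \$USER (then log out/in)"
fi
```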

oschulz commented 2 days ago

That only works if CUDA_Runtime_jll is part of your environment,

Oops, my bad!

Errors like this indicate that your user doesn't have sufficient permissions. At the very least, you need to be in the video group.

Thanks for the hint, that was it.

oschulz commented 2 days ago

In general, would you recommend using the preinstalled CUDA, or the latest Julia-installed CUDA, on such systems?

maleadt commented 2 days ago

would you recommend using the preinstalled CUDA, or the latest Julia-installed CUDA on such systems

I would always recommend using the Julia-installed one. That ensures both driver<->toolkit and toolkit<->CUDA.jl compatibility. Here, for example, we should be able to use CUDA toolkit 11.8, instead of the 11.4 the system provides.
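A sketch of switching back to the Julia-managed artifact runtime and verifying the result (the `reset_runtime_version!` helper is an assumption here; whether it is available depends on your CUDA.jl version):

```julia
using CUDA

# Remove the local-toolkit preferences so CUDA.jl falls back to its
# own artifact-provided runtime (assumed helper; check your CUDA.jl version).
CUDA.reset_runtime_version!()

# After restarting Julia, this reports which driver, runtime, and
# toolkit CUDA.jl actually ended up using.
CUDA.versioninfo()
```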

oschulz commented 2 days ago

Thanks again Tim!