JuliaGPU / CUDA.jl

CUDA programming in Julia.
https://juliagpu.org/cuda/

Support for Jetson Nano and TX2 NX (CUDA 10.2) #2147

Open stemann opened 10 months ago

stemann commented 10 months ago

WIP:

Miscellaneous notes

Non-bug report regarding running tests on an unreasonable configuration:

This report is not completely fair as it relates to the last CUDA.jl version to currently support CUDA 10.2, v4.0.1 - on aarch64-linux-gnu (a Jetson Nano) - on Julia v1.10.0-beta3.

**Describe the bug**

Testing CUDA.jl v4.0.1 on aarch64-linux-gnu fails with stack:

```
ERROR: LoadError: AssertionError: llvmtype(decl) == llvmtype(entry)
Stacktrace:
  [1] emit_function!(mod::LLVM.Module, job::GPUCompiler.CompilerJob, f::Type, method::GPUCompiler.Runtime.RuntimeMethodInstance; ctx::LLVM.ThreadSafeContext) @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:91
  [2] build_runtime(job::GPUCompiler.CompilerJob; ctx::LLVM.ThreadSafeContext) @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:113
  [3] build_runtime @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:98 [inlined]
  [4] (::GPUCompiler.var"#95#98"{LLVM.ThreadSafeContext, GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}}})() @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:167
  [5] lock(f::GPUCompiler.var"#95#98"{LLVM.ThreadSafeContext, GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}}}, l::ReentrantLock) @ Base ./lock.jl:229
  [6] macro expansion @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/rtlib.jl:127 [inlined]
  [7] load_runtime(job::GPUCompiler.CompilerJob; ctx::LLVM.ThreadSafeContext) @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/utils.jl:83
  [8] load_runtime @ CUDA ~/.julia/packages/GPUCompiler/S3TWf/src/utils.jl:77 [inlined]
  [9] (::CUDA.var"#123#125"{Vector{VersionNumber}, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}})(ctx::LLVM.ThreadSafeContext) @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/runtime.jl:21
 [10] LLVM.ThreadSafeContext(f::CUDA.var"#123#125"{Vector{VersionNumber}, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{CUDA.var"#122#124", Tuple{}}}) @ LLVM ~/.julia/packages/LLVM/HykgZ/src/executionengine/ts_module.jl:14
 [11] JuliaContext @ GPUCompiler ~/.julia/packages/GPUCompiler/S3TWf/src/driver.jl:74 [inlined]
 [12] precompile_runtime @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/runtime.jl:15 [inlined]
 [13] precompile_runtime() @ CUDA ~/.julia/packages/CUDA/ZdCxS/src/device/runtime.jl:13
 [14] top-level scope @ ~/.julia/packages/CUDA/ZdCxS/test/setup.jl:32
```

GPUCompiler is at version 0.17.3.

**GPUCompiler tests**

Testing GPUCompiler fails on the same assertion. Testing GPUCompiler (at version 0.25.0) fails in a different way (see [first comment](#issuecomment-1792627068)).

**To reproduce**

The Minimal Working Example (MWE) for this bug:

```julia
using Pkg
Pkg.add(; name="CUDA", version=v"4.0.1")
Pkg.test("CUDA")
```

Manifest.toml

``` # This file is machine-generated - editing it directly is not advised julia_version = "1.10.0-beta3" manifest_format = "2.0" project_hash = "3509c5bf235fb7c0326b865a545a502f318a7ac8" [[deps.AbstractFFTs]] deps = ["LinearAlgebra"] git-tree-sha1 = "d92ad398961a3ed262d8bf04a1a2b8340f915fef" uuid = "621f4979-c628-5d54-868e-fcf4e3e8185c" version = "1.5.0" [deps.AbstractFFTs.extensions] AbstractFFTsChainRulesCoreExt = "ChainRulesCore" AbstractFFTsTestExt = "Test" [deps.AbstractFFTs.weakdeps] ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [[deps.Adapt]] deps = ["LinearAlgebra", "Requires"] git-tree-sha1 = "02f731463748db57cc2ebfbd9fbc9ce8280d3433" uuid = "79e6a3ab-5dfb-504d-930d-738a2a938a0e" version = "3.7.1" [deps.Adapt.extensions] AdaptStaticArraysExt = "StaticArrays" [deps.Adapt.weakdeps] StaticArrays = "90137ffa-7385-5640-81b9-e52037218182" [[deps.ArgTools]] uuid = "0dad84c5-d112-42e6-8d28-ef12dabb789f" version = "1.1.1" [[deps.Artifacts]] uuid = "56f22d72-fd6d-98f1-02f0-08ddc0907c33" [[deps.BFloat16s]] deps = ["LinearAlgebra", "Printf", "Random", "Test"] git-tree-sha1 = "dbf84058d0a8cbbadee18d25cf606934b22d7c66" uuid = "ab4f0b2a-ad5b-11e8-123f-65d77653426b" version = "0.4.2" [[deps.Base64]] uuid = "2a0f44e3-6c83-55bd-87e4-b1978d98bd5f" [[deps.CEnum]] git-tree-sha1 = "eb4cb44a499229b3b8426dcfb5dd85333951ff90" uuid = "fa961155-64e5-5f13-b03f-caf6b980ea82" version = "0.4.2" [[deps.CUDA]] deps = ["AbstractFFTs", "Adapt", "BFloat16s", "CEnum", "CUDA_Driver_jll", "CUDA_Runtime_Discovery", "CUDA_Runtime_jll", "CompilerSupportLibraries_jll", "ExprTools", "GPUArrays", "GPUCompiler", "LLVM", "LazyArtifacts", "Libdl", "LinearAlgebra", "Logging", "Preferences", "Printf", "Random", "Random123", "RandomNumbers", "Reexport", "Requires", "SparseArrays", "SpecialFunctions"] git-tree-sha1 = "edff14c60784c8f7191a62a23b15a421185bc8a8" uuid = "052768ef-5323-5732-b1bb-66c8b64840ba" version = "4.0.1" [[deps.CUDA_Driver_jll]] deps 
= ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "Pkg"] git-tree-sha1 = "75d7896d1ec079ef10d3aee8f3668c11354c03a1" uuid = "4ee394cb-3365-5eb0-8335-949819d2adfc" version = "0.2.0+0" [[deps.CUDA_Runtime_Discovery]] deps = ["Libdl"] git-tree-sha1 = "d6b227a1cfa63ae89cb969157c6789e36b7c9624" uuid = "1af6417a-86b4-443c-805f-a4643ffb695f" version = "0.1.2" [[deps.CUDA_Runtime_jll]] deps = ["Artifacts", "CUDA_Driver_jll", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"] git-tree-sha1 = "ed00f777d2454c45f5f49634ed0a589da07ee0b0" uuid = "76a88914-d11a-5bdc-97e0-2f5a05c973a2" version = "0.2.4+1" [[deps.CompilerSupportLibraries_jll]] deps = ["Artifacts", "Libdl"] uuid = "e66e0078-7015-5450-92f7-15fbd957f2ae" version = "1.0.5+1" [[deps.Dates]] deps = ["Printf"] uuid = "ade2ca70-3891-5945-98fb-dc099432e06a" [[deps.DocStringExtensions]] deps = ["LibGit2"] git-tree-sha1 = "2fb1e02f2b635d0845df5d7c167fec4dd739b00d" uuid = "ffbed154-4ef7-542d-bbb7-c09d3a79fcae" version = "0.9.3" [[deps.Downloads]] deps = ["ArgTools", "FileWatching", "LibCURL", "NetworkOptions"] uuid = "f43a241f-c20a-4ad4-852c-f6b1247861c6" version = "1.6.0" [[deps.ExprTools]] git-tree-sha1 = "27415f162e6028e81c72b82ef756bf321213b6ec" uuid = "e2ba6199-217a-4e67-a87a-7c52f15ade04" version = "0.1.10" [[deps.FileWatching]] uuid = "7b1f6079-737a-58dc-b8bc-7a2ca5c1b5ee" [[deps.GPUArrays]] deps = ["Adapt", "GPUArraysCore", "LLVM", "LinearAlgebra", "Printf", "Random", "Reexport", "Serialization", "Statistics"] git-tree-sha1 = "2e57b4a4f9cc15e85a24d603256fe08e527f48d1" uuid = "0c68f7d7-f131-5f86-a1c3-88cf8149b2d7" version = "8.8.1" [[deps.GPUArraysCore]] deps = ["Adapt"] git-tree-sha1 = "2d6ca471a6c7b536127afccfa7564b5b39227fe0" uuid = "46192b85-c4d5-4398-a991-12ede77f4527" version = "0.1.5" [[deps.GPUCompiler]] deps = ["ExprTools", "InteractiveUtils", "LLVM", "Libdl", "Logging", "TimerOutputs", "UUIDs"] git-tree-sha1 = "19d693666a304e8c371798f4900f7435558c7cde" uuid = "61eb1bfa-7361-4325-ad38-22787b887f55" version 
= "0.17.3" [[deps.InteractiveUtils]] deps = ["Markdown"] uuid = "b77e0a4c-d291-57a0-90e8-8db25a27a240" [[deps.IrrationalConstants]] git-tree-sha1 = "630b497eafcc20001bba38a4651b327dcfc491d2" uuid = "92d709cd-6900-40b7-9082-c6be49f344b6" version = "0.2.2" [[deps.JLLWrappers]] deps = ["Artifacts", "Preferences"] git-tree-sha1 = "7e5d6779a1e09a36db2a7b6cff50942a0a7d0fca" uuid = "692b3bcd-3c85-4b1f-b108-f13ce0eb3210" version = "1.5.0" [[deps.LLVM]] deps = ["CEnum", "LLVMExtra_jll", "Libdl", "Printf", "Unicode"] git-tree-sha1 = "f044a2796a9e18e0531b9b3072b0019a61f264bc" uuid = "929cbde3-209d-540e-8aea-75f648917ca0" version = "4.17.1" [[deps.LLVMExtra_jll]] deps = ["Artifacts", "JLLWrappers", "LazyArtifacts", "Libdl", "TOML"] git-tree-sha1 = "070e4b5b65827f82c16ae0916376cb47377aa1b5" uuid = "dad2f222-ce93-54a1-a47d-0025e8a3acab" version = "0.0.18+0" [[deps.LazyArtifacts]] deps = ["Artifacts", "Pkg"] uuid = "4af54fe1-eca0-43a8-85a7-787d91b784e3" [[deps.LibCURL]] deps = ["LibCURL_jll", "MozillaCACerts_jll"] uuid = "b27032c2-a3e7-50c8-80cd-2d36dbcbfd21" version = "0.6.4" [[deps.LibCURL_jll]] deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll", "Zlib_jll", "nghttp2_jll"] uuid = "deac9b47-8bc7-5906-a0fe-35ac56dc84c0" version = "8.0.1+1" [[deps.LibGit2]] deps = ["Base64", "LibGit2_jll", "NetworkOptions", "Printf", "SHA"] uuid = "76f85450-5226-5b5a-8eaa-529ad045b433" [[deps.LibGit2_jll]] deps = ["Artifacts", "LibSSH2_jll", "Libdl", "MbedTLS_jll"] uuid = "e37daf67-58a4-590a-8e99-b0245dd2ffc5" version = "1.6.4+0" [[deps.LibSSH2_jll]] deps = ["Artifacts", "Libdl", "MbedTLS_jll"] uuid = "29816b5a-b9ab-546f-933c-edad1886dfa8" version = "1.11.0+1" [[deps.Libdl]] uuid = "8f399da3-3557-5675-b5ff-fb832c97cbdb" [[deps.LinearAlgebra]] deps = ["Libdl", "OpenBLAS_jll", "libblastrampoline_jll"] uuid = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" [[deps.LogExpFunctions]] deps = ["DocStringExtensions", "IrrationalConstants", "LinearAlgebra"] git-tree-sha1 = 
"7d6dd4e9212aebaeed356de34ccf262a3cd415aa" uuid = "2ab3a3ac-af41-5b50-aa03-7779005ae688" version = "0.3.26" [deps.LogExpFunctions.extensions] LogExpFunctionsChainRulesCoreExt = "ChainRulesCore" LogExpFunctionsChangesOfVariablesExt = "ChangesOfVariables" LogExpFunctionsInverseFunctionsExt = "InverseFunctions" [deps.LogExpFunctions.weakdeps] ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" ChangesOfVariables = "9e997f8a-9a97-42d5-a9f1-ce6bfc15e2c0" InverseFunctions = "3587e190-3f89-42d0-90ee-14403ec27112" [[deps.Logging]] uuid = "56ddb016-857b-54e1-b83d-db4d58db5568" [[deps.Markdown]] deps = ["Base64"] uuid = "d6f4376e-aef5-505a-96c1-9c027394607a" [[deps.MbedTLS_jll]] deps = ["Artifacts", "Libdl"] uuid = "c8ffd9c3-330d-5841-b78e-0817d7145fa1" version = "2.28.2+1" [[deps.MozillaCACerts_jll]] uuid = "14a3606d-f60d-562e-9121-12d972cd8159" version = "2023.1.10" [[deps.NetworkOptions]] uuid = "ca575930-c2e3-43a9-ace4-1e988b2c1908" version = "1.2.0" [[deps.OpenBLAS_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "Libdl"] uuid = "4536629a-c528-5b80-bd46-f80d51c5b363" version = "0.3.23+2" [[deps.OpenLibm_jll]] deps = ["Artifacts", "Libdl"] uuid = "05823500-19ac-5b8b-9628-191a04bc5112" version = "0.8.1+2" [[deps.OpenSpecFun_jll]] deps = ["Artifacts", "CompilerSupportLibraries_jll", "JLLWrappers", "Libdl", "Pkg"] git-tree-sha1 = "13652491f6856acfd2db29360e1bbcd4565d04f1" uuid = "efe28fd5-8261-553b-a9e1-b2916fc3738e" version = "0.5.5+0" [[deps.Pkg]] deps = ["Artifacts", "Dates", "Downloads", "FileWatching", "LibGit2", "Libdl", "Logging", "Markdown", "Printf", "REPL", "Random", "SHA", "Serialization", "TOML", "Tar", "UUIDs", "p7zip_jll"] uuid = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f" version = "1.10.0" [[deps.Preferences]] deps = ["TOML"] git-tree-sha1 = "00805cd429dcb4870060ff49ef443486c262e38e" uuid = "21216c6a-2e73-6563-6e65-726566657250" version = "1.4.1" [[deps.Printf]] deps = ["Unicode"] uuid = "de0858da-6303-5e67-8744-51eddeeeb8d7" [[deps.REPL]] deps = 
["InteractiveUtils", "Markdown", "Sockets", "Unicode"] uuid = "3fa0cd96-eef1-5676-8a61-b3b8758bbffb" [[deps.Random]] deps = ["SHA"] uuid = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" [[deps.Random123]] deps = ["Random", "RandomNumbers"] git-tree-sha1 = "552f30e847641591ba3f39fd1bed559b9deb0ef3" uuid = "74087812-796a-5b5d-8853-05524746bad3" version = "1.6.1" [[deps.RandomNumbers]] deps = ["Random", "Requires"] git-tree-sha1 = "043da614cc7e95c703498a491e2c21f58a2b8111" uuid = "e6cf234a-135c-5ec9-84dd-332b85af5143" version = "1.5.3" [[deps.Reexport]] git-tree-sha1 = "45e428421666073eab6f2da5c9d310d99bb12f9b" uuid = "189a3867-3050-52da-a836-e630ba90ab69" version = "1.2.2" [[deps.Requires]] deps = ["UUIDs"] git-tree-sha1 = "838a3a4188e2ded87a4f9f184b4b0d78a1e91cb7" uuid = "ae029012-a4dd-5104-9daa-d747884805df" version = "1.3.0" [[deps.SHA]] uuid = "ea8e919c-243c-51af-8825-aaa63cd721ce" version = "0.7.0" [[deps.Serialization]] uuid = "9e88b42a-f829-5b0c-bbe9-9e923198166b" [[deps.Sockets]] uuid = "6462fe0b-24de-5631-8697-dd941f90decc" [[deps.SparseArrays]] deps = ["Libdl", "LinearAlgebra", "Random", "Serialization", "SuiteSparse_jll"] uuid = "2f01184e-e22b-5df5-ae63-d93ebab69eaf" version = "1.10.0" [[deps.SpecialFunctions]] deps = ["IrrationalConstants", "LogExpFunctions", "OpenLibm_jll", "OpenSpecFun_jll"] git-tree-sha1 = "e2cfc4012a19088254b3950b85c3c1d8882d864d" uuid = "276daf66-3868-5448-9aa4-cd146d93841b" version = "2.3.1" [deps.SpecialFunctions.extensions] SpecialFunctionsChainRulesCoreExt = "ChainRulesCore" [deps.SpecialFunctions.weakdeps] ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4" [[deps.Statistics]] deps = ["LinearAlgebra", "SparseArrays"] uuid = "10745b16-79ce-11e8-11f9-7d13ad32a3b2" version = "1.10.0" [[deps.SuiteSparse_jll]] deps = ["Artifacts", "Libdl", "Pkg", "libblastrampoline_jll"] uuid = "bea87d4a-7f5b-5778-9afe-8cc45184846c" version = "7.2.0+1" [[deps.TOML]] deps = ["Dates"] uuid = "fa267f1f-6049-4f14-aa54-33bafae1ed76" version = "1.0.3" 
[[deps.Tar]] deps = ["ArgTools", "SHA"] uuid = "a4e569a6-e804-4fa4-b0f3-eef7a1d5b13e" version = "1.10.0" [[deps.Test]] deps = ["InteractiveUtils", "Logging", "Random", "Serialization"] uuid = "8dfed614-e22c-5e08-85e1-65c5234f0b40" [[deps.TimerOutputs]] deps = ["ExprTools", "Printf"] git-tree-sha1 = "f548a9e9c490030e545f72074a41edfd0e5bcdd7" uuid = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f" version = "0.5.23" [[deps.UUIDs]] deps = ["Random", "SHA"] uuid = "cf7118a7-6976-5b1a-9a39-7adc72f591a4" [[deps.Unicode]] uuid = "4ec0a83e-493e-50e2-b9ac-8f72acf5a8f5" [[deps.Zlib_jll]] deps = ["Libdl"] uuid = "83775a58-1f1d-513f-b197-d71354ab007a" version = "1.2.13+1" [[deps.libblastrampoline_jll]] deps = ["Artifacts", "Libdl"] uuid = "8e850b90-86db-534c-a0d3-1478176c7d93" version = "5.8.0+1" [[deps.nghttp2_jll]] deps = ["Artifacts", "Libdl"] uuid = "8e850ede-7688-5339-a07c-302acd2aaf8d" version = "1.52.0+1" [[deps.p7zip_jll]] deps = ["Artifacts", "Libdl"] uuid = "3f19e933-33d8-53b3-aaab-bd5110c3b7a0" version = "17.4.0+2" ```

**Expected behavior**

Tests pass.

**Version info**

Details on Julia:

```
julia --threads=auto --eval 'using InteractiveUtils; versioninfo()'
Julia Version 1.10.0-beta3
Commit 404750f8586 (2023-10-03 12:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (aarch64-linux-gnu)
  CPU: 4 × ARMv8 Processor rev 1 (v8l)
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cortex-a57)
  Threads: 5 on 4 virtual cores
```

Details on CUDA:

```
CUDA runtime 10.2, artifact installation
CUDA driver 10.2
Unknown NVIDIA driver

Libraries:
- CUBLAS: 10.2.2
- CURAND: 10.1.2
- CUFFT: 10.1.2
- CUSOLVER: 10.3.0
- CUSPARSE: 10.3.1
- CUPTI: 12.0.0
- NVML: missing

Toolchain:
- Julia: 1.10.0-beta3
- LLVM: 15.0.7
- PTX ISA support: 3.2, 4.0, 4.1, 4.2, 4.3, 5.0, 6.0, 6.1, 6.3, 6.4, 6.5
- Device capability support: sm_30, sm_32, sm_35, sm_37, sm_50, sm_52, sm_53, sm_60, sm_61, sm_62, sm_70, sm_72, sm_75

1 device:
  0: NVIDIA Tegra X1 (sm_53, 42.234 MiB / 1.933 GiB available)
```

**Additional context**

Add any other context about the problem here.

stemann commented 10 months ago

Testing GPUCompiler (at version 0.25.0) fails in a different way:

[deleted]
maleadt commented 10 months ago

The GPUCompiler failure is a test timeout, and doesn't point to an actual issue.

> GPUCompiler is at version 0.17.3.

That's your problem. You're using a beta release of Julia, so you should expect to have to use the latest versions of GPUCompiler.jl and friends for compatibility. It's otherwise unrelated to CUDA.
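
To see which versions of the GPU stack an environment actually resolved to, something like the following sketch may help. It relies only on `Pkg.dependencies()`; the helper name `gpu_stack_versions` and the list of package names are illustrative assumptions (the names are taken from the Manifest above), not part of CUDA.jl or GPUCompiler.jl.

```julia
using Pkg

# Package names of interest; an assumption based on the Manifest in this issue.
const GPU_STACK = ("CUDA", "GPUCompiler", "LLVM", "GPUArrays")

# Hypothetical helper: map each listed package found in the active
# environment to its resolved version (`nothing` for unversioned stdlibs).
function gpu_stack_versions(names = GPU_STACK)
    versions = Dict{String,Union{VersionNumber,Nothing}}()
    for (_, info) in Pkg.dependencies()
        if info.name in names
            versions[info.name] = info.version
        end
    end
    return versions
end

# Print whatever was found; the result is empty if none of the
# packages are installed in the active project.
for (name, version) in gpu_stack_versions()
    println(rpad(name, 12), version)
end
```

Running this in the project that pins CUDA.jl v4.0.1 should report the same GPUCompiler version as the Manifest, which makes it easier to spot an outdated compiler stack on a pre-release Julia.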

stemann commented 10 months ago

Right - I was mostly filing this issue to record the current status, in case it is relevant to bringing CUDA 10.2 support to CUDA.jl v5.

stemann commented 10 months ago

On a side note - running with the local toolkit fails to find compute-sanitizer (because it's not there):

ihp@jetson-nano:~$ cat /tmp/tmp.ibNRS3tjPV/Project.toml 
[deps]
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"

[extras]
CUDA_Runtime_jll = "76a88914-d11a-5bdc-97e0-2f5a05c973a2"
ihp@jetson-nano:~$ cat /tmp/tmp.ibNRS3tjPV/LocalPreferences.toml 
[CUDA_Runtime_jll]
version = "local"

ihp@jetson-nano:~$ JULIA_DEBUG=CUDA_Runtime_Discovery julia --threads=auto --project=/tmp/tmp.ibNRS3tjPV
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0-beta3 (2023-10-03)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using CUDA
┌ Debug: Looking for binary ptxas in no specific location
│   all_locations = String[]
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Did not find ptxas
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:170
┌ Debug: Looking for library cudart, no specific version, in no specific location
│   all_names =
│    1-element Vector{String}:
│     "libcudart.so"
│   all_locations = String[]
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:128
┌ Debug: Found libcudart.so at /usr/local/cuda-10.2/targets/aarch64-linux/lib
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:137
┌ Debug: Looking for CUDA toolkit via CUDA runtime library
│   path = "/usr/local/cuda-10.2/targets/aarch64-linux/lib/libcudart.so"
│   dir = "/usr/local/cuda-10.2/targets/aarch64-linux"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:311
┌ Debug: Looking for CUDA toolkit via default installation directories
│   dirs =
│    1-element Vector{String}:
│     "/usr/local/cuda-10.2"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:338
┌ Debug: Found CUDA toolkit at /usr/local/cuda-10.2/targets/aarch64-linux, /usr/local/cuda-10.2
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:344
┌ Debug: Looking for binary ptxas in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2
│   all_locations =
│    4-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Found /usr/local/cuda-10.2/bin/ptxas at /usr/local/cuda-10.2/bin/ptxas
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:162
┌ Debug: Looking for binary nvdisasm in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2
│   all_locations =
│    4-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Found /usr/local/cuda-10.2/bin/nvdisasm at /usr/local/cuda-10.2/bin/nvdisasm
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:162
┌ Debug: Looking for binary nvlink in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2
│   all_locations =
│    4-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Found /usr/local/cuda-10.2/bin/nvlink at /usr/local/cuda-10.2/bin/nvlink
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:162
┌ Debug: Looking for binary compute-sanitizer in /usr/local/cuda-10.2/targets/aarch64-linux or /usr/local/cuda-10.2 or /usr/local/cuda-10.2/extras/compute-sanitizer
│   all_locations =
│    6-element Vector{String}:
│     "/usr/local/cuda-10.2/targets/aarch64-linux"
│     "/usr/local/cuda-10.2/targets/aarch64-linux/bin"
│     "/usr/local/cuda-10.2"
│     "/usr/local/cuda-10.2/bin"
│     "/usr/local/cuda-10.2/extras/compute-sanitizer"
│     "/usr/local/cuda-10.2/extras/compute-sanitizer/bin"
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:156
┌ Debug: Did not find compute-sanitizer
└ @ CUDA_Runtime_Discovery ~/.julia/packages/CUDA_Runtime_Discovery/tnsVx/src/CUDA_Runtime_Discovery.jl:170
┌ Debug: Could not discover CUDA toolkit
│   exception =
│    Could not find binary 'compute-sanitizer' in your local CUDA installation.
...
maleadt commented 10 months ago

IIRC compute-sanitizer was only added in 11.0, it used to be memory-sanitizer. I guess we can make it optional again.

stemann commented 10 months ago

> IIRC compute-sanitizer was only added in 11.0, it used to be memory-sanitizer. I guess we can make it optional again.

Right - there's neither a cuda-sanitizer to be found via APT, nor a file mentioned in the build log for aarch64-linux-gnu: https://buildkite.com/julialang/yggdrasil/builds/6373#018b9530-dfb8-4278-85e1-fe37fd85c0ec

maleadt commented 10 months ago

Sorry, the old tool was called cuda-memcheck. But I would just leave it out and make compute-sanitizer optional in CUDA_Runtime_Discovery (as well as the uses in CUDA.jl).
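
A minimal sketch of what "optional" could look like during discovery, assuming a hypothetical helper `find_tool` (this is not the CUDA_Runtime_Discovery API): required tools raise an error when missing, optional ones yield `nothing` so the caller can continue without them. The directory list mirrors the locations shown in the debug log above.

```julia
# Hypothetical helper, not part of CUDA_Runtime_Discovery:
# search `dirs` for a file called `name` and return its full path.
# A missing required tool is an error; a missing optional tool is `nothing`.
function find_tool(name::AbstractString, dirs; required::Bool = true)
    for dir in dirs
        path = joinpath(dir, name)
        isfile(path) && return path
    end
    required && error("Could not find binary '$name' in: " * join(dirs, ", "))
    return nothing
end

# Locations taken from the debug log in this issue.
toolkit_dirs = [
    "/usr/local/cuda-10.2/bin",
    "/usr/local/cuda-10.2/extras/compute-sanitizer",
]

# compute-sanitizer is treated as optional, so a CUDA 10.2 toolkit
# without it does not abort discovery.
sanitizer = find_tool("compute-sanitizer", toolkit_dirs; required = false)
if sanitizer === nothing
    @info "compute-sanitizer not found; continuing without it"
end
```

Callers that use the sanitizer would then need to check for `nothing` before invoking it, which corresponds to the "uses in CUDA.jl" mentioned above.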

maleadt commented 8 months ago

CUDA.jl now uses CUDA_Runtime_jll@0.11, which includes support for CUDA 10.2 again. I'll leave the updates to CUDA.jl itself to somebody with such hardware, though.