JuliaGPU / CUDA.jl

CUDA programming in Julia.
https://juliagpu.org/cuda/

CUDA precompile cannot find/load "cupti64_2024.2.1.dll" during precompilation (juliaup 1.10.4, Windows 11) #2466

Closed a-charbon closed 1 month ago

a-charbon commented 1 month ago

@charleskawczynski and I have been unable to resolve this problem and would appreciate additional input:

Describe the bug:

The CUDA.jl library fails to precompile even on a clean/from-scratch build of Julia 1.10.4 (via juliaup) on a Windows 11 machine, in an empty environment. The stated error is

"LoadError: InitError: could not load library
"C:\Users\username\.julia\artifacts\81e017d7fb180420054a3d5bcf9e3ca371e1eea1\bin\cupti64_2024.2.1.dll"
The specified module could not be found."

However, the file exists in that directory under exactly that name. The issue looks similar to #670 and to this comment and its linked issues, but I have confirmed that full file permissions are granted, and the error still occurs. I have tried precompiling from different terminals (PowerShell, Command Prompt, Git Bash); deleting artifacts/ and rerunning ]instantiate; adding CUDA to environments that previously precompiled without it; and removing and re-adding the package. The issue persists even after completely removing Julia and starting from scratch. I am not sure whether the problem is specific to this DLL or would also affect DLLs loaded after it, but every DLL in that directory shows "read and execute" permissions enabled, and the error is unchanged when full permissions are granted on the bin directory itself and on all of artifacts/ recursively.

To Reproduce:

  1. Total uninstall of julia and juliaup, removal of .julia/, full disk clean and chkdsk /r on the Windows PC for a full reset
  2. PC has an NVIDIA driver installed (Driver version 31.0.15.5212 from April 2024 for CUDA 12.4.131, Driver Type: DCH, GPU: NVIDIA GeForce RTX 2070 with Max-Q Design)
  3. Install juliaup via the default recommendation (command prompt > winget install julia -s msstore), which adds the 'release' channel (julia 1.10.4, windows 64-bit)
  4. Open blank/default 1.10 environment in julia (.julia\environments\v1.10\) from any local directory (like username/Documents) via >julia, and ]add CUDA, though this also occurs for any predefined environment Project/Manifest location
  5. Running ]instantiate (or ]precompile if no precompilation occurs, or using CUDA once CUDA and its dependencies are downloaded) results in the same error message

CUDA 5.4.3 was added via ]add CUDA and all default associated dependencies from a blank environment. Full stack trace and list of downloaded dependencies will be pasted at bottom of description.
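The reproduction steps above can also be scripted, which makes it easier to rerun them after each attempted fix. This is only a sketch: `reproduce_cuda_precompile` is a hypothetical helper name, and it assumes network access and that the throwaway environment resolves the same CUDA version as in this report.

```julia
using Pkg

# Hypothetical helper: recreate the report's setup in a temporary environment.
# Returns true if precompilation succeeds, false if it throws.
function reproduce_cuda_precompile(; version = "5.4.3")
    Pkg.activate(; temp = true)                 # fresh, empty environment
    Pkg.add(name = "CUDA", version = version)   # downloads CUDA.jl and its artifacts
    try
        Pkg.precompile()                        # the step where the reported error appears
        return true
    catch err
        @error "Precompilation failed" exception = err
        return false
    end
end
```

Calling `reproduce_cuda_precompile()` from a fresh Julia session leaves the default v1.10 environment untouched.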

Julia Details: Output of versioninfo()

Julia Version 1.10.4
Commit 48d4fd4843 (2024-06-04 10:41 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 16 × Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores)

Details on CUDA: The output of CUDA.versioninfo() is unavailable because CUDA fails to precompile.

Stacktrace:

Precompiling CUDA...
ERROR: LoadError: InitError: could not load library "C:\Users\andre\.julia\artifacts\81e017d7fb180420054a3d5bcf9e3ca371e1eea1\bin\cupti64_2024.2.1.dll"
The specified module could not be found.
Stacktrace:
  [1] dlopen(s::String, flags::UInt32; throw_error::Bool)
    @ Base.Libc.Libdl .\libdl.jl:117
  [2] dlopen(s::String, flags::UInt32)
    @ Base.Libc.Libdl .\libdl.jl:116
  [3] macro expansion
    @ C:\Users\andre\.julia\packages\JLLWrappers\pG9bm\src\products\library_generators.jl:63 [inlined]
  [4] __init__()
    @ CUDA_Runtime_jll C:\Users\andre\.julia\packages\CUDA_Runtime_jll\YgJCI\src\wrappers\x86_64-w64-mingw32-cuda+12.5.jl:56
  [5] run_module_init(mod::Module, i::Int64)
    @ Base .\loading.jl:1134
  [6] register_restored_modules(sv::Core.SimpleVector, pkg::Base.PkgId, path::String)
    @ Base .\loading.jl:1122
  [7] _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, depmods::Vector{Any})
    @ Base .\loading.jl:1067
  [8] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128)
    @ Base .\loading.jl:1581
  [9] _require(pkg::Base.PkgId, env::String)
    @ Base .\loading.jl:1938
 [10] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base .\loading.jl:1812
 [11] #invoke_in_world#3
    @ .\essentials.jl:926 [inlined]
 [12] invoke_in_world
    @ .\essentials.jl:923 [inlined]
 [13] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base .\loading.jl:1803
 [14] macro expansion
    @ .\loading.jl:1790 [inlined]
 [15] macro expansion
    @ .\lock.jl:267 [inlined]
 [16] __require(into::Module, mod::Symbol)
    @ Base .\loading.jl:1753
 [17] #invoke_in_world#3
    @ .\essentials.jl:926 [inlined]
 [18] invoke_in_world
    @ .\essentials.jl:923 [inlined]
 [19] require(into::Module, mod::Symbol)
    @ Base .\loading.jl:1746
 [20] include
    @ .\Base.jl:495 [inlined]
 [21] include_package_for_output(pkg::Base.PkgId, input::String, depot_path::Vector{String}, dl_load_path::Vector{String}, load_path::Vector{String}, concrete_deps::Vector{Pair{Base.PkgId, UInt128}}, source::Nothing)
    @ Base .\loading.jl:2222
 [22] top-level scope
    @ stdin:3
during initialization of module CUDA_Runtime_jll
in expression starting at C:\Users\andre\.julia\packages\CUDA\Tl08O\src\CUDA.jl:1
in expression starting at stdin:3
  ✗ CUDA
  0 dependencies successfully precompiled in 3 seconds. 65 already precompiled.

ERROR: The following 1 direct dependency failed to precompile:

CUDA [052768ef-5323-5732-b1bb-66c8b64840ba]

Failed to precompile CUDA [052768ef-5323-5732-b1bb-66c8b64840ba] to "C:\\Users\\andre\\.julia\\compiled\\v1.10\\CUDA\\jl_17CD.tmp".

Stacktrace:
  [1] pkgerror(msg::String)
    @ Pkg.Types C:\Users\andre\.julia\juliaup\julia-1.10.4+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Pkg\src\Types.jl:70
  [2] precompile(ctx::Pkg.Types.Context, pkgs::Vector{…}; internal_call::Bool, strict::Bool, warn_loaded::Bool, already_instantiated::Bool, timing::Bool, _from_loading::Bool, kwargs::@Kwargs{…})
    @ Pkg.API C:\Users\andre\.julia\juliaup\julia-1.10.4+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Pkg\src\API.jl:1659
  [3] precompile(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::@Kwargs{_from_loading::Bool})
    @ Pkg.API C:\Users\andre\.julia\juliaup\julia-1.10.4+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Pkg\src\API.jl:159
  [4] precompile
    @ C:\Users\andre\.julia\juliaup\julia-1.10.4+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Pkg\src\API.jl:147 [inlined]
  [5] #precompile#114
    @ C:\Users\andre\.julia\juliaup\julia-1.10.4+0.x64.w64.mingw32\share\julia\stdlib\v1.10\Pkg\src\API.jl:146 [inlined]
  [6] #invokelatest#2
    @ .\essentials.jl:894 [inlined]
  [7] invokelatest
    @ .\essentials.jl:889 [inlined]
  [8] _require(pkg::Base.PkgId, env::String)
    @ Base .\loading.jl:1963
  [9] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base .\loading.jl:1812
 [10] #invoke_in_world#3
    @ .\essentials.jl:926 [inlined]
 [11] invoke_in_world
    @ .\essentials.jl:923 [inlined]
 [12] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base .\loading.jl:1803
 [13] macro expansion
    @ .\loading.jl:1790 [inlined]
 [14] macro expansion
    @ .\lock.jl:267 [inlined]
 [15] __require(into::Module, mod::Symbol)
    @ Base .\loading.jl:1753
 [16] #invoke_in_world#3
    @ .\essentials.jl:926 [inlined]
 [17] invoke_in_world
    @ .\essentials.jl:923 [inlined]
 [18] require(into::Module, mod::Symbol)
    @ Base .\loading.jl:1746
Some type information was truncated. Use `show(err)` to see complete types.

Additional Context:

Computer Type: Dell G7 7500 Laptop, with Windows 11 Home (Version 22H2)

Here are the downloaded dependencies with ]add CUDA in a blank environment on a new julia build:

  [621f4979] + AbstractFFTs v1.5.0
  [79e6a3ab] + Adapt v4.0.4
  [a9b6321e] + Atomix v0.1.0
  [ab4f0b2a] + BFloat16s v0.5.0
  [fa961155] + CEnum v0.5.0
  [052768ef] + CUDA v5.4.3
  [1af6417a] + CUDA_Runtime_Discovery v0.3.4
  [3da002f7] + ColorTypes v0.11.5
  [5ae59095] + Colors v0.12.11
  [34da2185] + Compat v4.16.0
  [a8cc5b0e] + Crayons v4.1.1
  [9a962f9c] + DataAPI v1.16.0
  [a93c6f00] + DataFrames v1.6.1
  [864edb3b] + DataStructures v0.18.20
  [e2d170a0] + DataValueInterfaces v1.0.0
  [e2ba6199] + ExprTools v0.1.10
  [53c48c17] + FixedPointNumbers v0.8.5
  [0c68f7d7] + GPUArrays v10.3.0
  [46192b85] + GPUArraysCore v0.1.6
⌅ [61eb1bfa] + GPUCompiler v0.26.7
  [842dd82b] + InlineStrings v1.4.2
  [41ab1584] + InvertedIndices v1.3.0
  [82899510] + IteratorInterfaceExtensions v1.0.0
  [692b3bcd] + JLLWrappers v1.5.0
  [63c18a36] + KernelAbstractions v0.9.23
  [929cbde3] + LLVM v8.1.0
  [8b046642] + LLVMLoopInfo v1.0.0
  [b964fa9f] + LaTeXStrings v1.3.1
  [1914dd2f] + MacroTools v0.5.13
  [e1d29d7a] + Missings v1.2.0
  [5da4648a] + NVTX v0.3.4
  [bac558e1] + OrderedCollections v1.6.3
  [2dfb63ee] + PooledArrays v1.4.3
  [aea7be01] + PrecompileTools v1.2.1
  [21216c6a] + Preferences v1.4.3
  [08abe8d2] + PrettyTables v2.3.2
  [74087812] + Random123 v1.7.0
  [e6cf234a] + RandomNumbers v1.6.0
  [189a3867] + Reexport v1.2.2
  [ae029012] + Requires v1.3.0
  [6c6a2e73] + Scratch v1.2.1
  [91c51154] + SentinelArrays v1.4.5
  [a2af1166] + SortingAlgorithms v1.2.1
  [90137ffa] + StaticArrays v1.9.7
  [1e83bf80] + StaticArraysCore v1.4.3
  [892a3eda] + StringManipulation v0.3.4
  [3783bdb8] + TableTraits v1.0.1
  [bd369af6] + Tables v1.12.0
  [a759f4b9] + TimerOutputs v0.5.24
  [013be700] + UnsafeAtomics v0.2.1
  [d80eeb9a] + UnsafeAtomicsLLVM v0.2.0
⌅ [4ee394cb] + CUDA_Driver_jll v0.9.2+0
⌅ [76a88914] + CUDA_Runtime_jll v0.14.1+0
  [9c1d0b0a] + JuliaNVTXCallbacks_jll v0.2.1+0
⌅ [dad2f222] + LLVMExtra_jll v0.0.31+0
  [e98f9f5b] + NVTX_jll v3.1.0+2
  [0dad84c5] + ArgTools v1.1.1
  [56f22d72] + Artifacts
  [2a0f44e3] + Base64
  [ade2ca70] + Dates
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching
  [9fa8497b] + Future
  [b77e0a4c] + InteractiveUtils
  [4af54fe1] + LazyArtifacts
  [b27032c2] + LibCURL v0.6.4
  [76f85450] + LibGit2
  [8f399da3] + Libdl
  [37e2e46d] + LinearAlgebra
  [56ddb016] + Logging
  [d6f4376e] + Markdown
  [ca575930] + NetworkOptions v1.2.0
  [44cfe95a] + Pkg v1.10.0
  [de0858da] + Printf
  [3fa0cd96] + REPL
  [9a3f8284] + Random
  [ea8e919c] + SHA v0.7.0
  [9e88b42a] + Serialization
  [6462fe0b] + Sockets
  [2f01184e] + SparseArrays v1.10.0
  [10745b16] + Statistics v1.10.0
  [fa267f1f] + TOML v1.0.3
  [a4e569a6] + Tar v1.10.0
  [8dfed614] + Test
  [cf7118a7] + UUIDs
  [4ec0a83e] + Unicode
  [e66e0078] + CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] + LibCURL_jll v8.4.0+0
  [e37daf67] + LibGit2_jll v1.6.4+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.2+1
  [14a3606d] + MozillaCACerts_jll v2023.1.10
  [4536629a] + OpenBLAS_jll v0.3.23+4
  [bea87d4a] + SuiteSparse_jll v7.2.1+1
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850b90] + libblastrampoline_jll v5.8.0+1
  [8e850ede] + nghttp2_jll v1.52.0+1
  [3f19e933] + p7zip_jll v17.4.0+2

Downloaded artifact: LLVMExtra
Downloaded artifact: NVTX
Downloaded artifact: JuliaNVTXCallbacks
Downloaded artifact: CUDA_Runtime

maleadt commented 1 month ago

Have you installed the Visual C++ redistributable?

a-charbon commented 1 month ago

I already had two other versions - 2017 (14.16.27027) and 2010 (10.0.40219) - on my system, which I assumed from the description on the CUDA.jl home page and Quick Start / Installation Overview docs meant that I was good to go. However, I did not have the newer version you linked (14.29.30153). Downloading the updated one does appear to have fixed the issue.
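For anyone hitting the same error: on Windows, "The specified module could not be found" can also be reported when the named DLL exists but one of the DLLs it depends on cannot be loaded, which is consistent with the redistributable fixing it here. A minimal sketch for checking this from Julia follows. `probe_libraries` is a hypothetical helper, and the three runtime DLL names are an assumption about what the newer redistributable provides; they are not confirmed anywhere in this thread.

```julia
using Libdl

# Try to load each library by name.
# Returns a Dict mapping name => resolved path, or nothing if loading failed.
function probe_libraries(names)
    results = Dict{String,Union{String,Nothing}}()
    for name in names
        handle = Libdl.dlopen(name; throw_error = false)
        if handle === nothing
            results[name] = nothing
        else
            results[name] = Libdl.dlpath(handle)
            Libdl.dlclose(handle)
        end
    end
    return results
end

# Assumed Visual C++ runtime DLL names (illustrative, not taken from this issue).
vc_runtime = ["vcruntime140.dll", "vcruntime140_1.dll", "msvcp140.dll"]

if Sys.iswindows()
    for (name, path) in probe_libraries(vc_runtime)
        println(rpad(name, 22), path === nothing ? "NOT FOUND" : path)
    end
end
```

If any of these report NOT FOUND, installing or updating the redistributable is a reasonable first thing to try before investigating permissions or artifacts.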

maleadt commented 1 month ago

Great, thanks for confirming!