Can you try using the branch ncc/use-julia-v1.9.4, which, despite its original name, uses Julia v1.10.0?
On Tartarus with the above-mentioned branch things seem OK:
navidcy:Oceananigans.jl/ |ncc/use-julia-v1.9.4 β|$ julia-1.10 --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |
julia> using Oceananigans
[ Info: Oceananigans will use 48 threads
julia> grid = RectilinearGrid(GPU(),
size = (16, 16, 16),
x = (0, 1),
y = (0, 1),
z = (-1, 0),
topology = (Periodic, Periodic, Bounded))
16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.0625
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.0625
└── Bounded  z ∈ [-1.0, 0.0] regularly spaced with Δz=0.0625
julia> model = NonhydrostaticModel(; grid)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── timestepper: QuasiAdamsBashforth2TimeStepper
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing
julia> u, v, w = model.velocities
NamedTuple with 3 Fields on 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo:
├── u: 16×16×16 Field{Face, Center, Center} on RectilinearGrid on GPU
├── v: 16×16×16 Field{Center, Face, Center} on RectilinearGrid on GPU
└── w: 16×16×17 Field{Center, Center, Face} on RectilinearGrid on GPU
julia> maximum(u)
0.0
julia> maximum(w)
0.0
julia> maximum(v)
0.0
julia> maximum(abs, u)
0.0
julia> maximum(abs, w)
0.0
julia> maximum(abs, v)
0.0
While using main, I can indeed reproduce the error above...
navidcy:Oceananigans.jl/ |main β|$ julia-1.10 --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |
julia> using Oceananigans
┌ Warning: The active manifest file has dependencies that were resolved with a different julia version (1.9.3). Unexpected behavior may occur.
└ @ ~/Oceananigans.jl/Manifest.toml:0
┌ Warning: The project dependencies or compat requirements have changed since the manifest was last resolved.
│ It is recommended to `Pkg.resolve()` or consider `Pkg.update()` if necessary.
└ @ Pkg.API ~/julia-1.10/usr/share/julia/stdlib/v1.10/Pkg/src/API.jl:1800
Precompiling Oceananigans
1 dependency successfully precompiled in 21 seconds. 143 already precompiled.
[ Info: Oceananigans will use 48 threads
julia> grid = RectilinearGrid(GPU(),
size = (16, 16, 16),
x = (0, 1),
y = (0, 1),
z = (-1, 0),
topology = (Periodic, Periodic, Bounded))
16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── Periodic x ∈ [0.0, 1.0) regularly spaced with Δx=0.0625
├── Periodic y ∈ [0.0, 1.0) regularly spaced with Δy=0.0625
└── Bounded  z ∈ [-1.0, 0.0] regularly spaced with Δz=0.0625
julia> model = NonhydrostaticModel(; grid)
NonhydrostaticModel{GPU, RectilinearGrid}(time = 0 seconds, iteration = 0)
├── grid: 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo
├── timestepper: QuasiAdamsBashforth2TimeStepper
├── tracers: ()
├── closure: Nothing
├── buoyancy: Nothing
└── coriolis: Nothing
julia> u, v, w = model.velocities
NamedTuple with 3 Fields on 16×16×16 RectilinearGrid{Float64, Periodic, Periodic, Bounded} on GPU with 3×3×3 halo:
├── u: 16×16×16 Field{Face, Center, Center} on RectilinearGrid on GPU
├── v: 16×16×16 Field{Center, Face, Center} on RectilinearGrid on GPU
└── w: 16×16×17 Field{Center, Center, Face} on RectilinearGrid on GPU
julia> maximum(u)
0.0
julia> maximum(w)
0.0
julia> maximum(v)
0.0
julia> maximum(abs, u)
0.0
julia> maximum(abs, w)
ERROR: CUDA error: too many resources requested for launch (code 701, ERROR_LAUNCH_OUT_OF_RESOURCES)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:27
[2] check
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:34 [inlined]
[3] cuLaunchKernel
@ ~/.julia/packages/CUDA/nbRJk/lib/utils/call.jl:26 [inlined]
[4] (::CUDA.var"#867#868"{Bool, Int64, CUDA.CuStream, CUDA.CuFunction, CUDA.CuDim3, CUDA.CuDim3})(kernelParams::Vector{Ptr{Nothing}})
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:69
[5] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:33 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] pack_arguments(::CUDA.var"#867#868"{…}, ::CUDA.KernelState, ::CartesianIndices{…}, ::CartesianIndices{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ./none:0
[8] launch(f::CUDA.CuFunction, args::Vararg{…}; blocks::Union{…}, threads::Union{…}, cooperative::Bool, shmem::Integer, stream::CUDA.CuStream) where N
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:62 [inlined]
[9] #872
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:136 [inlined]
[10] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:95 [inlined]
[11] macro expansion
@ CUDA ./none:0 [inlined]
[12] convert_arguments
@ CUDA ./none:0 [inlined]
[13] #cudacall#871
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:135 [inlined]
[14] cudacall
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:134 [inlined]
[15] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:223 [inlined]
[16] macro expansion
@ CUDA ./none:0 [inlined]
[17] call(::CUDA.HostKernel{…}, ::typeof(identity), ::typeof(max), ::Nothing, ::CartesianIndices{…}, ::CartesianIndices{…}, ::Val{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…}; call_kwargs::@Kwargs{…})
@ CUDA ./none:0
[18] (::CUDA.HostKernel{…})(::Function, ::Vararg{…}; threads::Int64, blocks::Int64, kwargs::@Kwargs{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:345
[19] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:106 [inlined]
[20] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…}; init::Nothing)
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:271
[21] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:169
[22] mapreducedim!(f::Function, op::Function, R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/host/mapreduce.jl:10
[23] #maximum!#860
@ Base ./reducedim.jl:1034 [inlined]
[24] maximum!(f::Function, r::Field{…}, a::Oceananigans.AbstractOperations.ConditionalOperation{…}; condition::Nothing, mask::Float64, kwargs::@Kwargs{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:618
[25] maximum(f::Function, c::Field{…}; condition::Nothing, mask::Float64, dims::Function)
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:648
[26] maximum(f::Function, c::Field{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:637
[27] top-level scope
@ REPL[9]:1
[28] top-level scope
@ ~/.julia/packages/CUDA/nbRJk/src/initialization.jl:205
Some type information was truncated. Use `show(err)` to see complete types.
julia> maximum(abs, v)
ERROR: CUDA error: too many resources requested for launch (code 701, ERROR_LAUNCH_OUT_OF_RESOURCES)
Stacktrace:
[1] throw_api_error(res::CUDA.cudaError_enum)
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:27
[2] check
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/libcuda.jl:34 [inlined]
[3] cuLaunchKernel
@ ~/.julia/packages/CUDA/nbRJk/lib/utils/call.jl:26 [inlined]
[4] (::CUDA.var"#867#868"{Bool, Int64, CUDA.CuStream, CUDA.CuFunction, CUDA.CuDim3, CUDA.CuDim3})(kernelParams::Vector{Ptr{Nothing}})
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:69
[5] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:33 [inlined]
[6] macro expansion
@ ./none:0 [inlined]
[7] pack_arguments(::CUDA.var"#867#868"{…}, ::CUDA.KernelState, ::CartesianIndices{…}, ::CartesianIndices{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ./none:0
[8] launch(f::CUDA.CuFunction, args::Vararg{…}; blocks::Union{…}, threads::Union{…}, cooperative::Bool, shmem::Integer, stream::CUDA.CuStream) where N
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:62 [inlined]
[9] #872
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:136 [inlined]
[10] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:95 [inlined]
[11] macro expansion
@ CUDA ./none:0 [inlined]
[12] convert_arguments
@ CUDA ./none:0 [inlined]
[13] #cudacall#871
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:135 [inlined]
[14] cudacall
@ CUDA ~/.julia/packages/CUDA/nbRJk/lib/cudadrv/execution.jl:134 [inlined]
[15] macro expansion
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:223 [inlined]
[16] macro expansion
@ CUDA ./none:0 [inlined]
[17] call(::CUDA.HostKernel{…}, ::typeof(identity), ::typeof(max), ::Nothing, ::CartesianIndices{…}, ::CartesianIndices{…}, ::Val{…}, ::CUDA.CuDeviceArray{…}, ::Oceananigans.AbstractOperations.ConditionalOperation{…}; call_kwargs::@Kwargs{…})
@ CUDA ./none:0
[18] (::CUDA.HostKernel{…})(::Function, ::Vararg{…}; threads::Int64, blocks::Int64, kwargs::@Kwargs{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:345
[19] macro expansion
@ ~/.julia/packages/CUDA/nbRJk/src/compiler/execution.jl:106 [inlined]
[20] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…}; init::Nothing)
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:271
[21] mapreducedim!(f::typeof(identity), op::typeof(max), R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ CUDA ~/.julia/packages/CUDA/nbRJk/src/mapreduce.jl:169
[22] mapreducedim!(f::Function, op::Function, R::SubArray{…}, A::Oceananigans.AbstractOperations.ConditionalOperation{…})
@ GPUArrays ~/.julia/packages/GPUArrays/EZkix/src/host/mapreduce.jl:10
[23] #maximum!#860
@ Base ./reducedim.jl:1034 [inlined]
[24] maximum!(f::Function, r::Field{…}, a::Oceananigans.AbstractOperations.ConditionalOperation{…}; condition::Nothing, mask::Float64, kwargs::@Kwargs{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:618
[25] maximum(f::Function, c::Field{…}; condition::Nothing, mask::Float64, dims::Function)
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:648
[26] maximum(f::Function, c::Field{…})
@ Oceananigans.Fields ~/Oceananigans.jl/src/Fields/field.jl:637
[27] top-level scope
@ REPL[10]:1
[28] top-level scope
@ ~/.julia/packages/CUDA/nbRJk/src/initialization.jl:205
Some type information was truncated. Use `show(err)` to see complete types.
That suggests it's because the package dependencies on main were resolved with Julia v1.9.3:
┌ Warning: The active manifest file has dependencies that were resolved with a different julia version (1.9.3). Unexpected behavior may occur.
This issue will be resolved when #3403 is merged.
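In the meantime, re-resolving the manifest under the Julia version actually in use is what the Pkg warning above recommends; here is a minimal sketch, assuming it is run from the Oceananigans.jl project directory:

```julia
# Re-resolve the manifest with the running Julia version (1.10 here),
# as the Pkg warning suggests. Run from the Oceananigans.jl directory.
using Pkg
Pkg.activate(".")
Pkg.resolve()       # or Pkg.update() if compat requirements changed
Pkg.instantiate()
Pkg.precompile()
```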
It looks like the conditional reduction is too heavy for mapreduce. Perhaps @simone-silvestri has ideas to resolve this.
The operation should not be too large since the grid is very small. This is probably a symptom of a bug that does not affect the results but wastes computational resources somewhere in the conditional operation. I'll have a look.
I think the size dependence has to do with how mapreduce works; it breaks the reduction into chunks, and (10, 10, 10) might be just one chunk.
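For intuition, here is a minimal CPU-only sketch of that chunked, two-pass strategy (the helper name and chunk size are made up for illustration and are not CUDA.jl internals): a small array fits in a single chunk and takes one code path, while a larger array needs a second pass over per-chunk partial results, which is roughly how a size threshold like (10, 10, 10) can change which kernel gets launched.

```julia
# Illustrative two-pass reduction; each "chunk" stands in for one GPU block.
# chunked_mapreduce is a hypothetical helper, not part of CUDA.jl.
function chunked_mapreduce(f, op, A; chunk = 256)
    n = length(A)
    n <= chunk && return mapreduce(f, op, A)         # single-chunk path (small arrays)
    partials = [mapreduce(f, op, view(A, i:min(i + chunk - 1, n)))
                for i in 1:chunk:n]                  # pass 1: per-chunk partial results
    return reduce(op, partials)                      # pass 2: combine the partials
end

chunked_mapreduce(abs, max, randn(16 * 16 * 16))     # same answer as maximum(abs, ...)
```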
I also had this issue. Being new to running on GPUs, I was quite confused by this error. If this issue is not fixable, it would be helpful to at least point it out in the documentation.
I encountered this error by running a simulation based on the Langmuir turbulence tutorial on GPUs. Note that the print function prints maximum(abs, u), maximum(abs, v), and maximum(abs, w):
msg = @sprintf("i: %04d, t: %s, Δt: %s, umax = (%.1e, %.1e, %.1e) ms⁻¹, wall time: %s\n",
               iteration(simulation),
               prettytime(time(simulation)),
               prettytime(simulation.Δt),
               maximum(abs, u), maximum(abs, v), maximum(abs, w),
               prettytime(simulation.run_wall_time))
thus resulting in the error:
LoadError: CUDA error: too many resources requested for launch
For reference, the code works once the maximum functions are removed:
msg = @sprintf("i: %04d, t: %s, Δt: %s, wall time: %s\n",
               iteration(simulation),
               prettytime(time(simulation)),
               prettytime(simulation.Δt),
               prettytime(simulation.run_wall_time))
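For context, this is roughly how such a message is wired into a progress callback in the tutorial; a sketch assuming an existing simulation and model, with the IterationInterval value chosen arbitrarily:

```julia
# Sketch of a progress callback along the lines of the Langmuir turbulence tutorial.
# `simulation` is assumed to be an existing Oceananigans Simulation.
using Oceananigans, Printf

function progress(simulation)
    u, v, w = simulation.model.velocities
    msg = @sprintf("i: %04d, t: %s, Δt: %s, umax = (%.1e, %.1e, %.1e) ms⁻¹, wall time: %s\n",
                   iteration(simulation),
                   prettytime(time(simulation)),
                   prettytime(simulation.Δt),
                   maximum(abs, u), maximum(abs, v), maximum(abs, w),
                   prettytime(simulation.run_wall_time))
    @info msg
    return nothing
end

simulation.callbacks[:progress] = Callback(progress, IterationInterval(20))
```

On affected setups the maximum(abs, ...) calls inside this callback are what trigger the CUDA launch error above.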
Reopening this.
@simone-silvestri has declared an interest in fixing this.
Can you try maximum without abs? I think it's the abs (probably any function) that's the main issue.
@simone-silvestri, indeed, if I try maximum without abs the printing function works well. @glwagner is right: any function within maximum creates the same issue (I tested with sum).
Well, sum definitely won't work (it has to be a simple single-argument transformation), but you could try a function like
square(x) = x * x
or log if you want to be adventurous.
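A concrete version of that suggestion, as an untested sketch (whether it actually sidesteps the kernel resource limit would need checking):

```julia
# Workaround idea: swap abs for a squaring transformation and take a sqrt at the end.
square(x) = x * x
max_abs_v = sqrt(maximum(square, v))   # mathematically equal to maximum(abs, v)
```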
Is this still an issue? @xkykai's MWE runs fine for me (I went up to 256x256x256), and I've been doing maximum(abs, u) on the GPU for a few versions.
Out of curiosity, @josuemtzmo, are you able to reproduce the error on the latest versions of Julia, CUDA.jl, and Oceananigans.jl?
I'm using Oceananigans v0.91.7 with
julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd4843 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 24 × AMD Ryzen 9 5900X 12-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 24 virtual cores)
and
julia> Oceananigans.CUDA.versioninfo()
CUDA runtime 12.5, artifact installation
CUDA driver 12.5
NVIDIA driver 556.12.0
CUDA libraries:
- CUBLAS: 12.5.3
- CURAND: 10.3.6
- CUFFT: 11.2.3
- CUSOLVER: 11.6.3
- CUSPARSE: 12.5.1
- CUPTI: 2024.2.1 (API 23.0.0)
- NVML: 12.0.0+556.12
Julia packages:
- CUDA: 5.4.3
- CUDA_Driver_jll: 0.9.2+0
- CUDA_Runtime_jll: 0.14.1+0
Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7
1 device:
0: NVIDIA GeForce RTX 3080 (sm_86, 5.794 GiB / 10.000 GiB available)
Hello,
I've tested it in Oceananigans v0.91.8 with:
julia> versioninfo()
Julia Version 1.10.4
Commit 48d4fd48430 (2024-06-04 10:41 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Linux (x86_64-linux-gnu)
CPU: 64 × Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake-avx512)
Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)
Environment:
JULIA_CUDA_MEMORY_POOL = none
julia> Oceananigans.CUDA.versioninfo()
CUDA runtime 12.1, artifact installation
CUDA driver 12.1
NVIDIA driver 530.30.2
CUDA libraries:
- CUBLAS: 12.1.3
- CURAND: 10.3.2
- CUFFT: 11.0.2
- CUSOLVER: 11.4.5
- CUSPARSE: 12.1.0
- CUPTI: 2023.1.1 (API 18.0.0)
- NVML: 12.0.0+530.30.2
Julia packages:
- CUDA: 5.4.3
- CUDA_Driver_jll: 0.9.2+0
- CUDA_Runtime_jll: 0.14.1+0
Toolchain:
- Julia: 1.10.4
- LLVM: 15.0.7
Environment:
- JULIA_CUDA_MEMORY_POOL: none
Preferences:
- CUDA_Runtime_jll.version: 12.1
1 device:
0: Tesla V100-PCIE-32GB (sm_70, 30.884 GiB / 32.000 GiB available)
and the issue seems solved.
I agree with @ali-ramadhan; it seems this issue was fixed at some point. Although I haven't managed to pinpoint the version, I think I still had the issue when I was using CUDA v5.1.2.
(As discussed with @simone-silvestri.) I encountered this bug when trying to upgrade to Julia 1.10.0. What happens is that maximum(abs, v) doesn't work for grids larger than (10, 10, 10). However, maximum(abs, u), maximum(abs, w), maximum(abs, b), maximum(u), maximum(v), maximum(w), and maximum(b) work just fine.
Here's a MWE tested on Supercloud and Tartarus (a script form, reconstructed from the REPL session at the top of the thread, is sketched below). Note that line 20 is the last line of the code snippet, maximum(abs, v).
Here's the Julia version info:
Here's the CUDA runtime version:
In Julia 1.9 this does not seem to be a problem.
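A script form of the MWE, reconstructed from the REPL session at the top of the thread (it assumes a CUDA-capable GPU and the Oceananigans project environment):

```julia
using Oceananigans

grid = RectilinearGrid(GPU(),
                       size = (16, 16, 16),
                       x = (0, 1),
                       y = (0, 1),
                       z = (-1, 0),
                       topology = (Periodic, Periodic, Bounded))

model = NonhydrostaticModel(; grid)

u, v, w = model.velocities

maximum(abs, v)   # the call that errors on the affected configurations
```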