CliMA / Oceananigans.jl

🌊 Julia software for fast, friendly, flexible, ocean-flavored fluid dynamics on CPUs and GPUs
https://clima.github.io/OceananigansDocumentation/stable
MIT License
988 stars 194 forks source link

CUDA scalar operations error on GPU in `show` functions #3770

Open ali-ramadhan opened 1 month ago

ali-ramadhan commented 1 month ago

I feel like this impacts usability, especially interactive use, but it's not a bug in the sense that your simulation scripts are fine. Is it worth trying to fix this?

Maybe not. After all, executing model.velocities.u.data in the REPL with a GPU model produces a similar error to the one below.

One nuclear option is to allow scalar operations in show methods, but in this particular example it's for a SubArray{OffsetVector{CuArray}} so not a type we have control over without piracy.


MWE:

using Oceananigans

grid = RectilinearGrid(GPU(), size=(12, 12, 12), x=(0, 1), y=(0, 1), z=k->√k)

znodes(grid, Center(), Center(), Center())

produces this error when trying to show the result

12-element view(OffsetArray(::CuArray{Float64, 1, CUDA.DeviceMemory}, -2:15), 1:12) with eltype Float64:
Error showing value of type SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}:
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50 [inlined]
  [6] getindex
    @ ~/.julia/packages/OffsetArrays/hwmnB/src/OffsetArrays.jl:438 [inlined]
  [7] isassigned(A::OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, i::Int64)
    @ Base ./multidimensional.jl:1587
  [8] isassigned
    @ ./subarray.jl:386 [inlined]
  [9] isassigned(::SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}, ::Int64, ::Int64)
    @ Base ./multidimensional.jl:1582
 [10] alignment(io::IOContext{Base.TTY}, X::AbstractVecOrMat, rows::Vector{Int64}, cols::Vector{Int64}, cols_if_complete::Int64, cols_otherwise::Int64, sep::Int64, ncols::Int64)
    @ Base ./arrayshow.jl:68
 [11] _print_matrix(io::IOContext{Base.TTY}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
    @ Base ./arrayshow.jl:207
 [12] print_matrix(io::IOContext{Base.TTY}, X::SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64)
    @ Base ./arrayshow.jl:171
 [13] print_matrix
    @ ./arrayshow.jl:171 [inlined]
 [14] print_array
    @ ./arrayshow.jl:358 [inlined]
 [15] show(io::IOContext{Base.TTY}, ::MIME{Symbol("text/plain")}, X::SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true})
    @ Base ./arrayshow.jl:399
 [16] (::REPL.var"#55#56"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:273
 [17] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [18] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:259
 [19] display
    @ ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:278 [inlined]
 [20] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:340
 [21] #invokelatest#2
    @ ./essentials.jl:892 [inlined]
 [22] invokelatest
    @ ./essentials.jl:889 [inlined]
 [23] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:315
 [24] (::REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:284
 [25] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [26] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:282
 [27] (::REPL.var"#do_respond#80"{Bool, Bool, REPL.var"#93#103"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:911
 [28] #invokelatest#2
    @ ./essentials.jl:892 [inlined]
 [29] invokelatest
    @ ./essentials.jl:889 [inlined]
 [30] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/LineEdit.jl:2656
 [31] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:1312
 [32] (::REPL.var"#62#68"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:386

This doesn't happen for regularly spaces dimensions:

julia> xnodes(grid, Center(), Center(), Center())
12-element view(OffsetArray(::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, -2:15), 1:12) with eltype Float64:
 0.041666666666666664
 0.125
 0.20833333333333334
 0.2916666666666667
 0.375
 0.4583333333333333
 0.5416666666666666
 0.625
 0.7083333333333334
 0.7916666666666666
 0.875
 0.9583333333333334

because there is no data on the GPU to show

julia> xnodes(grid, Center(), Center(), Center()) |> typeof
SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Tuple{UnitRange{Int64}}, true}

julia> znodes(grid, Center(), Center(), Center()) |> typeof
SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}
glwagner commented 1 month ago

Technically we cannot fix this problem here as long as the nodes view into a CuArray, since we don't own any of those types. We might have to submit a PR to CUDA for this? I believe CUDA supports one-level-nested CuArray but not twice.

Another solution is to re-form the nodes so that they are a plain view. This should work, it's merely laziness / convenience that we make a view into an OffsetArray, rather than extracting a view directly from the parent (which would require computing interior indices.

Another solution would be to use Field for the nodes, or even an abstract operation that computes them on the fly. This would allow the nodes to participate in abstract operations. Now that we have a Makie recipe too, one argument for not doing this (inconvenient plotting) may be eliminated. There might be other inconveniences though.