View arrays on GPU cause scalar indexing error

The following snippet causes a scalar indexing error on the GPU. It is reproducible on both the latest release and the master branch.

using CUDA
using Flux
using GraphNeuralNetworks
CUDA.allowscalar(false)

g = rand_graph(128, 512)
xi = randn(Float32, 4, 128)

# Both of the following two lines are needed to reproduce the bug.
g, xi = gpu(g), gpu(xi)
xi = xj = view(xi, axes(xi)...)

# ERROR: LoadError: Scalar indexing is disallowed.
apply_edges((xi, xj, e) -> 0, g; xi, xj = xi)

kenta@lizzle:~/tmp$ julia -q
(tmp) pkg> st
Status `~/tmp/Project.toml`
⌅ [052768ef] CUDA v4.4.1
  [587475ba] Flux v0.14.6
  [cffab07f] GraphNeuralNetworks v0.6.14 `https://github.com/CarloLucibello/GraphNeuralNetworks.jl.git#master`
⌃ [02a925ec] cuDNN v1.1.1
Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated`

julia> 
kenta@lizzle:~/tmp$ julia gnn.jl 
ERROR: LoadError: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/uOYfN/src/GPUArraysCore.jl:103
  [3] getindex
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/indexing.jl:9 [inlined]
  [4] getindex
    @ ~/.julia/packages/GPUArrays/5XhED/src/host/indexing.jl:30 [inlined]
  [5] gather!(dst::CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, src::SubArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}, idx::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer})
    @ NNlib ~/.julia/packages/NNlib/5iRSB/src/gather.jl:107
  [6] gather
    @ ~/.julia/packages/NNlib/5iRSB/src/gather.jl:46 [inlined]
  [7] _gather(x::SubArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}, i::CuArray{Int64, 1, CUDA.Mem.DeviceBuffer})
    @ GraphNeuralNetworks.GNNGraphs ~/.julia/packages/GraphNeuralNetworks/gujpQ/src/GNNGraphs/gatherscatter.jl:4
  [8] apply_edges(f::var"#3#4", g::GNNGraph{Tuple{CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, Nothing}}, xi::SubArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}, xj::SubArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}, e::Nothing)
    @ GraphNeuralNetworks ~/.julia/packages/GraphNeuralNetworks/gujpQ/src/msgpass.jl:146
  [9] apply_edges(f::Function, g::GNNGraph{Tuple{CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, CuArray{Int64, 1, CUDA.Mem.DeviceBuffer}, Nothing}}; xi::SubArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}, xj::SubArray{Float32, 2, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}, false}, e::Nothing)
    @ GraphNeuralNetworks ~/.julia/packages/GraphNeuralNetworks/gujpQ/src/msgpass.jl:139
 [10] top-level scope
    @ ~/tmp/gnn.jl:14
in expression starting at /home/kenta/tmp/gnn.jl:14

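I think this is a regression because it started happening with GraphNeuralNetworks.jl 0.6.8: the script runs without error on 0.6.7 and fails on 0.6.8 (see below). I'm not sure which package actually causes the error, since the upgrade also moves NNlib from 0.8.21 to 0.9.7 and drops NNlibCUDA, but I noticed it when I updated GraphNeuralNetworks.jl.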
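To narrow down which package is responsible, it may help to call `NNlib.gather` directly, without GraphNeuralNetworks.jl in the loop. The sketch below is untested and rests on one assumption: frame [5] of the trace (`gather!` with a `SubArray` source in NNlib's `gather.jl`) is where the generic CPU method gets picked. The names `x`, `idx`, `v`, `y`, and `failed` are illustrative only.

```julia
using CUDA
using NNlib

CUDA.allowscalar(false)

x   = CuArray(randn(Float32, 4, 128))   # features on the GPU
idx = CuArray(rand(1:128, 512))         # gather indices on the GPU
v   = view(x, axes(x)...)               # same kind of view as in the snippet above

# Plain CuArray source: expected to reach the GPU gather implementation.
y = NNlib.gather(x, idx)

# SubArray source: same argument types as frame [5] of the trace.
failed = try
    NNlib.gather(v, idx)
    false
catch err
    println("gather on a view failed: ", err)
    true
end
```

If `failed` turns out to be `true` here, that would suggest the regression comes from how `gather` dispatches on wrapped GPU arrays in the newer NNlib rather than from GraphNeuralNetworks.jl itself.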

kenta@lizzle:~/tmp$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.3 (2023-08-24)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(tmp) pkg> add GraphNeuralNetworks@0.6.7
   Resolving package versions...
    Updating `~/tmp/Project.toml`
⌅ [587475ba] ↓ Flux v0.14.6 ⇒ v0.13.17
⌃ [cffab07f] ~ GraphNeuralNetworks v0.6.14 `https://github.com/CarloLucibello/GraphNeuralNetworks.jl.git#master` ⇒ v0.6.7
    Updating `~/tmp/Manifest.toml`
⌅ [587475ba] ↓ Flux v0.14.6 ⇒ v0.13.17
⌃ [cffab07f] ~ GraphNeuralNetworks v0.6.14 `https://github.com/CarloLucibello/GraphNeuralNetworks.jl.git#master` ⇒ v0.6.7
⌅ [872c559c] ↓ NNlib v0.9.7 ⇒ v0.8.21
  [a00861dc] + NNlibCUDA v0.2.7
⌅ [3bd65402] ↓ Optimisers v0.3.1 ⇒ v0.2.20
        Info Packages marked with ⌃ and ⌅ have new versions available, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`
Precompiling project...
  7 dependencies successfully precompiled in 16 seconds. 118 already precompiled.

(tmp) pkg> 
kenta@lizzle:~/tmp$ julia gnn.jl  # this works
kenta@lizzle:~/tmp$ julia
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.3 (2023-08-24)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

(tmp) pkg> add GraphNeuralNetworks@0.6.8
   Resolving package versions...
   Installed GraphNeuralNetworks ─ v0.6.8
    Updating `~/tmp/Project.toml`
  [587475ba] ↑ Flux v0.13.17 ⇒ v0.14.6
⌃ [cffab07f] ↑ GraphNeuralNetworks v0.6.7 ⇒ v0.6.8
    Updating `~/tmp/Manifest.toml`
  [587475ba] ↑ Flux v0.13.17 ⇒ v0.14.6
⌃ [cffab07f] ↑ GraphNeuralNetworks v0.6.7 ⇒ v0.6.8
  [872c559c] ↑ NNlib v0.8.21 ⇒ v0.9.7
  [a00861dc] - NNlibCUDA v0.2.7
  [3bd65402] ↑ Optimisers v0.2.20 ⇒ v0.3.1
        Info Packages marked with ⌃ have new versions available and may be upgradable.
Precompiling project...
  10 dependencies successfully precompiled in 22 seconds. 118 already precompiled.

(tmp) pkg> 
kenta@lizzle:~/tmp$ julia gnn.jl 
ERROR: LoadError: Scalar indexing is disallowed.
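
Until the cause is pinned down, one possible workaround is to materialize the view before passing it to `apply_edges`. This is a minimal, untested sketch: it assumes that `copy` on a `SubArray` of a `CuArray` runs as a GPU indexing operation and returns a plain `CuArray`. The names `x`, `v`, `xc`, and `out` are illustrative, and the message function is changed to `xi .+ xj` only so that the result has a checkable shape.

```julia
using CUDA
using Flux
using GraphNeuralNetworks

CUDA.allowscalar(false)

g = gpu(rand_graph(128, 512))
x = gpu(randn(Float32, 4, 128))
v = view(x, axes(x)...)          # the view that triggers the error

# Assumption: this allocates a new CuArray on the device without scalar indexing.
xc = copy(v)

# With a plain CuArray, the gather inside apply_edges should take the GPU path.
out = apply_edges((xi, xj, e) -> xi .+ xj, g; xi = xc, xj = xc)
```

The copy costs an extra allocation, so this only makes sense as a stopgap when the view is small or not reused.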

CarloLucibello / GraphNeuralNetworks.jl

View arrays on GPU cause scalar indexing error #349