JuliaGPU / GPUArrays.jl

Reusable array functionality for Julia's various GPU backends.

Vectorized getindex ignores `@inbounds` #542

Closed. pxl-th closed 23 hours ago

pxl-th commented 1 week ago

Vectorized `getindex` ignores `@inbounds` and still performs a device-to-host copy during the bounds check:

julia> x = AMDGPU.rand(Float32, 16);

julia> x[[1, 2, 3, 4]];
[D to H] ROCArray{Bool, 1, AMDGPU.Runtime.Mem.HIPBuffer}: (1,) -> Vector{Bool}: (1,)

julia> @inbounds x[[1, 2, 3, 4]];
[D to H] ROCArray{Bool, 1, AMDGPU.Runtime.Mem.HIPBuffer}: (1,) -> Vector{Bool}: (1,)

maleadt commented 1 day ago

Top-level `@inbounds` doesn't work; it only takes effect inside a function:

julia> struct Foo end

julia> Base.getindex(::Foo, Is...) = (@boundscheck(error("checking bounds")); 42)

julia> x = Foo()
Foo()

julia> x[1]
ERROR: checking bounds
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] getindex(::Foo, Is::Int64)
   @ Main ./REPL[29]:1
 [3] top-level scope
   @ REPL[31]:1

julia> @inbounds x[1]
ERROR: checking bounds
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] getindex(::Foo, Is::Int64)
   @ Main ./REPL[29]:1
 [3] top-level scope
   @ REPL[32]:1

julia> bar(x) = @inbounds x[1]
bar (generic function with 1 method)

julia> bar(x)
42

There does still seem to be something wrong with the vectorized getindex, though.
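
For readers following along, here is a minimal CPU-only sketch of one way `@inbounds` can get lost in a vectorized `getindex`. It is an illustration of the general mechanism, not a diagnosis of GPUArrays.jl. The `Checked` type and the `gather` function are hypothetical names made up for this example. The idea: `@inbounds` only reaches `@boundscheck` blocks in code that is inlined into the annotated caller, so a forwarding method needs `Base.@propagate_inbounds` to pass the caller's context on to the scalar method.

```julia
# Hypothetical wrapper type for illustration; not part of GPUArrays.jl.
struct Checked
    data::Vector{Float32}
end

# Scalar access: the check sits in a @boundscheck block, so it can be
# elided when this method is inlined into an @inbounds caller.
@inline function Base.getindex(c::Checked, i::Int)
    @boundscheck checkbounds(c.data, i)
    return @inbounds c.data[i]
end

# Vectorized access forwards to the scalar method. Base.@propagate_inbounds
# inlines this method and passes the caller's inbounds context down;
# without it, the scalar method would check bounds even under @inbounds.
Base.@propagate_inbounds function Base.getindex(c::Checked, is::Vector{Int})
    out = Vector{Float32}(undef, length(is))
    for (k, i) in enumerate(is)
        out[k] = c[i]
    end
    return out
end

# As in the Foo example above, @inbounds has to sit inside a function.
gather(c, is) = @inbounds c[is]
```

Calling `gather(c, [1, 2, 3])` on a `Checked` with at least three elements returns the first three values. Whether the checks are actually elided depends on inlining and on how Julia was started: `--check-bounds=yes` forces them back on.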