Closed behinger closed 1 year ago
Hi @behinger!
I have an error in both cases here:
julia> out[:,1] = cx # this works
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] assertscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/HaQcr/src/GPUArraysCore.jl:103
[3] getindex
@ ~/.julia/packages/GPUArrays/6STCb/src/host/indexing.jl:9 [inlined]
[4] iterate
@ ./abstractarray.jl:1167 [inlined]
[5] iterate
@ ./abstractarray.jl:1165 [inlined]
[6] _unsafe_setindex!(::IndexLinear, ::Matrix{Float64}, ::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ::Base.Slice{Base.OneTo{Int64}}, ::Int64)
@ Base ./multidimensional.jl:940
[7] _setindex!
@ ./multidimensional.jl:930 [inlined]
[8] setindex!(::Matrix{Float64}, ::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ::Function, ::Int64)
@ Base ./abstractarray.jl:1344
[9] top-level scope
@ REPL[11]:1
[10] top-level scope
@ ~/git/CUDA.jl/src/initialization.jl:163
julia> lsmr_solver = Krylov.LsmrSolver(size(cA)..., CuVector{Float64})
┌──────────────────┬───────────────────┬──────────────────┐
│ LsmrSolver│ nrows: 64│ ncols: 64│
├──────────────────┼───────────────────┼──────────────────┤
│Precision: Float64│ Architecture: GPU│Storage: 3.618 KiB│
├──────────────────┼───────────────────┼──────────────────┤
│ Attribute│ Type│ Size│
├──────────────────┼───────────────────┼──────────────────┤
│ m│ Int64│ 8 bytes│
│ n│ Int64│ 8 bytes│
│ x│CuArray{Float64, 1}│ 512 bytes│
│ Nv│CuArray{Float64, 1}│ 512 bytes│
│ Aᴴu│CuArray{Float64, 1}│ 512 bytes│
│ h│CuArray{Float64, 1}│ 512 bytes│
│ hbar│CuArray{Float64, 1}│ 512 bytes│
│ Mu│CuArray{Float64, 1}│ 512 bytes│
│ Av│CuArray{Float64, 1}│ 512 bytes│
│ err_vec│ Vector{Float64}│ 40 bytes│
│ stats│ LsmrStats{Float64}│ 65 bytes│
└──────────────────┴───────────────────┴──────────────────┘
LsmrStats
niter: 0
solved: false
inconsistent: false
residuals: []
Aresiduals: []
residual: 0.0
Aresidual: 0.0
κ₂(A): 0.0
‖A‖F: 0.0
xNorm: 0.0
timer: 0.00μs
status: unknown
julia> Krylov.solve!(lsmr_solver, cA, cb; history=true)
┌──────────────────┬───────────────────┬──────────────────┐
│ LsmrSolver│ nrows: 64│ ncols: 64│
├──────────────────┼───────────────────┼──────────────────┤
│Precision: Float64│ Architecture: GPU│Storage: 5.240 KiB│
├──────────────────┼───────────────────┼──────────────────┤
│ Attribute│ Type│ Size│
├──────────────────┼───────────────────┼──────────────────┤
│ m│ Int64│ 8 bytes│
│ n│ Int64│ 8 bytes│
│ x│CuArray{Float64, 1}│ 512 bytes│
│ Nv│CuArray{Float64, 1}│ 512 bytes│
│ Aᴴu│CuArray{Float64, 1}│ 512 bytes│
│ h│CuArray{Float64, 1}│ 512 bytes│
│ hbar│CuArray{Float64, 1}│ 512 bytes│
│ Mu│CuArray{Float64, 1}│ 512 bytes│
│ Av│CuArray{Float64, 1}│ 512 bytes│
│ err_vec│ Vector{Float64}│ 40 bytes│
│ stats│ LsmrStats{Float64}│ 1.686 KiB│
└──────────────────┴───────────────────┴──────────────────┘
LsmrStats
niter: 101
solved: true
inconsistent: true
residuals: [ 8.0e+00 6.8e-01 4.5e-01 ... 2.0e-01 2.0e-01 2.0e-01 ]
Aresiduals: [ 2.5e+02 1.6e+00 7.0e-01 ... 1.4e-04 1.4e-04 1.4e-04 ]
residual: 0.19902642812751492
Aresidual: 0.00013911918604509325
κ₂(A): 32937.386487626245
‖A‖F: 125.32881873355007
xNorm: 1.4031718348848208
timer: 93.31ms
status: truncated forward error small enough
julia> out[:,2] = lsmr_solver.x # this throws an error
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:35
[2] assertscalar(op::String)
@ GPUArraysCore ~/.julia/packages/GPUArraysCore/HaQcr/src/GPUArraysCore.jl:103
[3] getindex
@ ~/.julia/packages/GPUArrays/6STCb/src/host/indexing.jl:9 [inlined]
[4] iterate
@ ./abstractarray.jl:1167 [inlined]
[5] iterate
@ ./abstractarray.jl:1165 [inlined]
[6] _unsafe_setindex!(::IndexLinear, ::Matrix{Float64}, ::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ::Base.Slice{Base.OneTo{Int64}}, ::Int64)
@ Base ./multidimensional.jl:940
[7] _setindex!
@ ./multidimensional.jl:930 [inlined]
[8] setindex!(::Matrix{Float64}, ::CuArray{Float64, 1, CUDA.Mem.DeviceBuffer}, ::Function, ::Int64)
@ Base ./abstractarray.jl:1344
[9] top-level scope
@ REPL[14]:1
[10] top-level scope
@ ~/git/CUDA.jl/src/initialization.jl:163
well, that is a surprise and much more saner than what we got. Indeed, we just now tried on a fresh REPL and we got the same result. Maybe we had a weird out-of-order problem, but we thought we checked on a fresh repl. Sorry & thanks!
It now makes much more sense to us also, CUDA warns us, that moving a CuArray into a normal Array is an operation moving memory from GPU to CPU, which you dont want to do typically.
PS: The GPU Krylov moved our fit from 75s to 2s, so that is "nice". Thanks a thousand times!!
PS: The GPU Krylov moved our fit from 75s to 2s, so that is "nice". Thanks a thousand times!!
Great!!! May I ask you the field of your application and the size of the least-squares problem? It's always a pleasure to have this kind of feedbacks on my thesis project. 🙂
yeah sure! It is not yet there, but will be in https://github.com/unfoldtoolbox/unfold.jl/ (currently switching everything to extensions). The problems we solve are typically ~3 million rows x 10-40k columns, sparsity around 0.1%-1%, with a scaled toeplitz/diagnoal kind of summation/concatination we typically have 20-30 subjects with each 30-120 electrodes, for each we solve the same problem (PS: I don't think lsmr allows for hot-start, but maybe I simply missed it)
PPS: I dont use preconditioner right now because I dont have any background in that field :-D
Hey! we (@jschepers) and myself stumbled over an
ERROR: LoadError: Scalar indexing is disallowed.
error when accessing the output of a solver.Easiest visible in this MWE:
Why does accessing the CuArray
lsmr_solver.x
directly result in such an error in the second case, but not when being outputted ascx
in the first example viacx,cstat = lsmr(cA,cB)
?Cheers, Judith & Benedikt