SciML / SteadyStateDiffEq.jl

Solvers for steady states in scientific machine learning (SciML)
Other
30 stars 22 forks source link

Error with SteadyStateProblem and SSRootfind on GPU #39

Closed p-hss closed 1 year ago

p-hss commented 1 year ago

Hi, I am a Julia beginner so this might be a trivial mistake on my side... When I run the following code

using DifferentialEquations, DiffEqSensitivity, CUDA

CUDA.allowscalar(false) 

N = 10
p = ones(N)
u0 = ones(N)

p = cu(p)
u0 = cu(u0)

f(u, p, t) = u .* u .- (2f0 .* p) 

probN = SteadyStateProblem(f, u0, p)
solution = solve(probN, SSRootfind())

CUDA.allowscalar(true) 
println(Array(solution))

I get the error:

ERROR: LoadError: Scalar indexing is disallowed.
Invocation of setindex! resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore are only permitted from the REPL for prototyping purposes.
If you did intend to index this array, annotate the caller with @allowscalar.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:33
  [2] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/rSIl2/src/GPUArraysCore.jl:78
  [3] setindex!
    @ ~/.julia/packages/GPUArrays/gok9K/src/host/indexing.jl:17 [inlined]
  [4] trust_region_(df::NLSolversBase.OnceDifferentiable{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, xtol::Float32, ftol::Float32, iterations::Int64, store_trace::Bool, show_trace::Bool, extended_trace::Bool, factor::Float32, autoscale::Bool, cache::NLsolve.NewtonTrustRegionCache{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}})
    @ NLsolve ~/.julia/packages/NLsolve/gJL1I/src/solvers/trust_region.jl:150
  [5] trust_region (repeats 2 times)
    @ ~/.julia/packages/NLsolve/gJL1I/src/solvers/trust_region.jl:235 [inlined]
  [6] nlsolve(df::NLSolversBase.OnceDifferentiable{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}}, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}; method::Symbol, xtol::Float32, ftol::Float64, iterations::Int64, store_trace::Bool, show_trace::Bool, extended_trace::Bool, linesearch::LineSearches.Static, linsolve::NLsolve.var"#27#29", factor::Float32, autoscale::Bool, m::Int64, beta::Int64, aa_start::Int64, droptol::Float32)
    @ NLsolve ~/.julia/packages/NLsolve/gJL1I/src/nlsolve/nlsolve.jl:26
  [7] nlsolve(f::Function, initial_x::CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}; method::Symbol, autodiff::Symbol, inplace::Bool, kwargs::Base.Pairs{Symbol, Float64, Tuple{Symbol}, NamedTuple{(:ftol,), Tuple{Float64}}})
    @ NLsolve ~/.julia/packages/NLsolve/gJL1I/src/nlsolve/nlsolve.jl:52
  [8] #2
    @ ~/.julia/packages/SteadyStateDiffEq/qtEru/src/algorithms.jl:6 [inlined]
  [9] __solve(::SteadyStateProblem{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ODEFunction{false, typeof(f), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::SSRootfind{SteadyStateDiffEq.var"#2#4"}; abstol::Float64, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ SteadyStateDiffEq ~/.julia/packages/SteadyStateDiffEq/qtEru/src/solve.jl:44
 [10] __solve
    @ ~/.julia/packages/SteadyStateDiffEq/qtEru/src/solve.jl:5 [inlined]
 [11] #solve_call#28
    @ ~/.julia/packages/DiffEqBase/tp3vN/src/solve.jl:437 [inlined]
 [12] solve_call
    @ ~/.julia/packages/DiffEqBase/tp3vN/src/solve.jl:409 [inlined]
 [13] #solve_up#34
    @ ~/.julia/packages/DiffEqBase/tp3vN/src/solve.jl:780 [inlined]
 [14] solve_up
    @ ~/.julia/packages/DiffEqBase/tp3vN/src/solve.jl:765 [inlined]
 [15] #solve#33
    @ ~/.julia/packages/DiffEqBase/tp3vN/src/solve.jl:760 [inlined]
 [16] solve(prob::SteadyStateProblem{CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, false, CuArray{Float32, 1, CUDA.Mem.DeviceBuffer}, ODEFunction{false, typeof(f), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, args::SSRootfind{SteadyStateDiffEq.var"#2#4"})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/tp3vN/src/solve.jl:755
 [17] top-level scope
    @ ~/projects/lpjml/src/root_finding_diff.jl:15
in expression starting at /home/ftei-dsw/projects/lpjml/src/root_finding_diff.jl:15

I am not sure what the problem is and any help would be greatly appreciated! :) My environment is:

[052768ef] CUDA v3.12.0 [41bf760c] DiffEqSensitivity v6.79.0 [0c46a032] DifferentialEquations v7.2.0

and the GPU is a Quadro RTX 5000 Mobile / Max-Q.

ChrisRackauckas commented 1 year ago

SSRootfind defaults to using NLsolve.jl which doesn't support GPUs. This is something I plan to work on via NonlinearSolve.jl rather soon.

p-hss commented 1 year ago

Okay I see! Thanks a lot for the quick response!