SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License

Ensemble simulation's solver parameter support on v1.18 #176

Closed · TheStarAlight closed this 1 year ago

TheStarAlight commented 2 years ago

I'm very glad to see the v1.18 update :D However, solver parameter support seems incomplete, and I just ran into this problem. Still using the Lorenz example:

using DiffEqGPU, OrdinaryDiffEq, CUDA, StaticArrays, BenchmarkTools
function lorenz(u, p, t)
    σ = p[1]
    ρ = p[2]
    β = p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] * (ρ - u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return @SVector [du1, du2, du3]
end
u0 = @SVector [1.0; 0.0; 0.0]
tspan = (0.0, 10.0)
p = @SVector [10.0, 28.0, 8 / 3.0]
prob = ODEProblem(lorenz, u0, tspan, p)
prob_func = (prob, i, repeat) -> remake(prob, p=p)
monteprob = EnsembleProblem(prob, prob_func=prob_func, safetycopy=false)
trajNum = 10000
CUDA.@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(), trajectories=trajNum, adaptive=true, save_everystep=false)

Compared with my earlier successful attempt, where adaptive=true and save_everystep=false were not passed, this time an error is thrown:

julia> CUDA.@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(), trajectories=trajNum, adaptive=true, save_everystep=false)

ERROR: InvalidIRError: compiling kernel #atsit5_kernel(CuDeviceVector{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1}, CuDeviceMatrix{SVector{3, Float64}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{false}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to gpuatsit5_init)
Stacktrace:
 [1] atsit5_kernel
   @ C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\gpu_tsit5.jl:427
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(DiffEqGPU.atsit5_kernel), Tuple{CuDeviceVector{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1}, CuDeviceMatrix{SVector{3, Float64}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{false}}}}, args::LLVM.Module)
    @ GPUCompiler C:\Users\ZMY\.julia\packages\GPUCompiler\jVY4I\src\validation.jl:141
  [2] macro expansion
    @ C:\Users\ZMY\.julia\packages\GPUCompiler\jVY4I\src\driver.jl:418 [inlined]
  [3] macro expansion
    @ C:\Users\ZMY\.julia\packages\TimerOutputs\jgSVI\src\TimerOutput.jl:252 [inlined]
  [4] macro expansion
    @ C:\Users\ZMY\.julia\packages\GPUCompiler\jVY4I\src\driver.jl:416 [inlined]
  [5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
    @ GPUCompiler C:\Users\ZMY\.julia\packages\GPUCompiler\jVY4I\src\utils.jl:64
  [6] cufunction_compile(job::GPUCompiler.CompilerJob, ctx::LLVM.Context)
    @ CUDA C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:354
  [7] #224
    @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:347 [inlined]
  [8] JuliaContext(f::CUDA.var"#224#225"{GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(DiffEqGPU.atsit5_kernel), Tuple{CuDeviceVector{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1}, CuDeviceMatrix{SVector{3, Float64}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{false}}}}})
    @ GPUCompiler C:\Users\ZMY\.julia\packages\GPUCompiler\jVY4I\src\driver.jl:76
  [9] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:346
 [10] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler C:\Users\ZMY\.julia\packages\GPUCompiler\jVY4I\src\cache.jl:90
 [11] cufunction(f::typeof(DiffEqGPU.atsit5_kernel), tt::Type{Tuple{CuDeviceVector{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1}, CuDeviceMatrix{SVector{3, Float64}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{false}}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:299
 [12] cufunction
    @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:293 [inlined]
 [13] macro expansion
    @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\compiler\execution.jl:102 [inlined]
 [14] vectorized_asolve(probs::CuArray{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, 1, CUDA.Mem.DeviceBuffer}, prob::ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, alg::SimpleDiffEq.GPUSimpleATsit5; dt::Float32, saveat::Nothing, save_everystep::Bool, abstol::Float32, reltol::Float32, debug::Bool, kwargs::Base.Pairs{Symbol, DiffEqGPU.var"#12#18", Tuple{Symbol}, NamedTuple{(:unstable_check,), Tuple{DiffEqGPU.var"#12#18"}}})
    @ DiffEqGPU C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\gpu_tsit5.jl:266
 [15] batch_solve(ensembleprob::EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::GPUTsit5, ensemblealg::EnsembleGPUKernel, I::UnitRange{Int64}, adaptive::Bool; kwargs::Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:unstable_check, :save_everystep), Tuple{DiffEqGPU.var"#12#18", Bool}}})
    @ DiffEqGPU C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\DiffEqGPU.jl:324
 [16] macro expansion
    @ .\timing.jl:299 [inlined]
 [17] __solve(ensembleprob::EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::GPUTsit5, ensemblealg::EnsembleGPUKernel; trajectories::Int64, batch_size::Int64, unstable_check::Function, adaptive::Bool, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:save_everystep,), Tuple{Bool}}})
    @ DiffEqGPU C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\DiffEqGPU.jl:231
 [18] #solve#35
    @ C:\Users\ZMY\.julia\packages\DiffEqBase\72SnT\src\solve.jl:818 [inlined]
 [19] macro expansion
    @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\utilities.jl:25 [inlined]
 [20] top-level scope
    @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\pool.jl:490 [inlined]
 [21] top-level scope
    @ .\REPL[2]:0
 [22] top-level scope
    @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\initialization.jl:52
TheStarAlight commented 2 years ago
OMG, I just forgot the dt argument; after adding dt, the solver ran successfully. So to sum up (dt should always be assigned):

                        adaptive=true    adaptive=false
save_everystep=true     err              (not supported)
save_everystep=false    works            works
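
For completeness, the call that works after adding dt should look something like this (same setup as above; 0.1 here is just an example value for the initial step size):

# dt is required by the GPU kernel solver even with adaptive=true (it serves as the initial step size)
CUDA.@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(), trajectories=trajNum, adaptive=true, save_everystep=false, dt=0.1)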
TheStarAlight commented 2 years ago

But here comes another problem: I tried to adjust the fraction of the load offloaded to the CPU (the default EnsembleGPUKernel() is actually EnsembleGPUKernel(0.2)), but every trial failed. Only EnsembleGPUKernel(0.0) passed without error.

julia> CUDA.@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(0.2), trajectories=trajNum, adaptive=false, save_everystep=false, dt=0.1)
ERROR: TaskFailedException
Stacktrace:
 [1] wait
   @ .\task.jl:334 [inlined]
 [2] __solve(ensembleprob::EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::GPUTsit5, ensemblealg::EnsembleGPUKernel; trajectories::Int64, batch_size::Int64, unstable_check::Function, adaptive::Bool, kwargs::Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}})
   @ DiffEqGPU C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\DiffEqGPU.jl:235
 [3] #solve#35
   @ C:\Users\ZMY\.julia\packages\DiffEqBase\72SnT\src\solve.jl:818 [inlined]
 [4] macro expansion
   @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\utilities.jl:25 [inlined]
 [5] top-level scope
   @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\pool.jl:490 [inlined]
 [6] top-level scope
   @ .\REPL[11]:0
 [7] top-level scope
   @ C:\Users\ZMY\.julia\packages\CUDA\DfvRa\src\initialization.jl:52

    nested task error: TaskFailedException
    Stacktrace:
     [1] wait
       @ .\task.jl:334 [inlined]
     [2] threading_run(func::Function)
       @ Base.Threads .\threadingconstructs.jl:38
     [3] macro expansion
       @ .\threadingconstructs.jl:97 [inlined]
     [4] tmap(f::Function, args::UnitRange{Int64})
       @ SciMLBase C:\Users\ZMY\.julia\packages\SciMLBase\cheTp\src\ensemble\basic_ensemble_solve.jl:173
     [5] #solve_batch#505
       @ C:\Users\ZMY\.julia\packages\SciMLBase\cheTp\src\ensemble\basic_ensemble_solve.jl:164 [inlined]
     [6] f
       @ C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\DiffEqGPU.jl:221 [inlined]
     [7] macro expansion
       @ C:\Users\ZMY\.julia\packages\DiffEqGPU\EdiL9\src\DiffEqGPU.jl:226 [inlined]
     [8] (::DiffEqGPU.var"#8#14"{DiffEqGPU.var"#f#13"{Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}}, EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, SimpleDiffEq.GPUSimpleTsit5, UnitRange{Int64}}})()
       @ DiffEqGPU .\task.jl:123

        nested task error: MethodError: no method matching solve(::ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::SimpleDiffEq.GPUSimpleTsit5; save_everystep=false, dt=0.1)
        Closest candidates are:
          solve(::ODEProblem, ::SimpleDiffEq.GPUSimpleTsit5; dt) at C:\Users\ZMY\.julia\packages\SimpleDiffEq\vIpeG\src\tsit5\gpuatsit5.jl:9 got unsupported keyword argument "save_everystep"
          solve(::SciMLBase.AbstractDEProblem, ::Any...; sensealg, u0, p, kwargs...) at C:\Users\ZMY\.julia\packages\DiffEqBase\72SnT\src\solve.jl:765
          solve(::Any...; kwargs...) at C:\Users\ZMY\.julia\packages\CommonSolve\TGRvG\src\CommonSolve.jl:23
          ...
        Stacktrace:
         [1] kwerr(::NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}, ::Function, ::ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ::SimpleDiffEq.GPUSimpleTsit5)
           @ Base .\error.jl:163
         [2] batch_func(i::Int64, prob::EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::SimpleDiffEq.GPUSimpleTsit5; kwargs::Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}})
           @ SciMLBase C:\Users\ZMY\.julia\packages\SciMLBase\cheTp\src\ensemble\basic_ensemble_solve.jl:92
         [3] (::SciMLBase.var"#507#509"{Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}}, EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, SimpleDiffEq.GPUSimpleTsit5})(i::Int64)
           @ SciMLBase C:\Users\ZMY\.julia\packages\SciMLBase\cheTp\src\ensemble\basic_ensemble_solve.jl:165
         [4] macro expansion
           @ C:\Users\ZMY\.julia\packages\SciMLBase\cheTp\src\ensemble\basic_ensemble_solve.jl:174 [inlined]
         [5] (::SciMLBase.var"#400#threadsfor_fun#510"{SciMLBase.var"#507#509"{Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}}, EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, SimpleDiffEq.GPUSimpleTsit5}, Tuple{UnitRange{Int64}}, Vector{Union{}}, UnitRange{Int64}})(onethread::Bool)
           @ SciMLBase .\threadingconstructs.jl:85
         [6] (::SciMLBase.var"#400#threadsfor_fun#510"{SciMLBase.var"#507#509"{Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:save_everystep, :dt), Tuple{Bool, Float64}}}, EnsembleProblem{ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, var"#1#2", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, SimpleDiffEq.GPUSimpleTsit5}, Tuple{UnitRange{Int64}}, Vector{Union{}}, UnitRange{Int64}})()
           @ SciMLBase .\threadingconstructs.jl:52
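
For reference, the only variant that passed in my tests keeps every trajectory on the GPU, i.e. cpu_offload = 0.0 (same keyword arguments as the failing call above):

# EnsembleGPUKernel(0.0): nothing is offloaded to the CPU fallback, so the CPU-side dispatch is never reached
CUDA.@time sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(0.0), trajectories=trajNum, adaptive=false, save_everystep=false, dt=0.1)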
TheStarAlight commented 2 years ago

It seems that the program throws exceptions only when EnsembleGPUKernel(x ≠ 0) is used, so I guess it is caused by a missing dispatch: there is no appropriate method for solving GPUTsit5 on the CPU that accepts the save_everystep argument. I referred to SimpleDiffEq\vIpeG\src\tsit5\gpuatsit5.jl and found that the non-adaptive solve() indeed doesn't support that keyword, as the error suggests. The case above is the non-adaptive one (adaptive=false). In the same file, the adaptive solve() does have the save_everystep kwarg, but then other kinds of errors are thrown... Is the function not yet ready for use?
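
To make the dispatch problem concrete, here is a tiny standalone illustration (the function names are made up; this is not code from SimpleDiffEq.jl): a method declared with a fixed keyword list rejects extra kwargs, while one that also takes kwargs... simply absorbs them, which is what the CPU fallback path seems to need.

# Hypothetical toy example of the keyword-dispatch issue; f_strict / f_tolerant are made-up names.
f_strict(x; dt) = dt * x                  # analogous to solve(::ODEProblem, ::GPUSimpleTsit5; dt)
f_tolerant(x; dt, kwargs...) = dt * x     # extra keyword arguments are accepted and ignored

f_strict(1.0; dt=0.1)                               # works
# f_strict(1.0; dt=0.1, save_everystep=false)       # MethodError: got unsupported keyword argument
f_tolerant(1.0; dt=0.1, save_everystep=false)       # works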

utkarsh530 commented 2 years ago

SimpleDiffEq.jl was the initial version of the now-mature OrdinaryDiffEq.jl, so it has not been maintained as actively. When running on the CPU, the new solvers dispatch to SimpleDiffEq.jl to provide a reasonable performance estimate to compare against the GPU ones. I'll fix things.
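
(Conceptually, EnsembleGPUKernel(cpu_offload) splits the ensemble: a fraction of the trajectories is solved on the CPU via the SimpleDiffEq-style solver, and the rest run inside the GPU kernel. A rough sketch of that split, not the actual DiffEqGPU internals:)

# Rough conceptual sketch only (not DiffEqGPU's real code): how a cpu_offload
# fraction could divide the trajectories between the CPU fallback and the GPU kernel.
function split_trajectories(trajectories::Int, cpu_offload::Float64)
    n_cpu = round(Int, cpu_offload * trajectories)
    return n_cpu, trajectories - n_cpu
end

split_trajectories(10_000, 0.2)  # (2000, 8000) with the default 0.2 offload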

And yes, you'll need to provide the initial dt.

Just a heads up: using Float32 is preferred on GPUs. https://cuda.juliagpu.org/stable/tutorials/introduction/#A-simple-example-on-the-CPU
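
For example, the Lorenz setup above could be written with Float32 literals throughout (just a sketch of the same problem, not benchmarked here):

# Single-precision version of the setup: 32-bit literals keep the state,
# time span, and parameters in Float32 on the GPU.
u0 = @SVector [1.0f0, 0.0f0, 0.0f0]
tspan = (0.0f0, 10.0f0)
p = @SVector [10.0f0, 28.0f0, 8.0f0 / 3.0f0]
prob = ODEProblem(lorenz, u0, tspan, p)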

TheStarAlight commented 2 years ago

Thanks for your advice! I know GPUs are better at single-precision than double-precision computation, but in my recent program, which solves a quantum-mechanical wave equation, I didn't get a satisfactory result unless I switched to double precision; it seems that in QM simulations you must use double precision. I have given up on that idea since then. 😂 Anyway, many thanks.

ChrisRackauckas commented 1 year ago

@utkarsh530 was this handled?

utkarsh530 commented 1 year ago

Yes, need to tag SimpleDiffEq.jl.

ChrisRackauckas commented 1 year ago

Tagged.