SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License
279 stars 29 forks source link

New GPU Tsit 5 solvers #148

Closed utkarsh530 closed 2 years ago

utkarsh530 commented 2 years ago

Hi,

I have added Tim Besard's work here to continue our implementation. @ChrisRackauckas

utkarsh530 commented 2 years ago

@maleadt I am trying to implement the adaptive version of vectorized_solve called vectorized_asolve on similar lines. However, I am new to CUDA programming and tried to follow this https://github.com/SciML/SimpleDiffEq.jl/blob/master/src/tsit5/gpuatsit5.jl#L64-L178 to have an adaptive version. However, I am getting this error:

julia> gpu_sol = vectorized_asolve(prob, ps, GPUSimpleTsit5(); dt, debug = false)
ERROR: InvalidIRError: compiling kernel atsit5_kernel(ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}) resulted in invalid LLVM IR
Reason: unsupported dynamic function invocation (call to >)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:247
Reason: unsupported dynamic function invocation (call to materialize)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:265
Reason: unsupported dynamic function invocation (call to *)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:265
Reason: unsupported dynamic function invocation (call to broadcasted)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:265
Reason: unsupported dynamic function invocation (call to min)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported dynamic function invocation (call to max)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:279
Reason: unsupported dynamic function invocation (call to getindex)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:295
Reason: unsupported dynamic function invocation (call to afoldl)
Stacktrace:
 [1] +
   @ ./operators.jl:655
 [2] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:258
Reason: unsupported dynamic function invocation (call to afoldl)
Stacktrace:
 [1] +
   @ ./operators.jl:655
 [2] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:260
Reason: unsupported dynamic function invocation (call to afoldl)
Stacktrace:
 [1] +
   @ ./operators.jl:655
 [2] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:263
Reason: unsupported dynamic function invocation (call to ODE_DEFAULT_NORM)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:266
Reason: unsupported dynamic function invocation (call to iszero)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:268
Reason: unsupported use of an undefined name (use of 'qmax')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:269
Reason: unsupported dynamic function invocation (call to inv)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:269
Reason: unsupported dynamic function invocation (call to >)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:275
Reason: unsupported use of an undefined name (use of 'qmin')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:276
Reason: unsupported dynamic function invocation (call to inv)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:276
Reason: unsupported use of an undefined name (use of 'qmax')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported dynamic function invocation (call to inv)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'qmin')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'gamma')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported dynamic function invocation (call to /)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'qoldinit')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:279
Reason: unsupported use of an undefined name (use of 'cur_t')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:295
Reason: unsupported dynamic function invocation (call to <=)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:295
Reason: unsupported dynamic function invocation (call to max)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:278
Reason: unsupported use of an undefined name (use of 'beta1')
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:271
Reason: unsupported dynamic function invocation (call to ^)
Stacktrace:
 [1] atsit5_kernel
   @ ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:271
HINT: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget, CUDA.CUDACompilerParams, GPUCompiler.FunctionSpec{typeof(DiffEqGPU.atsit5_kernel), Tuple{ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}}}}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/validation.jl:119
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/I9fZc/src/driver.jl:327 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/nDhDw/src/TimerOutput.jl:252 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/I9fZc/src/driver.jl:325 [inlined]
  [5] emit_asm(job::GPUCompiler.CompilerJob, ir::LLVM.Module; strip::Bool, validate::Bool, format::LLVM.API.LLVMCodeGenFileType)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/utils.jl:64
  [6] cufunction_compile(job::GPUCompiler.CompilerJob)
    @ CUDA ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:326
  [7] cached_compilation(cache::Dict{UInt64, Any}, job::GPUCompiler.CompilerJob, compiler::typeof(CUDA.cufunction_compile), linker::typeof(CUDA.cufunction_link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/I9fZc/src/cache.jl:90
  [8] cufunction(f::typeof(DiffEqGPU.atsit5_kernel), tt::Type{Tuple{ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}}}; name::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ CUDA ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:297
  [9] cufunction(f::typeof(DiffEqGPU.atsit5_kernel), tt::Type{Tuple{ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, CuDeviceVector{SVector{3, Float32}, 1}, CuDeviceMatrix{SVector{3, Float32}, 1}, CuDeviceMatrix{Float32, 1}, Float32, Float32, Float32, Val{false}, Val{true}}})
    @ CUDA ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:291
 [10] macro expansion
    @ ~/.julia/packages/CUDA/5jdFl/src/compiler/execution.jl:102 [inlined]
 [11] vectorized_asolve(prob::ODEProblem{SVector{3, Float32}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, ps::CuArray{SVector{3, Float32}, 1, CUDA.Mem.DeviceBuffer}, alg::GPUSimpleTsit5; dt::Float32, saveat::Nothing, save_everystep::Bool, abstol::Float32, reltol::Float32, debug::Bool)
    @ DiffEqGPU ~/.julia/dev/DiffEqGPU/src/gpu_tsit5.jl:91
 [12] top-level scope
    @ REPL[13]:1

MWE:

using DiffEqGPU, SimpleDiffEq, StaticArrays, CUDA
#using Plots

function loop(u, p, t)
    σ = p[1]; ρ = p[2]; β = p[3]
    du1 = σ*(u[2]-u[1])
    du2 = u[1]*(ρ-u[3]) - u[2]
    du3 = u[1]*u[2] - β*u[3]
    return SVector{3}(du1, du2, du3)
end
func = ODEFunction(loop)
u0 = 10ones(Float32,3)
su0= SVector{3}(u0)
dt = 1f-1
tspan = (0.0f0, 10.0f0)

prob = ODEProblem{false}(loop, SVector{3}(u0), (0.0f0, 10.0f0),  SA_F32[10, 28, 8/3])
# sol2 = solve(prob,GPUSimpleTsit5())

@info "GPU version"
N = 1000
ps = CuArray([@SVector [10f0,28f0,8/3f0] for i in 1:N])
gpu_sol = vectorized_solve(prob, ps, GPUSimpleTsit5(); dt, debug = false)

gpu_sol = vectorized_asolve(prob, ps, GPUSimpleTsit5(); dt, debug = false)

Could you give me some pointers (documentation or some example code?) on how to write an adaptive version and fix the above errors? Thank you!

maleadt commented 2 years ago

There's probably a type instability (which can also be caused by a typo) in your implementation, so try the suggestion in the error to run under Cthulhu. Alternatively, prefix the invocation with @device_code_warntype interactive=true.

utkarsh530 commented 2 years ago

The saveat functionality is working in the adaptive version. Log:

julia> using DiffEqGPU, SimpleDiffEq, StaticArrays, CUDA

julia> #using Plots

       function loop(u, p, t)
           σ = p[1]; ρ = p[2]; β = p[3]
           du1 = σ*(u[2]-u[1])
           du2 = u[1]*(ρ-u[3]) - u[2]
           du3 = u[1]*u[2] - β*u[3]
           return SVector{3}(du1, du2, du3)
       end
loop (generic function with 1 method)

julia> func = ODEFunction(loop)
(::ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}) (generic function with 7 methods)

julia> u0 = 10ones(Float64,3)
3-element Vector{Float64}:
 10.0
 10.0
 10.0

julia> su0= SVector{3}(u0)
3-element SVector{3, Float64} with indices SOneTo(3):
 10.0
 10.0
 10.0

julia> dt = 1f-1
0.1f0

julia> tspan = (0.0f0, 10.0f0)
(0.0f0, 10.0f0)

julia> prob = ODEProblem{false}(loop, SVector{3}(u0), (0.0f0, 10.0f0),  SA_F32[10, 28, 8/3])
ODEProblem with uType SVector{3, Float64} and tType Float32. In-place: false
timespan: (0.0f0, 10.0f0)
u0: 3-element SVector{3, Float64} with indices SOneTo(3):
 10.0
 10.0
 10.0

julia> # sol2 = solve(prob,GPUSimpleTsit5())

       @info "GPU version"
[ Info: GPU version

julia> N = 10
10

julia> ps = CuArray([@SVector [10.f0,28.f0,8/3f0] for i in 1:10])
10-element CuArray{SVector{3, Float32}, 1, CUDA.Mem.DeviceBuffer}:
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]
 [10.0, 28.0, 2.6666667]

julia> gpu_asol = vectorized_asolve(prob, ps, GPUSimpleATsit5(); dt, saveat = 0.0:0.1:1.0, debug = false)
10-element Vector{ODESolution}:
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[10.0, 10.0, 10.0], [14.939179399612234, 21.21584054774498, 26.228830924283788], [14.51451398040364, 6.097818481131051, 41.68494141740392], [4.573042134088737, -3.6573951745458366, 32.07245155323134], [-0.6628893344605427, -3.5903993329192176, 24.102225759179902], [-2.7273511916210076, -4.468727764250234, 19.097119415749088], [-4.902333774628836, -7.78738353404349, 16.6526400102735], [-8.836125860495535, -13.763465687428345, 19.33237322377555], [-13.305871172982945, -15.76759644231623, 30.61905129256846], [-11.434115073464367, -5.804090592244281, 36.18234142425265], [-5.555995501670298, -0.7991161813121691, 29.668759375139512]]), false, 0, nothing, :Default)
 ODESolution{Float64, 2, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}, Nothing, Nothing, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUSimpleATsit5, SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}, Nothing}(SVector{3, Float64}[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], nothing, nothing, 0.0:0.1:1.0, nothing, ODEProblem{SVector{3, Float64}, Tuple{Float32, Float32}, false, SVector{3, Float32}, ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}(ODEFunction{false, typeof(loop), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing}(loop, LinearAlgebra.UniformScaling{Bool}(true), nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, nothing, SciMLBase.DEFAULT_OBSERVED, nothing), [10.0, 10.0, 10.0], (0.0f0, 10.0f0), Float32[10.0, 28.0, 2.6666667], Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}(), SciMLBase.StandardODEProblem()), GPUSimpleATsit5(), SciMLBase.LinearInterpolation{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}}(0.0:0.1:1.0, SVector{3, Float64}[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]), false, 0, nothing, :Default)

julia> gpu_asol[1]
retcode: Default
Interpolation: ┌ Warning: Performing scalar indexing on task Task (runnable) @0x00007f39bf1482f0.
│ Invocation of getindex resulted in scalar indexing of a GPU array.
│ This is typically caused by calling an iterating implementation of a method.
│ Such implementations *do not* execute on the GPU, but very slowly on the CPU,
│ and therefore are only permitted from the REPL for prototyping purposes.
│ If you did intend to index this array, annotate the caller with @allowscalar.
└ @ GPUArrays ~/.julia/packages/GPUArrays/Zecv7/src/host/indexing.jl:56
1st order linear
t: 0.0:0.1:1.0
u: 11-element CuArray{SVector{3, Float64}, 1, CUDA.Mem.DeviceBuffer}:
 [10.0, 10.0, 10.0]
 [14.939179399612234, 21.21584054774498, 26.228830924283788]
 [14.51451398040364, 6.097818481131051, 41.68494141740392]
 [4.573042134088737, -3.6573951745458366, 32.07245155323134]
 [-0.6628893344605427, -3.5903993329192176, 24.102225759179902]
 [-2.7273511916210076, -4.468727764250234, 19.097119415749088]
 [-4.902333774628836, -7.78738353404349, 16.6526400102735]
 [-8.836125860495535, -13.763465687428345, 19.33237322377555]
 [-13.305871172982945, -15.76759644231623, 30.61905129256846]
 [-11.434115073464367, -5.804090592244281, 36.18234142425265]
 [-5.555995501670298, -0.7991161813121691, 29.668759375139512]
utkarsh530 commented 2 years ago

@maleadt, is there a reason you've wrapped this https://gist.github.com/maleadt/3a848e5743e6001116666f4c3c68a061#file-gpu_ode-jl-L239-L242 instead of directly calling vectorized_solve? I am directly calling vectorized_asolve for solving the ODE. Could you also verify the saveat code that I've written is alright? https://github.com/SciML/DiffEqGPU.jl/pull/148/files#diff-930c7186b2b6826d7462c29f83229942810af75edbc47f5b5e69976c6656db79R316-R325

utkarsh530 commented 2 years ago

MWE:

using DiffEqGPU, OrdinaryDiffEq, StaticArrays
function lorenz(u, p, t)
    σ = p[1]
    ρ = p[2]
    β = p[3]
    du1 = σ * (u[2] - u[1])
    du2 = u[1] * (ρ - u[3]) - u[2]
    du3 = u[1] * u[2] - β * u[3]
    return SVector{3}(du1, du2, du3)
end

u0 = @SVector [1.f0;0.f0;0.f0]
tspan = (0.0f0,100.0f0)
p = @SVector [10.0f0,28.0f0,8/3f0]
prob = ODEProblem{false}(lorenz, u0, tspan, p)
prob_func = (prob,i,repeat) -> remake(prob,p=rand(Float32,3).*p)
monteprob = EnsembleProblem(prob, prob_func = prob_func, safetycopy=false)
sol = solve(monteprob,Tsit5(),EnsembleGPUArrayAutonomous(),trajectories=10,saveat = 0.f0:1.f0:10.f0)
utkarsh530 commented 2 years ago

@ChrisRackauckas any comments on the name EnsembleGPUArrayAutonomous?

ChrisRackauckas commented 2 years ago

EnsembleGPUAutonomous, there's no array-ification here.

utkarsh530 commented 2 years ago

@ChrisRackauckas LGTM. We can merge it and make new PRs for improvements.

ChrisRackauckas commented 2 years ago

Can you get tests passing first? Might need @vchuravy to help out. Something broke in KernelAbstractions and I don't know when/

utkarsh530 commented 2 years ago

@YingboMa https://buildkite.com/julialang/diffeqgpu-dot-jl/builds/160#57fb53d6-5115-42d6-af47-65e6c0067165/553-966

vchuravy commented 2 years ago

Something broke in KernelAbstractions and I don't know when

I would need a small MWE to dig in, I don't currently have the time to dive into everything here

ChrisRackauckas commented 2 years ago

Same failure in a different part of the code. Looks like CUDA.jl or KernelAbstractions.jl has issues with the gpuci drivers or something? We can follow up in another issue on this.