SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License

SDEs with non-diagonal noise cause EnsembleGPU/CPUArray to throw an error or return an incorrect solution #331

Open henhen724 opened 2 months ago

henhen724 commented 2 months ago

Describe the bug 🐞

Expected behavior The expected behavior is for the solver to finish without throwing an error and return an accurate solution.

Minimal Reproducible Example 👇

using DifferentialEquations, DiffEqGPU, SparseArrays

function lorenz(du, u, p, t)
    du[1] = p[1] * (u[2] - u[1])
    du[2] = u[1] * (p[2] - u[3]) - u[2]
    du[3] = u[1] * u[2] - p[3] * u[3]
    du[4] = 0
end

function multiplicative_noise(du, u, p, t)
    du[1, 1] = 0.1
    du[2, 2] = 0.4
    du[4, 1] = 1.0
end

NRate = spzeros(4, 2)
NRate[1, 1] = 1
NRate[4, 1] = 1
NRate[2, 2] = 1

u0 = ComplexF32[1.0; 0.0; 0.0; 0.0]
tspan = (0.0f0, 10.0f0)
p = (10.0f0, 28.0f0, 8 / 3.0f0)
prob = SDEProblem(lorenz, multiplicative_noise, u0, tspan, p, noise_rate_prototype=NRate)

prob_func = (prob, i, repeat) -> remake(prob, p=p)
monteprob = EnsembleProblem(prob, prob_func=prob_func)

sol = solve(monteprob, SRA1(), EnsembleCPUArray(), trajectories=10_000, saveat=1.0f0)

Error & Stacktrace ⚠️

ERROR: LoadError: BoundsError: attempt to access 4-element view(::Matrix{ComplexF32}, :, 1) with eltype ComplexF32 at index [2, 2]
Stacktrace:
  [1] throw_boundserror(A::SubArray{ComplexF32, 1, Matrix{ComplexF32}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, I::Tuple{Int64, Int64})
    @ Base .\abstractarray.jl:737
  [2] checkbounds
    @ .\abstractarray.jl:702 [inlined]
  [3] _setindex!
    @ .\abstractarray.jl:1418 [inlined]
  [4] setindex!
    @ .\abstractarray.jl:1396 [inlined]
  [5] multiplicative_noise
    @ Z:\Users\hshunt\LabNotebooks\DickeModel\ArraySolveTesting.jl:13 [inlined]
  [6] macro expansion
    @ C:\Users\henhen724\.julia\packages\DiffEqGPU\I999k\src\ensemblegpuarray\kernels.jl:45 [inlined]
  [7] cpu_gpu_kernel
    @ C:\Users\henhen724\.julia\packages\KernelAbstractions\MAxUm\src\macros.jl:287 [inlined]
  [8] cpu_gpu_kernel(__ctx__::KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, CartesianIndex{1}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, f::typeof(multiplicative_noise), du::Matrix{ComplexF32}, u::Matrix{ComplexF32}, p::Matrix{Tuple{Float32, Float32, Float32}}, t::Float32)
    @ DiffEqGPU .\none:0
  [9] __thread_run(tid::Int64, len::Int64, rem::Int64, obj::KernelAbstractions.Kernel{KernelAbstractions.CPU, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, typeof(DiffEqGPU.cpu_gpu_kernel)}, ndrange::Tuple{Int64}, iterspace::KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}, args::Tuple{typeof(multiplicative_noise), Matrix{ComplexF32}, Matrix{ComplexF32}, Matrix{Tuple{Float32, Float32, Float32}}, Float32}, dynamic::KernelAbstractions.NDIteration.DynamicCheck)
    @ KernelAbstractions C:\Users\henhen724\.julia\packages\KernelAbstractions\MAxUm\src\cpu.jl:117
 [10] __run(obj::KernelAbstractions.Kernel{KernelAbstractions.CPU, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, typeof(DiffEqGPU.cpu_gpu_kernel)}, ndrange::Tuple{Int64}, iterspace::KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}, args::Tuple{typeof(multiplicative_noise), Matrix{ComplexF32}, Matrix{ComplexF32}, Matrix{Tuple{Float32, Float32, Float32}}, Float32}, dynamic::KernelAbstractions.NDIteration.DynamicCheck, static_threads::Bool)
    @ KernelAbstractions C:\Users\henhen724\.julia\packages\KernelAbstractions\MAxUm\src\cpu.jl:84
 [11] (::KernelAbstractions.Kernel{KernelAbstractions.CPU, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, typeof(DiffEqGPU.cpu_gpu_kernel)})(::Function, ::Vararg{Any}; ndrange::Int64, workgroupsize::Int64)
    @ KernelAbstractions C:\Users\henhen724\.julia\packages\KernelAbstractions\MAxUm\src\cpu.jl:46
 [12] Kernel
    @ C:\Users\henhen724\.julia\packages\KernelAbstractions\MAxUm\src\cpu.jl:39 [inlined]
 [13] #21
    @ C:\Users\henhen724\.julia\packages\DiffEqGPU\I999k\src\ensemblegpuarray\problem_generation.jl:85 [inlined]       
 [14] sde_determine_initdt(u0::Matrix{ComplexF32}, t::Float32, tdir::Float32, dtmax::Float32, abstol::Float32, reltol::Float32, internalnorm::typeof(DiffEqGPU.diffeqgpunorm), prob::SDEProblem{Matrix{ComplexF32}, Tuple{Float32, Float32}, true, Matrix{Tuple{Float32, Float32, Float32}}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, @Kwargs{}, Nothing}, order::Rational{Int64}, integrator::StochasticDiffEq.SDEIntegrator{SRA1, true, Matrix{ComplexF32}, ComplexF32, Float32, Float32, Matrix{Tuple{Float32, Float32, Float32}}, Float32, Float32, ComplexF32, NoiseProcess{ComplexF32, 3, Float32, Matrix{ComplexF32}, Matrix{ComplexF32}, Vector{Matrix{ComplexF32}}, typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST), typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE), Nothing, true, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, 
ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, RSWM{Float64}, Nothing, RandomNumbers.Xorshifts.Xoroshiro128Plus}, Nothing, Matrix{ComplexF32}, RODESolution{ComplexF32, 3, Vector{Matrix{ComplexF32}}, Nothing, Nothing, Vector{Float32}, NoiseProcess{ComplexF32, 3, Float32, Matrix{ComplexF32}, Matrix{ComplexF32}, Vector{Matrix{ComplexF32}}, typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST), typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE), Nothing, true, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, RSWM{Float64}, Nothing, RandomNumbers.Xorshifts.Xoroshiro128Plus}, SDEProblem{Matrix{ComplexF32}, Tuple{Float32, Float32}, true, Matrix{Tuple{Float32, Float32, Float32}}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, @Kwargs{}, Nothing}, SRA1, StochasticDiffEq.LinearInterpolationData{Vector{Matrix{ComplexF32}}, Vector{Float32}}, SciMLBase.DEStats, Nothing}, StochasticDiffEq.SRA1Cache{Matrix{ComplexF32}, Matrix{ComplexF32}, Matrix{ComplexF32}, Matrix{ComplexF32}}, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, 
DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, Nothing, StochasticDiffEq.SDEOptions{Float32, Float32, PIController{Float32}, typeof(DiffEqGPU.diffeqgpunorm), Nothing, CallbackSet{Tuple{}, Tuple{}}, typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), DiffEqGPU.var"#114#120", DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, Nothing, Nothing, Int64, Float32, Float32, ComplexF32, Tuple{}, Float32, Tuple{}}, Nothing, ComplexF32, Nothing, Nothing})
    @ StochasticDiffEq C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\initdt.jl:34
 [15] auto_dt_reset!
    @ C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\integrators\integrator_interface.jl:355 [inlined]  
 [16] handle_dt!(integrator::StochasticDiffEq.SDEIntegrator{SRA1, true, Matrix{ComplexF32}, ComplexF32, Float32, Float32, Matrix{Tuple{Float32, Float32, Float32}}, Float32, Float32, ComplexF32, NoiseProcess{ComplexF32, 3, Float32, Matrix{ComplexF32}, Matrix{ComplexF32}, Vector{Matrix{ComplexF32}}, typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST), typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE), Nothing, true, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, RSWM{Float64}, Nothing, RandomNumbers.Xorshifts.Xoroshiro128Plus}, Nothing, Matrix{ComplexF32}, RODESolution{ComplexF32, 3, Vector{Matrix{ComplexF32}}, Nothing, Nothing, Vector{Float32}, NoiseProcess{ComplexF32, 3, Float32, Matrix{ComplexF32}, Matrix{ComplexF32}, Vector{Matrix{ComplexF32}}, typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_DIST), typeof(DiffEqNoiseProcess.INPLACE_WHITE_NOISE_BRIDGE), Nothing, true, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, ResettableStacks.ResettableStack{Tuple{Float32, Matrix{ComplexF32}, Matrix{ComplexF32}}, true}, RSWM{Float64}, Nothing, RandomNumbers.Xorshifts.Xoroshiro128Plus}, SDEProblem{Matrix{ComplexF32}, Tuple{Float32, Float32}, true, Matrix{Tuple{Float32, Float32, Float32}}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, @Kwargs{}, Nothing}, SRA1, StochasticDiffEq.LinearInterpolationData{Vector{Matrix{ComplexF32}}, Vector{Float32}}, 
SciMLBase.DEStats, Nothing}, StochasticDiffEq.SRA1Cache{Matrix{ComplexF32}, Matrix{ComplexF32}, Matrix{ComplexF32}, Matrix{ComplexF32}}, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, 
Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, Nothing, StochasticDiffEq.SDEOptions{Float32, Float32, PIController{Float32}, typeof(DiffEqGPU.diffeqgpunorm), Nothing, CallbackSet{Tuple{}, Tuple{}}, typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), DiffEqGPU.var"#114#120", DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, DataStructures.BinaryHeap{Float32, DataStructures.FasterForward}, Nothing, Nothing, Int64, Float32, Float32, ComplexF32, Tuple{}, Float32, Tuple{}}, Nothing, ComplexF32, Nothing, Nothing})
    @ StochasticDiffEq C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\solve.jl:643
 [17] __init(_prob::SDEProblem{Matrix{ComplexF32}, Tuple{Float32, Float32}, true, Matrix{Tuple{Float32, Float32, Float32}}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, @Kwargs{}, Nothing}, alg::SRA1, timeseries_init::Vector{Any}, ts_init::Vector{Any}, ks_init::Type, recompile::Type{Val{true}}; saveat::Float32, tstops::Tuple{}, d_discontinuities::Tuple{}, save_idxs::Nothing, save_everystep::Bool, 
save_noise::Bool, save_on::Bool, save_start::Bool, save_end::Nothing, callback::Nothing, dense::Bool, calck::Bool, dt::Float32, adaptive::Bool, gamma::Rational{Int64}, abstol::Nothing, reltol::Nothing, qmin::Rational{Int64}, qmax::Rational{Int64}, qsteady_min::Int64, qsteady_max::Int64, beta2::Nothing, beta1::Nothing, qoldinit::Rational{Int64}, controller::Nothing, fullnormalize::Bool, failfactor::Int64, delta::Rational{Int64}, maxiters::Int64, dtmax::Float32, dtmin::Float32, internalnorm::typeof(DiffEqGPU.diffeqgpunorm), isoutofdomain::typeof(DiffEqBase.ODE_DEFAULT_ISOUTOFDOMAIN), unstable_check::DiffEqGPU.var"#114#120", verbose::Bool, force_dtmin::Bool, timeseries_errors::Bool, dense_errors::Bool, advance_to_tstop::Bool, stop_at_next_tstop::Bool, initialize_save::Bool, progress::Bool, progress_steps::Int64, progress_name::String, progress_message::typeof(DiffEqBase.ODE_DEFAULT_PROG_MESSAGE), progress_id::Symbol, userdata::Nothing, initialize_integrator::Bool, seed::UInt64, alias_u0::Bool, alias_jumps::Bool, kwargs::@Kwargs{})
    @ StochasticDiffEq C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\solve.jl:596
 [18] __init (repeats 2 times)
    @ C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\solve.jl:18 [inlined]
 [19] #__solve#107
    @ C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\solve.jl:6 [inlined]
 [20] __solve (repeats 4 times)
    @ C:\Users\henhen724\.julia\packages\StochasticDiffEq\PgPd0\src\solve.jl:1 [inlined]
 [21] solve_call(_prob::SDEProblem{Matrix{ComplexF32}, Tuple{Float32, Float32}, true, Matrix{Tuple{Float32, Float32, Float32}}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, DiffEqGPU.var"#20#25"{typeof(lorenz), typeof(DiffEqGPU.gpu_kernel)}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, DiffEqGPU.var"#21#26"{typeof(multiplicative_noise), typeof(DiffEqGPU.gpu_kernel)}, @Kwargs{}, Nothing}, args::SRA1; merge_callbacks::Bool, kwargshandle::Nothing, kwargs::@Kwargs{adaptive::Bool, unstable_check::DiffEqGPU.var"#114#120", saveat::Float32, callback::Nothing, internalnorm::typeof(DiffEqGPU.diffeqgpunorm)})
    @ DiffEqBase C:\Users\henhen724\.julia\packages\DiffEqBase\c8MAQ\src\solve.jl:612
 [22] solve_call
    @ C:\Users\henhen724\.julia\packages\DiffEqBase\c8MAQ\src\solve.jl:569 [inlined]
 [23] #solve_up#53
    @ C:\Users\henhen724\.julia\packages\DiffEqBase\c8MAQ\src\solve.jl:1080 [inlined]
 [24] solve_up
    @ C:\Users\henhen724\.julia\packages\DiffEqBase\c8MAQ\src\solve.jl:1066 [inlined]
 [25] #solve#51
    @ C:\Users\henhen724\.julia\packages\DiffEqBase\c8MAQ\src\solve.jl:1003 [inlined]
 [26] batch_solve_up(ensembleprob::EnsembleProblem{SDEProblem{Vector{ComplexF32}, Tuple{Float32, Float32}, true, Tuple{Float32, Float32, Float32}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, typeof(lorenz), typeof(multiplicative_noise), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, typeof(multiplicative_noise), @Kwargs{}, SparseMatrixCSC{Float64, Int64}}, var"#3#4", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, probs::Vector{SDEProblem{Vector{ComplexF32}, Tuple{Float32, Float32}, true, Tuple{Float32, Float32, Float32}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, typeof(lorenz), typeof(multiplicative_noise), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, 
typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, typeof(multiplicative_noise), @Kwargs{}, SparseMatrixCSC{Float64, Int64}}}, alg::SRA1, ensemblealg::EnsembleCPUArray, I::UnitRange{Int64}, u0::Matrix{ComplexF32}, p::Matrix{Tuple{Float32, Float32, Float32}}; kwargs::@Kwargs{adaptive::Bool, unstable_check::DiffEqGPU.var"#114#120", saveat::Float32})    
    @ DiffEqGPU C:\Users\henhen724\.julia\packages\DiffEqGPU\I999k\src\solve.jl:315
 [27] batch_solve(ensembleprob::EnsembleProblem{SDEProblem{Vector{ComplexF32}, Tuple{Float32, Float32}, true, Tuple{Float32, Float32, Float32}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, typeof(lorenz), typeof(multiplicative_noise), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, typeof(multiplicative_noise), @Kwargs{}, SparseMatrixCSC{Float64, Int64}}, var"#3#4", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), 
Nothing}, alg::SRA1, ensemblealg::EnsembleCPUArray, I::UnitRange{Int64}, adaptive::Bool; kwargs::@Kwargs{unstable_check::DiffEqGPU.var"#114#120", saveat::Float32})
    @ DiffEqGPU C:\Users\henhen724\.julia\packages\DiffEqGPU\I999k\src\solve.jl:242
 [28] macro expansion
    @ .\timing.jl:395 [inlined]
 [29] __solve(ensembleprob::EnsembleProblem{SDEProblem{Vector{ComplexF32}, Tuple{Float32, Float32}, true, Tuple{Float32, Float32, Float32}, Nothing, SDEFunction{true, SciMLBase.FullSpecialize, typeof(lorenz), typeof(multiplicative_noise), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, typeof(multiplicative_noise), @Kwargs{}, SparseMatrixCSC{Float64, Int64}}, var"#3#4", typeof(SciMLBase.DEFAULT_OUTPUT_FUNC), typeof(SciMLBase.DEFAULT_REDUCTION), Nothing}, alg::SRA1, ensemblealg::EnsembleCPUArray; trajectories::Int64, batch_size::Int64, unstable_check::Function, adaptive::Bool, kwargs::@Kwargs{saveat::Float32})
    @ DiffEqGPU C:\Users\henhen724\.julia\packages\DiffEqGPU\I999k\src\solve.jl:55
 [30] __solve
    @ C:\Users\henhen724\.julia\packages\DiffEqGPU\I999k\src\solve.jl:1 [inlined]
 [31] #solve#55
    @ C:\Users\henhen724\.julia\packages\DiffEqBase\c8MAQ\src\solve.jl:1096 [inlined]
 [32] top-level scope
    @ Z:\Users\hshunt\LabNotebooks\DickeModel\ArraySolveTesting.jl:32
in expression starting at Z:\Users\hshunt\LabNotebooks\DickeModel\ArraySolveTesting.jl:32

Environment (please complete the following information):

julia> using Pkg; Pkg.status()
Status `\\levlabserver2.stanford.edu\commondrive\Users\hshunt\LabNotebooks\BugEnv\Project.toml`
  [071ae1c0] DiffEqGPU v3.4.1
  [0c46a032] DifferentialEquations v7.13.0
  [2f01184e] SparseArrays v1.10.0
julia> using Pkg; Pkg.status(; mode = PKGMODE_MANIFEST)
Status `\\levlabserver2.stanford.edu\commondrive\Users\hshunt\LabNotebooks\BugEnv\Manifest.toml`
  [47edcb42] ADTypes v1.6.1
⌃ [7d9f7c33] Accessors v0.1.36
  [79e6a3ab] Adapt v4.0.4
  [66dad0bd] AliasTables v1.1.3
  [ec485272] ArnoldiMethod v0.4.0
  [4fba245c] ArrayInterface v7.12.0
  [4c555306] ArrayLayouts v1.10.2
  [a9b6321e] Atomix v0.1.0
  [aae01518] BandedMatrices v1.7.2
  [62783981] BitTwiddlingConvenienceFunctions v0.1.6
  [764a87c0] BoundaryValueDiffEq v5.9.0
  [fa961155] CEnum v0.5.0
  [2a0fbf3d] CPUSummary v0.2.6
  [49dc2e85] Calculus v0.5.1
  [d360d2e6] ChainRulesCore v1.24.0
  [fb6a15b2] CloseOpenIntervals v0.1.13
  [38540f10] CommonSolve v0.2.4
  [bbf7d656] CommonSubexpressions v0.3.0
  [f70d9fcc] CommonWorldInvalidations v1.0.0
  [34da2185] Compat v4.15.0
  [a33af91c] CompositionsBase v0.1.2
  [2569d6c7] ConcreteStructs v0.2.3
⌃ [187b0558] ConstructionBase v1.5.5
  [adafc99b] CpuId v0.3.1
  [9a962f9c] DataAPI v1.16.0
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [bcd4f6db] DelayDiffEq v5.47.3
  [2b5f629d] DiffEqBase v6.151.5
  [459566f4] DiffEqCallbacks v3.6.2
  [071ae1c0] DiffEqGPU v3.4.1
  [77a26b50] DiffEqNoiseProcess v5.22.0
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
  [0c46a032] DifferentialEquations v7.13.0
  [a0c0ee7d] DifferentiationInterface v0.5.9
  [b4f34e82] Distances v0.10.11
  [31c24e10] Distributions v0.25.109
  [ffbed154] DocStringExtensions v0.9.3
  [fa6b7ba4] DualNumbers v0.6.8
  [4e289a0a] EnumX v1.0.4
  [f151be2c] EnzymeCore v0.7.7
  [d4d017d3] ExponentialUtilities v1.26.1
  [e2ba6199] ExprTools v0.1.10
  [9d29842c] FastAlmostBandedMatrices v0.1.3
⌃ [7034ab61] FastBroadcast v0.3.4
  [9aa1b823] FastClosures v0.3.2
  [29a986be] FastLapackInterface v2.0.4
  [1a297f60] FillArrays v1.11.0
  [6a86dc24] FiniteDiff v2.23.1
  [f6369f11] ForwardDiff v0.10.36
  [069b7b12] FunctionWrappers v1.1.3
  [77dc65aa] FunctionWrappersWrappers v0.1.3
  [d9f16b24] Functors v0.4.11
  [46192b85] GPUArraysCore v0.1.6
  [c145ed77] GenericSchur v0.5.4
  [86223c79] Graphs v1.11.2
  [3e5b6fbb] HostCPUFeatures v0.1.17
  [34004b35] HypergeometricFunctions v0.3.23
  [615f187c] IfElse v0.1.1
  [d25df0c9] Inflate v0.1.5
  [3587e190] InverseFunctions v0.1.15
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [692b3bcd] JLLWrappers v1.5.0
  [ccbc3e58] JumpProcesses v9.11.1
  [ef3ab10e] KLU v0.6.0
  [63c18a36] KernelAbstractions v0.9.22
  [ba0b0d4f] Krylov v0.9.6
  [929cbde3] LLVM v8.0.0
  [10f19ff3] LayoutPointers v0.1.17
  [5078a376] LazyArrays v2.1.9
  [2d8b4e74] LevyArea v1.0.0
  [d3d80556] LineSearches v7.2.0
  [7ed4a6bd] LinearSolve v2.30.2
  [2ab3a3ac] LogExpFunctions v0.3.28
  [bdcacae8] LoopVectorization v0.12.171
  [1914dd2f] MacroTools v0.5.13
  [d125e4d3] ManualMemory v0.1.8
  [a3b82374] MatrixFactorizations v3.0.0
  [bb5d69b7] MaybeInplace v0.1.3
  [e1d29d7a] Missings v1.2.0
  [46d2c3a1] MuladdMacro v0.2.4
  [d41bc354] NLSolversBase v7.8.3
  [2774e3e8] NLsolve v4.5.1
  [77ba4419] NaNMath v1.0.2
  [8913a72c] NonlinearSolve v3.13.1
  [6fe1bfb0] OffsetArrays v1.14.1
  [429524aa] Optim v1.9.4
  [bac558e1] OrderedCollections v1.6.3
  [1dea7af3] OrdinaryDiffEq v6.86.0
  [90014a1f] PDMats v0.11.31
  [65ce6f38] PackageExtensionCompat v1.0.2
  [d96e819e] Parameters v0.12.3
  [e409e4f3] PoissonRandom v0.4.4
  [f517fe37] Polyester v0.7.15
  [1d0040c9] PolyesterWeave v0.2.2
  [85a6dd25] PositiveFactorizations v0.2.4
  [d236fae5] PreallocationTools v0.4.22
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [43287f4e] PtrArrays v1.2.0
  [1fd47b50] QuadGK v2.9.4
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.5.3
  [3cdcf5f2] RecipesBase v1.3.4
  [731186ca] RecursiveArrayTools v3.26.0
  [f2c3362d] RecursiveFactorization v0.2.23
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.0
  [ae5879a3] ResettableStacks v1.1.1
  [79098fc4] Rmath v0.7.1
  [7e49a35a] RuntimeGeneratedFunctions v0.5.13
  [94e857df] SIMDTypes v0.1.0
  [476501e8] SLEEFPirates v0.6.43
⌃ [0bca4576] SciMLBase v2.43.1
  [c0aeaf25] SciMLOperators v0.3.8
  [53ae85a6] SciMLStructures v1.4.1
  [efcf1570] Setfield v1.1.1
  [05bca326] SimpleDiffEq v1.11.1
⌃ [727e6d20] SimpleNonlinearSolve v1.10.1
  [699a6c99] SimpleTraits v0.9.4
  [ce78b400] SimpleUnPack v1.1.0
  [a2af1166] SortingAlgorithms v1.2.1
  [47a9eef4] SparseDiffTools v2.19.0
  [0a514795] SparseMatrixColorings v0.3.5
  [e56a9233] Sparspak v0.3.9
  [276daf66] SpecialFunctions v2.4.0
  [aedffcd0] Static v1.1.1
  [0d7ed370] StaticArrayInterface v1.5.1
  [90137ffa] StaticArrays v1.9.7
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [4c63d2b9] StatsFuns v1.3.1
  [9672c7b4] SteadyStateDiffEq v2.2.0
  [789caeaf] StochasticDiffEq v6.66.0
  [7792a7ef] StrideArraysCore v0.5.7
  [c3572dad] Sundials v4.24.0
  [2efcf032] SymbolicIndexingInterface v0.3.26
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [8290d209] ThreadingUtilities v0.5.2
  [a759f4b9] TimerOutputs v0.5.24
  [d5829a12] TriangularSolve v0.2.1
  [410a4b4d] Tricks v0.1.8
  [781d530d] TruncatedStacktraces v1.4.0
  [3a884ed6] UnPack v1.0.2
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.1.5
  [3d5dd08c] VectorizationBase v0.21.70
  [19fa3120] VertexSafeGraphs v0.2.0
  [700de1a5] ZygoteRules v0.2.5
  [1d5cc7b8] IntelOpenMP_jll v2024.2.0+0
  [dad2f222] LLVMExtra_jll v0.0.30+0
  [856f044c] MKL_jll v2024.2.0+0
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [f50d1b31] Rmath_jll v0.4.2+0
⌅ [fb77eaff] Sundials_jll v5.2.2+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [1a1011a3] SharedArrays
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.0+0
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.8.0+1
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`
julia> versioninfo()
Julia Version 1.10.2
Commit bd47eca2c8 (2024-03-01 10:14 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

Additional context This error is caused by the assumption in src/ensemblegpuarray/kernels.jl that the time series for du can be written as a matrix (one index for the ODE coordinate and one for time). A problem with non-diagonal noise needs two coordinate indices plus a time index, so the du time series must either be a three-index tensor or go through a flatten-and-resize adaptor when the noise function is evaluated.
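The stack trace above shows multiplicative_noise being called on `view(::Matrix{ComplexF32}, :, 1)`, a 4-element column, which is why the write at index [2, 2] fails. The following is a minimal sketch of that failure and of one possible flatten-and-reshape adaptor. It is not DiffEqGPU.jl code: the sizes `n`, `k`, `m`, the name `noise!`, and the `du_flat` layout are assumptions made for illustration only.

```julia
n, k, m = 4, 2, 3  # states, noise channels, trajectories (illustrative sizes)

# Same writes as multiplicative_noise in the example above.
function noise!(du, u, p, t)
    du[1, 1] = 0.1
    du[2, 2] = 0.4
    du[4, 1] = 1.0
    return nothing
end

# Matrix layout: the noise function is handed a single 4-element column.
du = zeros(ComplexF32, n, m)
threw = try
    noise!(view(du, :, 1), nothing, nothing, 0.0f0)
    false
catch err
    # du[1, 1] succeeds (a trailing index of 1 is allowed); du[2, 2] does not.
    err isa BoundsError
end

# Hypothetical adaptor: store each n×k noise matrix flattened into a column of
# length n*k, and reshape the column view before calling the noise function.
du_flat = zeros(ComplexF32, n * k, m)
for i in 1:m
    noise!(reshape(view(du_flat, :, i), n, k), nothing, nothing, 0.0f0)
end
```

The reshaped view shares memory with `du_flat`, so the noise function writes straight into the flattened storage. Whether the rest of the solver can consume that layout is a separate question.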

henhen724 commented 2 months ago

I made an attempt to add support for non-diagonal noise in a fork (https://github.com/henhen724/DiffEqGPU.jl/tree/non_diagonal_array_ensembles). Ultimately, I concluded that it would require refactoring other packages, namely StochasticDiffEq.jl, which seems like too large a change for a pull request. Instead, I'll leave what I learned here.

The problem with implementing non-diagonal noise in ArrayEnsembles is that the solve function expects to matrix-multiply the noise term by a vector of Wiener increments and add the result to the solution, but in ArrayEnsembles the solution is a matrix, not a vector. Here is a link to an arbitrarily chosen SDE algorithm: https://github.com/SciML/StochasticDiffEq.jl/blob/0c03d8b6378f133a1f819ebc2be8f0bee5d69f06/src/perform_step/sra.jl#L144

integrator.g(g1,uprev,p,t+c11*dt)
integrator.f(k1,uprev,p,t)

if is_diagonal_noise(integrator.sol.prob)
    @.. H01 = uprev + dt*a21*k1 + chi2*b21*g1
else
    mul!(E₁,g1,chi2)
    @.. H01 = uprev + dt*a21*k1 + b21*E₁
end

integrator.g is the noise function. Its output is stored in g1, which is then matrix-multiplied by chi2, the vector of Wiener increments. This means the term b21*E₁ is a vector, which of course makes sense for a single copy of the problem.
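As a shape check for the single-trajectory case, here is a small sketch. The values in `g1` mirror the noise function from the example, while `chi2` is made-up data and `E1` stands in for E₁. It only illustrates the dimensions and is not taken from StochasticDiffEq.jl.

```julia
using LinearAlgebra

n, k = 4, 2                      # states and noise channels
g1 = zeros(Float32, n, k)        # noise matrix, as filled by the noise function
g1[1, 1] = 0.1f0
g1[2, 2] = 0.4f0
g1[4, 1] = 1.0f0

chi2 = Float32[0.5, -2.0]        # one Wiener increment per noise channel (made up)
E1 = zeros(Float32, n)           # result buffer: one entry per state

mul!(E1, g1, chi2)               # (n×k) * (k) -> (n), so b21*E1 has the shape of uprev
```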

The problem is that H01 and k1, which would both be vectors for an individual problem, are n×m matrices in ArrayEnsembles, where n is the dimension of the underlying problem and m is the number of trajectories.
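To make the shape mismatch concrete, here is a sketch of what the ensemble version would have to compute: one matrix-vector product per trajectory. The 3-index array `G`, the increment matrix `dW`, and the explicit loop are assumptions for illustration. Neither StochasticDiffEq.jl nor DiffEqGPU.jl is claimed to work this way, and on a GPU the loop would need to be a batched kernel.

```julia
using LinearAlgebra

n, k, m = 4, 2, 5                 # states, noise channels, trajectories
G  = randn(Float32, n, k, m)      # one n×k noise matrix per trajectory (hypothetical layout)
dW = randn(Float32, k, m)         # one vector of Wiener increments per trajectory
E  = zeros(Float32, n, m)         # same n×m shape as H01 and k1 in the ensemble

# A single mul!(E, G, dW) is not defined for a 3-index G, so the product
# is applied slice by slice here.
for j in 1:m
    mul!(view(E, :, j), view(G, :, :, j), view(dW, :, j))
end
```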

There is then a second, larger problem: StochasticDiffEq.jl assumes that noise_rate_prototype is a matrix and sizes the g1 matrix handed to integrator.g according to the dimensions of noise_rate_prototype. If noise_rate_prototype is a 3-tensor instead of a matrix, the solve function in StochasticDiffEq.jl throws an error when trying to find the dimension of the Wiener process: https://github.com/SciML/StochasticDiffEq.jl/blob/0c03d8b6378f133a1f819ebc2be8f0bee5d69f06/src/solve.jl#L302

rand_prototype = false .* noise_rate_prototype[1,:]

BoundsError: attempt to access 4×2×10 Array{Float32, 3} at index [1, 1:2]
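The indexing failure can be reproduced in isolation with plain arrays. In this sketch, `proto2` and `proto3` are illustrative stand-ins for a matrix and a 3-tensor noise_rate_prototype, with sizes taken from the error message above.

```julia
# Matrix prototype: row 1 has one entry per noise channel.
proto2 = zeros(Float32, 4, 2)
rand_prototype = false .* proto2[1, :]   # 2-element zero vector

# 3-tensor prototype: two indices are not enough, because the omitted
# third dimension has length 10 rather than 1.
proto3 = zeros(Float32, 4, 2, 10)
threw = try
    false .* proto3[1, :]
    false
catch err
    err isa BoundsError
end
```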

As far as I can see, the solution would be to refactor StochasticDiffEq.jl to allow non-matrix noise_rate_prototypes, add a definition of mul! for 3-tensors to the KernelAbstractions.jl library, and apply the changes in my fork to DiffEqGPU.jl.