SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License

Simple SDE only works with one trajectory in ensemble #140

Status: Open

emprice commented 2 years ago

Based on the DiffEq documentation, I've written this minimal example. If I set trajectories = 1, it compiles properly and everything is fine. But when trajectories = 32, say, I get errors relating to calling overdub from within the kernel. I'm very much a Julia noob, but this seems weird to me. The CUDA debugging settings aren't really helping shed any light on the situation. I've been working on this all day and can't seem to make forward progress. Hoping to get some help here!

Example code:

using CUDA
using Plots
using DifferentialEquations
using DiffEqGPU

n = 16

function μ_cox(du, u, p, t)
    du .= -1.0f0 * u
    return nothing
end

function σ_cox(du, u, p, t)
    du .= 0.1f0 .* u .^ 2
    return nothing
end

params = CUDA.zeros(Float32, 1)
initial = CUDA.ones(Float32, n)
prob = SDEProblem(μ_cox, σ_cox, initial, (0.0f0, 1.0f1), p = params)

ensembleprob = EnsembleProblem(prob, safetycopy = false)
sol = solve(ensembleprob, Tsit5(), EnsembleGPUArray(), trajectories = 32)

Example error:

InvalidIRError: compiling kernel gpu_gpu_kernel(Cassette.Context{nametype(CUDACtx), KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, Nothing, KernelAbstractions.var"##PassType#257", Nothing, Cassette.DisableHooks}, typeof(DiffEqGPU.gpu_gpu_kernel), typeof(μ_cox), CuDeviceMatrix{Float32, 1}, CuDeviceMatrix{Float32, 1}, CuDeviceMatrix{SciMLBase.NullParameters, 1}, Float32) resulted in invalid LLVM IR
Reason: unsupported call to the Julia runtime (call to jl_f_tuple)
Stacktrace:
 [1] overdub
   @ ~/.julia/packages/Cassette/r4kKQ/src/overdub.jl:635
 [2] throw_inexacterror(::Symbol, ::Type{UInt64}, ::Int64)
   @ ./boot.jl:602
 [3] overdub
   @ ./boot.jl:602
 [4] overdub
   @ ~/.julia/packages/Cassette/r4kKQ/src/overdub.jl:0
 [5] multiple call sites
   @ unknown:0
ChrisRackauckas commented 2 years ago

KernelAbstractions.jl cannot generate kernels with broadcast expressions in them.
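Given that constraint, one way to make the example kernel-compatible is to replace the broadcast assignments with explicit scalar loops. This is a sketch of that workaround, not code from the thread; the loop rewrite is my assumption of what "no broadcast expressions" implies here:

```julia
# Kernel-safe drift: per-element loop instead of `du .= -1.0f0 * u`.
function μ_cox(du, u, p, t)
    @inbounds for i in eachindex(du)
        du[i] = -1.0f0 * u[i]
    end
    return nothing
end

# Kernel-safe diffusion: per-element loop instead of `du .= 0.1f0 .* u .^ 2`.
function σ_cox(du, u, p, t)
    @inbounds for i in eachindex(du)
        du[i] = 0.1f0 * u[i]^2
    end
    return nothing
end
```

Scalar indexing like this is what EnsembleGPUArray's generated kernels can handle, since each trajectory's state is processed element by element inside the kernel rather than through a fused broadcast.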

emprice commented 2 years ago

You might need to help me a little here, @ChrisRackauckas. du and u are the same size, so there shouldn't be any broadcasting (from what I usually think of as broadcasting, anyway). And how is that related to the number of trajectories?

ChrisRackauckas commented 2 years ago

At 1 trajectory it has a fallback to just serial.

du and u are the same size, so there shouldn't be any broadcasting (from what I usually think of as broadcasting, anyway)

Yes. https://github.com/SciML/DiffEqGPU.jl/issues/71
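Putting the pieces of this thread together, a hedged sketch of the corrected example might look like the following. Assumptions beyond what the thread states: the initial condition is a plain CPU `Array` (EnsembleGPUArray manages device transfer for the batched trajectories), `p` is passed positionally to `SDEProblem`, and `SOSRI()` is used in place of `Tsit5()` since the problem is an SDE. This is untested GPU code, not a verified fix:

```julia
using DifferentialEquations
using DiffEqGPU

n = 16

# Loop-based (broadcast-free) drift and diffusion, per the kernel restriction
# discussed above.
function μ_cox(du, u, p, t)
    @inbounds for i in eachindex(du)
        du[i] = -1.0f0 * u[i]
    end
    return nothing
end

function σ_cox(du, u, p, t)
    @inbounds for i in eachindex(du)
        du[i] = 0.1f0 * u[i]^2
    end
    return nothing
end

params = zeros(Float32, 1)
initial = ones(Float32, n)  # CPU array; EnsembleGPUArray handles batching
prob = SDEProblem(μ_cox, σ_cox, initial, (0.0f0, 1.0f1), params)

ensembleprob = EnsembleProblem(prob, safetycopy = false)
sol = solve(ensembleprob, SOSRI(), EnsembleGPUArray(), trajectories = 32)
```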