SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License
272 stars 27 forks source link

Jacobian with respecpect to paramters fails #295

Open ffuhrer opened 10 months ago

ffuhrer commented 10 months ago

Hi all, calculating derivates wrt to the ode parameters of a gpu ensemble of odes fails when the parameters are a CuArray. Solvign the odes works, also ebverything works on the CPU and if the parameters are "normal" Array. It seems that julia or aone of the packages thinks that the paramters hould a an Array and not a a CuArray. For may aplication I need trhe parameters to be a CuArray, they are the outcome of compute heavy calculation on the GPU.

Anyone an idea how to fix this?

Here, a minmal example: `import Pkg using DiffEqGPU, CUDA, OrdinaryDiffEq, Zygote, SciMLSensitivity

function lorenz!(du, u, p, t) σ = p[1] ρ = p[2] β = p[3] du[1] = σ (u[2] - u[1]) du[2] = u[1] (ρ - u[3]) - u[2] du[3] = u[1] u[2] - β u[3] return du end

u0 = cu([[1.0f0; 0.0f0; 0.0f0] [1.0f0; 0.0f0; 0.0f0]]) tspan = (0.0f0, 10.0f0) p =cu([[10.0f0, 28.0f0, 8 / 3.0f0] [10.0f0, 28.0f0, 8 / 3.0f0]])

prob = ODEProblem(lorenz!, cu([0.0f0; 0.0f0; 0.0f0]), tspan, cu([0.0f0; 0.0f0; 0.0f0]))

function func(params, initialU) prob_func = (prob, i, repeat) -> remake(prob, p=params[:,i], u0=initialU[:, i]) monteprob = EnsembleProblem(prob, prob_func=prob_func, safetycopy=false) solve(monteprob, Tsit5(), EnsembleGPUArray(0.0), trajectories=2, saveat=1.0f0, sensealg=ForwardDiffSensitivity()) end

print(func(p, u0)) #Works

print(Zygote.jacobian(params -> func(params, u0), p)) #fails if p is a CUDA array`

My envioroment, julia 1.9.0:

[052768ef] CUDA v4.4.0 [071ae1c0] DiffEqGPU v2.4.1 [1dea7af3] OrdinaryDiffEq v6.53.4 [1ed8b502] SciMLSensitivity v7.36.0 [e88e6eb3] Zygote v0.6.63

first few lines of the error messages, remiander appended ERROR: LoadError: GPU compilation of MethodInstance for (::GPUArrays.var"#broadcast_kernel#26")(::CUDA.CuKernelContext, ::CuDeviceVector{Float32, 1}, ::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(Zygote.accum), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Extruded{Vector{Float32}, Tuple{Bool}, Tuple{Int64}}}}, ::Int64) failed KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1}, Tuple{Base.OneTo{Int64}}, typeof(Zygote.accum), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Extruded{Vector{Float32}, Tuple{Bool}, Tuple{Int64}}}}, which is not isbits: .args is of type Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}, Base.Broadcast.Extruded{Vector{Float32}, Tuple{Bool}, Tuple{Int64}}} which is not isbits. .2 is of type Base.Broadcast.Extruded{Vector{Float32}, Tuple{Bool}, Tuple{Int64}} which is not isbits. .x is of type Vector{Float32} which is not isbits.

jac_error_gpu.txt

ChrisRackauckas commented 10 months ago

@utkarsh530 can you see if it's hitting this rrule? https://github.com/SciML/DiffEqGPU.jl/blob/v2.4.1/src/DiffEqGPU.jl#L1141-L1205