SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License

GPU locks up at full usage with EnsembleGPUKernel on tspan ensemble study #247

Open derdotte opened 1 year ago

derdotte commented 1 year ago

The following code locks up a 1080 Strix GPU at full usage, as reported by the GPU Tweak III software.

using DiffEqGPU, DifferentialEquations, StaticArrays

function sys_gpu!(u, params, t)
    du1 = params[1] 
    du2 = params[2]
    return SVector{2}(du1,du2)
end 

function plateu_cycle_study_gpu()
    plateu_cycle::Float32 = 8.0f0
    w::Float32 = 0.34888f0
    tstart::Float32 = 0.0f0

    tend::Float32 = 2.0f0pi/w * (plateu_cycle+1.0f0)+1.0f0
    tspan = (tstart, tend) 

    params= @SVector [w, plateu_cycle]
    f0=1.0f0
    g0=1.0f0
    init_cond = SVector{2,Float32}(f0, g0)
    prob = ODEProblem(sys_gpu!,init_cond,tspan, params)

    plateu_cycle_end = 10.0f0
    amount = 1000
    plateu_cycle_study_values = collect(range(zero(Float32), plateu_cycle_end, length=amount))

    new_tend =  @. 2.0f0pi/w * (plateu_cycle_study_values+1.0f0)+1.0f0
    new_tstart = zeros(Float32, size(new_tend))

    function prob_func(prob, i, repeat)
        remake(prob, tspan=(new_tstart[i],new_tend[i]), p=SVector{2}(prob.p[1], plateu_cycle_study_values[i]))
    end

    plateu_cycle_study_problem = EnsembleProblem(prob, prob_func=prob_func)
    @time sim = solve(plateu_cycle_study_problem, GPUTsit5(), EnsembleGPUKernel(0), trajectories=amount)
end
plateu_cycle_study_gpu()

This code runs on the CPU in less than a second with the appropriate changes; on the GPU it locks up and does nothing.
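The report does not show the CPU changes it refers to. A minimal sketch, assuming they amount to replacing GPUTsit5() with Tsit5() and EnsembleGPUKernel(0) with EnsembleThreads(), might look like the following. The names sys_cpu and plateu_cycle_study_cpu are hypothetical and only used for illustration.

```julia
using DifferentialEquations, StaticArrays

# Same constant right-hand side as sys_gpu! in the report (out-of-place form).
sys_cpu(u, params, t) = SVector{2}(params[1], params[2])

function plateu_cycle_study_cpu(; amount = 1000)
    w = 0.34888f0
    plateu_cycle = 8.0f0
    period = 2.0f0 * Float32(pi) / w

    tspan = (0.0f0, period * (plateu_cycle + 1.0f0) + 1.0f0)
    init_cond = SVector{2, Float32}(1.0f0, 1.0f0)
    params = SVector{2, Float32}(w, plateu_cycle)
    prob = ODEProblem(sys_cpu, init_cond, tspan, params)

    # One parameter value and one end time per trajectory, as in the report.
    study_values = collect(range(0.0f0, 10.0f0, length = amount))
    new_tend = @. period * (study_values + 1.0f0) + 1.0f0

    prob_func(prob, i, repeat) = remake(prob;
        tspan = (0.0f0, new_tend[i]),
        p = SVector{2, Float32}(prob.p[1], study_values[i]))

    ensemble = EnsembleProblem(prob, prob_func = prob_func)
    # Assumed CPU substitutions: Tsit5() and EnsembleThreads().
    return solve(ensemble, Tsit5(), EnsembleThreads(), trajectories = amount)
end

sim = plateu_cycle_study_cpu()
```

Because the right-hand side is constant, each trajectory should end at u0 + tend * p, which gives a quick sanity check to compare against once the GPU run completes.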

Separately (and only reproducible if one removes the tspan part of prob_func in the GPU code above), one can also produce a dynamic invocation error by changing these three lines to complex types. I would expect that floats and complex numbers can be operated on together without explicitly making everything complex-typed:

    f0=complex(0.0f0)
    g0=complex(1.0f0)
    init_cond = SVector{2,ComplexF32}(f0, g0)

This bug report was requested here: https://stackoverflow.com/questions/75742366/dynamic-function-invocation-invalidirerror-with-diffeqgpu-ensemblegpukernel

ChrisRackauckas commented 1 year ago

@utkarsh530 have you looked into this one?

utkarsh530 commented 1 year ago

Yes, I looked into this one (https://stackoverflow.com/a/75794225/15476635). I'll need to investigate further why the adaptive version locks up.