SciML / DiffEqGPU.jl

GPU-acceleration routines for DifferentialEquations.jl and the broader SciML scientific machine learning ecosystem
https://docs.sciml.ai/DiffEqGPU/stable/
MIT License
279 stars 29 forks source link

Ensemble simulation v1.18 wrong solution for adaptive=false & save_everystep=false #177

Closed TheStarAlight closed 1 year ago

TheStarAlight commented 2 years ago

Hi, I just have another problem in v1.18's ensemble simulation. When setting adaptive=false & save_everystep=false, solver would return some strange solutions, just like below (I set identical time span for all probs (0.,10.)):

julia> sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(0.), trajectories=trajNum, adaptive=false, dt=0.1, save_everystep=false)
EnsembleSolution Solution of length 10000 with uType:
ODESolution{Float64, 2, SubArray{SVector{3, Float64}, 1, Matrix{SVector{3, Float64}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Nothing, Nothing, SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUTsit5, SciMLBase.LinearInterpolation{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, SubArray{SVector{3, Float64}, 1, Matrix{SVector{3, Float64}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}, Nothing}

julia> sol.u[1]
retcode: Default
Interpolation: 1st order linear
t: 2-element view(::Matrix{Float64}, :, 1) with eltype Float64:
 10.0
  0.1
u: 2-element view(::Matrix{SVector{3, Float64}}, :, 1) with eltype SVector{3, Float64}:
 [0.0, 10.0, 10.0]
 [1.3065276180109848, 2.5117399641211047, 0.09985596250326349]

julia> sol.u[2]
retcode: Default
Interpolation: 1st order linear
t: 2-element view(::Matrix{Float64}, :, 2) with eltype Float64:
 2.5117399641211047
 0.1
u: 2-element view(::Matrix{SVector{3, Float64}}, :, 2) with eltype SVector{3, Float64}:
 [1.0, 0.0, 0.0]
 [1.3065276180109848, 2.5117399641211047, 0.09985596250326349]

julia> sol.u[3]
retcode: Default
Interpolation: 1st order linear
t: 2-element view(::Matrix{Float64}, :, 3) with eltype Float64:
 1.0
 0.1
u: 2-element view(::Matrix{SVector{3, Float64}}, :, 3) with eltype SVector{3, Float64}:
 [28.0, 2.6666666666666665, 3.556645e-318]
 [1.3065276180109848, 2.5117399641211047, 0.09985596250326349]

the output should contain initial state and final state (i.e., solutions at times 0.0 and 10.0), however it just gives such strange output. By the way, I even found the weirdest solution I've ever seen, just hard to explain how it can be produced:

julia> sol.u[4]
retcode: Default
Interpolation: 1st order linear
t: 2-element view(::Matrix{Float64}, :, 4) with eltype Float64:
 -1504.6714850326832
    10.0
u: 2-element view(::Matrix{SVector{3, Float64}}, :, 4) with eltype SVector{3, Float64}:
 [2.0211921684857396e16, -1.935180723694679e16, 6.308394240986915e15]
 [-5.691759766696827, -5.432312220413224, 24.120673587059013]

What's more, even for adaptive=false without assigning save_everystep this error happens [[sometimes]]:

julia> @profile sol = solve(monteprob, GPUTsit5(), EnsembleGPUKernel(0.), trajectories=trajNum, adaptive=false, dt=0.1)
EnsembleSolution Solution of length 10000 with uType:
ODESolution{Float64, 2, SubArray{SVector{3, Float64}, 1, Matrix{SVector{3, Float64}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Nothing, Nothing, SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, Nothing, ODEProblem{SVector{3, Float64}, Tuple{Float64, Float64}, false, SVector{3, Float64}, ODEFunction{false, typeof(lorenz), LinearAlgebra.UniformScaling{Bool}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, typeof(SciMLBase.DEFAULT_OBSERVED), Nothing, Nothing}, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, SciMLBase.StandardODEProblem}, GPUTsit5, SciMLBase.LinearInterpolation{SubArray{Float64, 1, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}, SubArray{SVector{3, Float64}, 1, Matrix{SVector{3, Float64}}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}}, Nothing}

julia> sol.u[1]
retcode: Default
Interpolation: 1st order linear
t: 101-element view(::Matrix{Float64}, :, 1) with eltype Float64:
 -45.80512921554731
   0.1
   0.2
   0.30000000000000004
   0.4
   0.5
   0.6
   0.7
   0.7999999999999999
   0.8999999999999999
   0.9999999999999999
   1.0999999999999999
   1.2
   1.3
   1.4000000000000001
   1.5000000000000002
   1.6000000000000003
   1.7000000000000004
   1.8000000000000005
   1.9000000000000006
   2.0000000000000004
   2.1000000000000005
   2.2000000000000006
   2.3000000000000007
   ⋮
   7.699999999999989
   7.799999999999988
   7.899999999999988
   7.999999999999988
   8.099999999999987
   8.199999999999987
   8.299999999999986
   8.399999999999986
   8.499999999999986
   8.599999999999985
   8.699999999999985
   8.799999999999985
   8.899999999999984
   8.999999999999984
   9.099999999999984
   9.199999999999983
   9.299999999999983
   9.399999999999983
   9.499999999999982
   9.599999999999982
   9.699999999999982
   9.799999999999981
   9.89999999999998
   9.99999999999998
u: 101-element view(::Matrix{SVector{3, Float64}}, :, 1) with eltype SVector{3, Float64}:
 [2.5117399641211047, 0.09985596250326349, 3.556645e-318]
 [1.3065276180109848, 2.5117399641211047, 0.09985596250326349]
 [3.9277308322134736, 8.436378057976949, 1.2596608575502182]
 [11.719467420742916, 23.060630974948815, 11.765555028930041]
 [19.8843421330782, 16.717058253417502, 45.745446838231494]
 [8.055831413611555, -7.4796010827414445, 35.63895832609782]
 [-2.1322771866394286, -7.927149020750271, 26.004663871096177]
 [-5.980220288702999, -8.827667063748212, 23.09239801180751]
 [-8.656915699428795, -11.22221007103295, 24.277895519432743]
 [-10.621367967764579, -11.563023935984662, 28.723359780988787]
 [-10.0369432711367, -8.056914002880497, 31.215532023048763]
 [-7.615204405030579, -5.30004461144178, 28.869039383497128]
 [-6.074138909885838, -5.4229998505608785, 25.13472773154352]
 [-6.313489095772495, -7.3951888486058515, 22.620466946040654]
 [-8.088516305998604, -10.415869756848132, 22.96292686281406]
 [-10.384196214852349, -12.169509025266597, 27.12979191922242]
 [-10.71641794865665, -9.399022294271472, 31.30544311868878]
 [-8.361846636307858, -5.5912782884652765, 30.128516807412865]
 [-6.194266388454561, -4.880310070678672, 26.18583456787668]
 [-5.844879011578812, -6.423355795628572, 22.903005986739835]
 [-7.241805313650549, -9.375727125180092, 22.027009016611746]
 [-9.745884453013836, -12.249409219862414, 25.16755350449017]
 [-11.168828373156988, -10.99853269864599, 30.623379506301994]
 [-9.350401859154674, -6.394252538754574, 31.39060824374207]
 ⋮
 [-9.60705605985231, -0.8674171555483453, 36.45591313650752]
 [-2.7524912495716842, 1.6491107122244468, 27.502262252028004]
 [-0.15924292133816034, 1.172712047920806, 20.91390176636092]
 [0.7149864846359443, 1.408812368735083, 16.057518245699676]
 [1.5755176922474494, 2.766559446642311, 12.504578803357445]
 [3.473283643758078, 6.343830274337464, 10.557232224633223]
 [7.970102402073097, 14.364333152253126, 13.247092261576901]
 [15.110528743866308, 20.55360789631085, 29.91963267126595]
 [13.662794786175859, 5.317906902793097, 40.753393714670786]
 [4.535952515995951, -2.417297667427647, 30.96489276664308]
 [0.15076589019660916, -2.2300594323230807, 23.30949455297306]
 [-1.4131195487786126, -2.617035178210637, 18.0122989774559]
 [-2.850342022330117, -4.779613589700929, 14.476883687823811]
 [-5.7707180012101364, -9.954303777233548, 13.830658519036021]
 [-11.351242515277047, -17.7623391546475, 21.42730836873687]
 [-14.900393478376847, -13.233308080303212, 37.21279481056003]
 [-8.682447237965514, -0.2662593961452952, 35.207869321475485]
 [-2.427118956833078, 1.4134861896740993, 26.554367518986062]
 [-0.21214681160444415, 0.8946872798239076, 20.22624551109806]
 [0.4916248350871866, 1.0276018480579419, 15.50860913138517]
 [1.1410080713254922, 2.033810537065679, 11.986115745405414]
 [2.575101187490827, 4.7689228987997065, 9.71717506634322]
 [6.152913728341651, 11.506398285558515, 10.490818329320232]
 [13.378320532370639, 21.314700967894645, 22.974030342436826]

Could it be the problem of my laptop's GPU? But my GPU hasn't gone through any "hard labor" since I bought the laptop... I will try the program on other people's workstation and report if I have some discovery.

ChrisRackauckas commented 1 year ago

This was a pre-release version of EnsembleGPUKernel. It should all be cleaned up now.