SciML / OrdinaryDiffEq.jl

High performance ordinary differential equation (ODE) and differential-algebraic equation (DAE) solvers, including neural ordinary differential equations (neural ODEs) and scientific machine learning (SciML)
https://diffeq.sciml.ai/latest/

LinearExponential() solver broken for CUDA sparse matrices #2523

Open mcarmesin opened 1 week ago

mcarmesin commented 1 week ago

Describe the bug 🐞

The LinearExponential() solver fails if the ODEProblem is built from a MatrixOperator that wraps a CuSparseMatrixCSC.

Expected behavior

The ODE should be solved successfully. Therefore, the solver shouldn't compute the full matrix exponential of the given matrix operator.
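As a rough illustration of that expectation, here is a minimal CPU-only sketch. It assumes that `expv` from ExponentialUtilities.jl is a suitable Krylov routine for this purpose; the seed, the sizes, and the variable names are illustrative only. The dense `exp` call is there purely as a reference to compare against.

```julia
using LinearAlgebra, SparseArrays, Random
using ExponentialUtilities

Random.seed!(1)                  # fixed seed so the run is repeatable
B = sprand(15, 15, 0.2)
A = B * B'                       # sparse, symmetric matrix
b = ones(15)
t = 1.0

# Dense reference: forms the full 15x15 matrix exponential first.
w_dense = exp(t * Matrix(A)) * b

# Krylov approximation of exp(t*A)*b: only needs matrix-vector products with A.
w_krylov = expv(t, A, b)

rel_err = norm(w_krylov - w_dense) / norm(w_dense)
println("relative difference: ", rel_err)
```

For a matrix this small the two results should agree closely; the point is only that the Krylov path never materializes exp(t*A).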

Minimal Reproducible Example 👇

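using SparseArrays
using OrdinaryDiffEq
using SciMLOperators
using CUDA

# Sparse matrix moved to the GPU (becomes a CuSparseMatrixCSC)
A = cu(sprand(15, 15, 0.2))
M = MatrixOperator(A * A')

u0 = cu(ones(15))

prob = ODEProblem(M, u0, (0.0, 1.0))

# Fails with the scalar indexing error shown below
solve(prob, LinearExponential())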

Error & Stacktrace ⚠️

The solver directly computes the matrix exponential, which fails for a CuSparseMatrixCSC. See my bug report in ExponentialUtilities.jl.

ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex(A::CuArray{Int32, 1, CUDA.DeviceMemory}, I::Int64)
    @ GPUArrays ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50
  [6] getindex(A::CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, i0::Int64, i1::Int64)
    @ CUDA.CUSPARSE ~/.julia/packages/CUDA/2kjXI/lib/cusparse/array.jl:387
  [7] _getindex
    @ ./abstractarray.jl:1341 [inlined]
  [8] getindex
    @ ./abstractarray.jl:1291 [inlined]
  [9] exp_generic_core!(y1::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y2::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y3::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, x::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, ::Val{…})
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:192
 [10] exp_generic!(y1::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y2::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y3::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, x::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, s::Int64, ::Val{…})
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:170
 [11] exp_generic_mutable
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:166 [inlined]
 [12] exponential!(x::CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, method::ExpMethodGeneric{Val{13}()}, cache::Nothing)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:136
 [13] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{…}, cache::OrdinaryDiffEq.LinearExponentialCache{…}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:797
 [14] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:790 [inlined]
 [15] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{…})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:553
 [16] __solve(::ODEProblem{…}, ::LinearExponential; kwargs::@Kwargs{})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:7
 [17] __solve
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:1 [inlined]
 [18] #solve_call#44
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:612 [inlined]
 [19] solve_call
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:569 [inlined]
 [20] #solve_up#53
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1092 [inlined]
 [21] solve_up
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1078 [inlined]
 [22] solve(prob::ODEProblem{…}, args::LinearExponential; sensealg::Nothing, u0::Nothing, p::Nothing, wrap::Val{…}, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1015
 [23] solve(prob::ODEProblem{…}, args::LinearExponential)
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1005
 [24] top-level scope
    @ ~/CN/test_linear-exponential_solver.jl:13
Some type information was truncated. Use `show(err)` to see complete types.

Environment (please complete the following information):

  [47edcb42] ADTypes v1.9.0
  [79e6a3ab] Adapt v4.1.1
  [7d9fca2a] Arpack v0.5.4
  [6e4b80f9] BenchmarkTools v1.5.0
  [052768ef] CUDA v5.5.2
  [071ae1c0] DiffEqGPU v3.4.1
⌃ [0c46a032] DifferentialEquations v7.13.0
  [5b8099bc] DomainSets v0.7.14
  [d4d017d3] ExponentialUtilities v1.26.1
⌃ [f6369f11] ForwardDiff v0.10.37
⌅ [46192b85] GPUArraysCore v0.1.6
  [40713840] IncompleteLU v0.2.1
  [7a12625a] LinearMaps v3.11.3
  [7ed4a6bd] LinearSolve v2.36.2
  [94925ecb] MethodOfLines v0.11.6
⌃ [961ee093] ModelingToolkit v9.49.0
⌃ [1dea7af3] OrdinaryDiffEq v6.80.1
  [94395366] ParallelStencil v0.13.6
  [f0f68f2c] PlotlyJS v0.18.15
  [91a5bcdd] Plots v1.40.8
  [33c8b6b6] ProgressLogging v0.1.4
  [295af30f] Revise v3.6.2
⌃ [0bca4576] SciMLBase v2.58.1
  [c0aeaf25] SciMLOperators v0.3.12
  [05bca326] SimpleDiffEq v1.11.1
  [ce78b400] SimpleUnPack v1.1.0
  [9f842d2f] SparseConnectivityTracer v0.6.8
⌃ [0c5d862f] Symbolics v6.17.0
  [2f01184e] SparseArrays v1.10.0
Pkg.status(; mode = PKGMODE_MANIFEST)
Status `/lustre/home/ul/ul_student/ul_s_mcarme/CN/Manifest.toml`
  [47edcb42] ADTypes v1.9.0
  [621f4979] AbstractFFTs v1.5.0
  [1520ce14] AbstractTrees v0.4.5
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.1.1
  [66dad0bd] AliasTables v1.1.3
  [ec485272] ArnoldiMethod v0.4.0
  [7d9fca2a] Arpack v0.5.4
⌃ [4fba245c] ArrayInterface v7.16.0
  [4c555306] ArrayLayouts v1.10.4
  [bf4720bc] AssetRegistry v0.1.0
  [a9b6321e] Atomix v0.1.0
  [13072b0f] AxisAlgorithms v1.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [aae01518] BandedMatrices v1.7.5
  [6e4b80f9] BenchmarkTools v1.5.0
  [e2ed5e7c] Bijections v0.1.9
  [d1d4a3ce] BitFlags v0.1.9
  [62783981] BitTwiddlingConvenienceFunctions v0.1.6
  [ad839575] Blink v0.12.9
  [8e7c35d0] BlockArrays v1.1.1
⌃ [764a87c0] BoundaryValueDiffEq v5.9.1
  [fa961155] CEnum v0.5.0
  [2a0fbf3d] CPUSummary v0.2.6
  [00ebfdb7] CSTParser v3.4.3
  [052768ef] CUDA v5.5.2
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
⌅ [d35fcfd7] CellArrays v0.2.2
  [d360d2e6] ChainRulesCore v1.25.0
  [fb6a15b2] CloseOpenIntervals v0.1.13
  [da1fd8a2] CodeTracking v1.3.6
  [944b1d66] CodecZlib v0.7.6
⌃ [35d6a980] ColorSchemes v3.27.0
⌅ [3da002f7] ColorTypes v0.11.5
⌅ [c3611d14] ColorVectorSpace v0.10.0
⌅ [5ae59095] Colors v0.12.11
  [861a8166] Combinatorics v1.0.2
  [a80b9123] CommonMark v0.8.15
  [38540f10] CommonSolve v0.2.4
  [bbf7d656] CommonSubexpressions v0.3.1
  [f70d9fcc] CommonWorldInvalidations v1.0.0
  [34da2185] Compat v4.16.0
  [b152e2b5] CompositeTypes v0.1.4
  [a33af91c] CompositionsBase v0.1.2
  [2569d6c7] ConcreteStructs v0.2.3
  [f0e56b4a] ConcurrentUtilities v2.4.2
  [187b0558] ConstructionBase v1.5.8
  [d38c429a] Contour v0.6.3
  [adafc99b] CpuId v0.3.1
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.7.0
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [bcd4f6db] DelayDiffEq v5.48.1
  [8bb1440f] DelimitedFiles v1.9.1
  [2b5f629d] DiffEqBase v6.158.3
⌅ [459566f4] DiffEqCallbacks v3.8.0
  [071ae1c0] DiffEqGPU v3.4.1
  [77a26b50] DiffEqNoiseProcess v5.23.0
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
⌃ [0c46a032] DifferentialEquations v7.13.0
⌃ [a0c0ee7d] DifferentiationInterface v0.6.18
  [8d63f2c5] DispatchDoctor v0.4.17
  [b4f34e82] Distances v0.10.12
⌃ [31c24e10] Distributions v0.25.112
  [ffbed154] DocStringExtensions v0.9.3
  [5b8099bc] DomainSets v0.7.14
  [7c1d4256] DynamicPolynomials v0.6.0
⌃ [06fc5a27] DynamicQuantities v1.2.0
  [4e289a0a] EnumX v1.0.4
  [f151be2c] EnzymeCore v0.8.5
  [460bff9d] ExceptionUnwrapping v0.1.10
  [d4d017d3] ExponentialUtilities v1.26.1
  [e2ba6199] ExprTools v0.1.10
⌅ [6b7a57c9] Expronicon v0.8.5
  [c87230d0] FFMPEG v0.4.2
  [9d29842c] FastAlmostBandedMatrices v0.1.4
  [7034ab61] FastBroadcast v0.3.5
  [9aa1b823] FastClosures v0.3.2
  [29a986be] FastLapackInterface v2.0.4
  [1a297f60] FillArrays v1.13.0
  [64ca27bc] FindFirstFunctions v1.4.1
  [6a86dc24] FiniteDiff v2.26.0
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
⌃ [f6369f11] ForwardDiff v0.10.37
  [069b7b12] FunctionWrappers v1.1.3
  [77dc65aa] FunctionWrappersWrappers v0.1.3
  [de31a74c] FunctionalCollections v0.5.0
⌅ [d9f16b24] Functors v0.4.12
⌅ [0c68f7d7] GPUArrays v10.3.1
⌅ [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.27.8
  [28b8d3ca] GR v0.73.8
  [c145ed77] GenericSchur v0.5.4
  [c27321d9] Glob v1.3.1
  [86223c79] Graphs v1.12.0
  [42e2da0e] Grisu v1.0.2
⌃ [cd3eb016] HTTP v1.10.9
  [9fb69e20] Hiccup v0.2.2
  [3e5b6fbb] HostCPUFeatures v0.1.17
⌃ [34004b35] HypergeometricFunctions v0.3.24
  [615f187c] IfElse v0.1.1
  [40713840] IncompleteLU v0.2.1
  [d25df0c9] Inflate v0.1.5
  [842dd82b] InlineStrings v1.4.2
  [18e54dd8] IntegerMathUtils v0.1.2
  [a98d9a8b] Interpolations v0.15.1
  [8197267c] IntervalSets v0.7.10
  [3587e190] InverseFunctions v0.1.17
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [1019f520] JLFzf v0.1.8
  [692b3bcd] JLLWrappers v1.6.1
  [97c1335a] JSExpr v0.5.4
  [682c06a0] JSON v0.21.4
  [98e50ef6] JuliaFormatter v1.0.62
  [aa1ae85d] JuliaInterpreter v0.9.36
  [ccbc3e58] JumpProcesses v9.14.0
  [ef3ab10e] KLU v0.6.0
  [63c18a36] KernelAbstractions v0.9.29
  [ba0b0d4f] Krylov v0.9.8
  [929cbde3] LLVM v9.1.3
  [8b046642] LLVMLoopInfo v1.0.0
  [b964fa9f] LaTeXStrings v1.4.0
  [23fbe1c1] Latexify v0.16.5
  [10f19ff3] LayoutPointers v0.1.17
  [50d2b5c4] Lazy v0.15.1
  [5078a376] LazyArrays v2.2.1
  [2d8b4e74] LevyArea v1.0.0
  [87fe0de2] LineSearch v0.1.4
  [d3d80556] LineSearches v7.3.0
  [7a12625a] LinearMaps v3.11.3
  [7ed4a6bd] LinearSolve v2.36.2
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.1.0
  [bdcacae8] LoopVectorization v0.12.171
  [6f1432cf] LoweredCodeUtils v3.0.5
  [d8e11817] MLStyle v0.4.17
  [1914dd2f] MacroTools v0.5.13
  [d125e4d3] ManualMemory v0.1.8
  [a3b82374] MatrixFactorizations v3.0.1
  [bb5d69b7] MaybeInplace v0.1.4
  [739be429] MbedTLS v1.1.9
  [442fdcdd] Measures v0.3.2
  [94925ecb] MethodOfLines v0.11.6
  [e1d29d7a] Missings v1.2.0
⌃ [961ee093] ModelingToolkit v9.49.0
  [46d2c3a1] MuladdMacro v0.2.4
  [102ac46a] MultivariatePolynomials v0.5.7
  [ffc61752] Mustache v1.0.20
  [d8a4904e] MutableArithmetics v1.5.2
  [a975b10e] Mux v1.0.2
  [d41bc354] NLSolversBase v7.8.3
  [2774e3e8] NLsolve v4.5.1
  [5da4648a] NVTX v0.3.4
  [77ba4419] NaNMath v1.0.2
⌅ [8913a72c] NonlinearSolve v3.15.1
  [510215fc] Observables v0.5.5
  [6fe1bfb0] OffsetArrays v1.14.1
  [4d8831e6] OpenSSL v1.4.3
  [429524aa] Optim v1.9.4
  [bac558e1] OrderedCollections v1.6.3
⌃ [1dea7af3] OrdinaryDiffEq v6.80.1
⌃ [a7812802] PDEBase v0.1.15
  [90014a1f] PDMats v0.11.31
  [65ce6f38] PackageExtensionCompat v1.0.2
  [94395366] ParallelStencil v0.13.6
  [d96e819e] Parameters v0.12.3
  [69de0a69] Parsers v2.8.1
  [fa939f87] Pidfile v1.3.0
  [b98c9c47] Pipe v1.3.0
  [ccf2f8ad] PlotThemes v3.3.0
⌃ [995b91a9] PlotUtils v1.4.2
  [a03496cd] PlotlyBase v0.8.19
  [f0f68f2c] PlotlyJS v0.18.15
  [f2990250] PlotlyKaleido v2.2.5
  [91a5bcdd] Plots v1.40.8
  [e409e4f3] PoissonRandom v0.4.4
  [f517fe37] Polyester v0.7.16
  [1d0040c9] PolyesterWeave v0.2.2
  [2dfb63ee] PooledArrays v1.4.3
  [85a6dd25] PositiveFactorizations v0.2.4
  [d236fae5] PreallocationTools v0.4.24
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.4.0
  [27ebfcd6] Primes v0.5.6
  [33c8b6b6] ProgressLogging v0.1.4
  [43287f4e] PtrArrays v1.2.1
  [1fd47b50] QuadGK v2.11.1
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [c84ed2f1] Ratios v0.4.5
  [3cdcf5f2] RecipesBase v1.3.4
  [01d81517] RecipesPipeline v0.6.12
⌃ [731186ca] RecursiveArrayTools v3.27.2
  [f2c3362d] RecursiveFactorization v0.2.23
  [189a3867] Reexport v1.2.2
  [05181044] RelocatableFolders v1.0.1
  [ae029012] Requires v1.3.0
  [ae5879a3] ResettableStacks v1.1.1
  [295af30f] Revise v3.6.2
  [79098fc4] Rmath v0.8.0
  [7e49a35a] RuntimeGeneratedFunctions v0.5.13
  [94e857df] SIMDTypes v0.1.0
  [476501e8] SLEEFPirates v0.6.43
⌃ [0bca4576] SciMLBase v2.58.1
⌃ [19f34311] SciMLJacobianOperators v0.1.0
  [c0aeaf25] SciMLOperators v0.3.12
  [53ae85a6] SciMLStructures v1.5.0
  [6c6a2e73] Scratch v1.2.1
⌃ [91c51154] SentinelArrays v1.4.6
  [efcf1570] Setfield v1.1.1
  [992d4aef] Showoff v1.0.3
  [777ac1f9] SimpleBufferStream v1.2.0
  [05bca326] SimpleDiffEq v1.11.1
⌅ [727e6d20] SimpleNonlinearSolve v1.12.3
  [699a6c99] SimpleTraits v0.9.4
  [ce78b400] SimpleUnPack v1.1.0
  [a2af1166] SortingAlgorithms v1.2.1
  [9f842d2f] SparseConnectivityTracer v0.6.8
  [47a9eef4] SparseDiffTools v2.23.0
⌃ [0a514795] SparseMatrixColorings v0.4.8
  [e56a9233] Sparspak v0.3.9
  [276daf66] SpecialFunctions v2.4.0
  [860ef19b] StableRNGs v1.0.2
  [aedffcd0] Static v1.1.1
  [0d7ed370] StaticArrayInterface v1.8.0
  [90137ffa] StaticArrays v1.9.8
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [4c63d2b9] StatsFuns v1.3.2
  [9672c7b4] SteadyStateDiffEq v2.4.1
⌃ [789caeaf] StochasticDiffEq v6.65.1
  [7792a7ef] StrideArraysCore v0.5.7
  [892a3eda] StringManipulation v0.4.0
⌃ [c3572dad] Sundials v4.26.0
  [2efcf032] SymbolicIndexingInterface v0.3.34
  [19f23fe9] SymbolicLimits v0.2.2
  [d1185830] SymbolicUtils v3.7.2
⌃ [0c5d862f] Symbolics v6.17.0
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [62fd8b95] TensorCore v0.1.1
  [8ea1fca8] TermInterface v2.0.0
  [1c621080] TestItems v1.0.0
  [8290d209] ThreadingUtilities v0.5.2
  [a759f4b9] TimerOutputs v0.5.25
  [0796e94c] Tokenize v0.5.29
  [3bb67fe8] TranscodingStreams v0.11.3
  [d5829a12] TriangularSolve v0.2.1
  [410a4b4d] Tricks v0.1.9
  [781d530d] TruncatedStacktraces v1.4.0
  [5c2747f8] URIs v1.5.1
  [3a884ed6] UnPack v1.0.2
  [1cfade01] UnicodeFun v0.4.1
  [1986cc42] Unitful v1.21.0
  [45397f5d] UnitfulLatexify v1.6.4
  [a7c27f48] Unityper v0.1.6
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [41fe7b60] Unzip v0.2.0
⌃ [3d5dd08c] VectorizationBase v0.21.70
  [19fa3120] VertexSafeGraphs v0.2.0
  [0f1e0344] WebIO v0.8.21
  [104b5d7c] WebSockets v1.6.0
  [cc8bc4a8] Widgets v0.6.6
  [efce3f68] WoodburyMatrices v1.0.0
  [700de1a5] ZygoteRules v0.2.5
⌅ [68821587] Arpack_jll v3.5.1+1
  [6e34b625] Bzip2_jll v1.0.8+2
  [4ee394cb] CUDA_Driver_jll v0.10.3+0
  [76a88914] CUDA_Runtime_jll v0.15.3+0
  [83423d85] Cairo_jll v1.18.2+1
  [ee1fde0b] Dbus_jll v1.14.10+0
  [2702e6a9] EpollShim_jll v0.0.20230411+0
  [2e619515] Expat_jll v2.6.2+0
⌅ [b22a6f82] FFMPEG_jll v4.4.4+1
  [a3f928ae] Fontconfig_jll v2.13.96+0
  [d7e528f0] FreeType2_jll v2.13.2+0
  [559328eb] FriBidi_jll v1.0.14+0
  [0656b61e] GLFW_jll v3.4.0+1
  [d2c73de3] GR_jll v0.73.8+0
  [78b55507] Gettext_jll v0.21.0+0
  [7746bdde] Glib_jll v2.80.5+0
  [3b182d85] Graphite2_jll v1.3.14+0
  [2e76f6c2] HarfBuzz_jll v8.3.1+0
  [1d5cc7b8] IntelOpenMP_jll v2024.2.1+0
  [aacddb02] JpegTurbo_jll v3.0.4+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [f7e6163d] Kaleido_jll v0.2.1+0
  [c1c5ebd0] LAME_jll v3.100.2+0
  [88015f11] LERC_jll v4.0.0+0
  [dad2f222] LLVMExtra_jll v0.0.34+0
  [1d63c593] LLVMOpenMP_jll v18.1.7+0
  [dd4b983a] LZO_jll v2.10.2+1
⌅ [e9f186c6] Libffi_jll v3.2.2+1
  [d4300ac3] Libgcrypt_jll v1.11.0+0
  [7e76a0d4] Libglvnd_jll v1.6.0+0
  [7add5ba3] Libgpg_error_jll v1.50.0+0
  [94ce4f54] Libiconv_jll v1.17.0+1
  [4b2f31a3] Libmount_jll v2.40.1+0
  [89763e89] Libtiff_jll v4.7.0+0
  [38a345b3] Libuuid_jll v2.40.1+0
  [856f044c] MKL_jll v2024.2.0+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [e7412a2a] Ogg_jll v1.3.5+1
  [458c3c95] OpenSSL_jll v3.0.15+1
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [91d4177d] Opus_jll v1.3.3+0
  [36c8627f] Pango_jll v1.54.1+0
  [30392449] Pixman_jll v0.43.4+0
  [c0090381] Qt6Base_jll v6.7.1+1
  [629bc702] Qt6Declarative_jll v6.7.1+2
  [ce943373] Qt6ShaderTools_jll v6.7.1+1
  [e99dba38] Qt6Wayland_jll v6.7.1+1
  [f50d1b31] Rmath_jll v0.5.1+0
⌅ [fb77eaff] Sundials_jll v5.2.2+0
  [a44049a8] Vulkan_Loader_jll v1.3.243+0
  [a2964d1f] Wayland_jll v1.21.0+1
  [2381bf8a] Wayland_protocols_jll v1.31.0+0
  [02c8fc9c] XML2_jll v2.13.4+0
  [aed1982a] XSLT_jll v1.1.41+0
  [ffd25f8a] XZ_jll v5.6.3+0
  [f67eecfb] Xorg_libICE_jll v1.1.1+0
  [c834827a] Xorg_libSM_jll v1.2.4+0
  [4f6342f7] Xorg_libX11_jll v1.8.6+0
  [0c0b7dd1] Xorg_libXau_jll v1.0.11+0
  [935fb764] Xorg_libXcursor_jll v1.2.0+4
  [a3789734] Xorg_libXdmcp_jll v1.1.4+0
  [1082639a] Xorg_libXext_jll v1.3.6+0
  [d091e8ba] Xorg_libXfixes_jll v5.0.3+4
  [a51aa0fd] Xorg_libXi_jll v1.7.10+4
  [d1454406] Xorg_libXinerama_jll v1.1.4+4
  [ec84b674] Xorg_libXrandr_jll v1.5.2+4
  [ea2f1a96] Xorg_libXrender_jll v0.9.11+0
  [14d82f49] Xorg_libpthread_stubs_jll v0.1.1+0
  [c7cfdc94] Xorg_libxcb_jll v1.17.0+0
  [cc61e674] Xorg_libxkbfile_jll v1.1.2+0
  [e920d4aa] Xorg_xcb_util_cursor_jll v0.1.4+0
  [12413925] Xorg_xcb_util_image_jll v0.4.0+1
  [2def613f] Xorg_xcb_util_jll v0.4.0+1
  [975044d2] Xorg_xcb_util_keysyms_jll v0.4.0+1
  [0d47668e] Xorg_xcb_util_renderutil_jll v0.3.9+1
  [c22f9ab0] Xorg_xcb_util_wm_jll v0.4.1+1
  [35661453] Xorg_xkbcomp_jll v1.4.6+0
  [33bec58e] Xorg_xkeyboard_config_jll v2.39.0+0
  [c5fb5394] Xorg_xtrans_jll v1.5.0+0
  [3161d3a3] Zstd_jll v1.5.6+1
  [1e29f10c] demumble_jll v1.3.0+0
  [35ca27e7] eudev_jll v3.2.9+0
  [214eeab7] fzf_jll v0.53.0+0
  [1a1c6b14] gperf_jll v3.1.1+0
  [a4ae2306] libaom_jll v3.9.0+0
  [0ac62f75] libass_jll v0.15.2+0
  [1183f4f0] libdecor_jll v0.2.2+0
  [2db6ffa8] libevdev_jll v1.11.0+0
  [f638f0a6] libfdk_aac_jll v2.0.3+0
  [36db933b] libinput_jll v1.18.0+0
  [b53b4c65] libpng_jll v1.6.44+0
  [f27f6e37] libvorbis_jll v1.3.7+2
  [009596ad] mtdev_jll v1.1.6+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
⌅ [1270edf5] x264_jll v2021.5.5+0
⌅ [dfaa095f] x265_jll v3.5.0+0
  [d8fb68d0] xkbcommon_jll v1.4.1+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [9abbd945] Profile
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [1a1011a3] SharedArrays
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [efcefdf7] PCRE2_jll v10.42.0+1
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.11.0+0
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2
Julia Version 1.10.5
Commit 6f3fdf7b362 (2024-08-27 14:19 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 96 × Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
Threads: 48 default, 0 interactive, 24 GC (on 96 virtual cores)

ChrisRackauckas commented 1 week ago

Therefore, the solver shouldn't compute the full matrix exponential of the given matrix operator.

Did you set krylov=true?
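For reference, a minimal CPU-only sketch of how the krylov option can be passed. It assumes the accepted values are the symbols :off, :simple and :adaptive (the latter two appear later in this thread; :off is assumed to be the dense default) and that the package versions match those listed in the environment above. The 2x2 matrix and the variable names are illustrative.

```julia
using LinearAlgebra
using OrdinaryDiffEq, SciMLOperators

Amat = [2.0 -1.0; -1.0 2.0]
u0 = ones(2)
prob = ODEProblem(MatrixOperator(Amat), u0, (0.0, 1.0))

# Reference solution u(1) = exp(A) * u0 via the dense matrix exponential.
u_exact = exp(1.0 * Amat) * u0

# Error at the final time for each krylov mode.
errs = Dict{Symbol, Float64}()
for mode in (:off, :simple, :adaptive)
    sol = solve(prob, LinearExponential(krylov = mode))
    errs[mode] = norm(sol.u[end] - u_exact)
    println(mode, " => ", errs[mode])
end
```

On the CPU all three modes should land close to the reference, which makes this a convenient baseline before moving the same problem to the GPU.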

mcarmesin commented 1 week ago

solve(prob, LinearExponential(krylov=:simple)) fails with a different error:

ERROR: This object is not a GPU array
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] backend(::Type)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:225
  [3] backend(::Type{SubArray{Float32, 1, Matrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}})
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:229
  [4] backend(x::SubArray{Float32, 1, Matrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:226
  [5] _copyto!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:78 [inlined]
  [6] materialize!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:38 [inlined]
  [7] materialize!
    @ ./broadcast.jl:911 [inlined]
  [8] firststep!(Ks::KrylovSubspace{…}, V::SubArray{…}, H::SubArray{…}, b::CuArray{…})
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:171
  [9] lanczos!(Ks::KrylovSubspace{…}, A::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, b::CuArray{…}; tol::Float64, m::Int64, opnorm::Nothing, init::Int64, t::Float64, mu::Float64, l::Int64)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:331
 [10] lanczos!
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:318 [inlined]
 [11] arnoldi!(Ks::KrylovSubspace{…}, A::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, b::CuArray{…}; tol::Float64, m::Int64, ishermitian::Bool, opnorm::Function, iop::Int64, init::Int64, t::Float64, mu::Float64, l::Int64)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:246
 [12] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{…}, cache::OrdinaryDiffEq.LinearExponentialCache{…}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:801
 [13] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:790 [inlined]
 [14] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{…})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:553
 [15] __solve(::ODEProblem{…}, ::LinearExponential; kwargs::@Kwargs{})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:7
 [16] __solve
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:1 [inlined]
 [17] #solve_call#44
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:612 [inlined]
 [18] solve_call
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:569 [inlined]
 [19] #solve_up#53
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1092 [inlined]
 [20] solve_up
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1078 [inlined]
 [21] solve(prob::ODEProblem{…}, args::LinearExponential; sensealg::Nothing, u0::Nothing, p::Nothing, wrap::Val{…}, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1015
 [22] solve(prob::ODEProblem{…}, args::LinearExponential)
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1005
 [23] top-level scope
    @ ~/CN/test_linear-exponential_solver.jl:12

while solve(prob, LinearExponential(krylov=:adaptive)) again yields the scalar indexing error:

ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50 [inlined]
  [6] copyto_unaliased!(deststyle::IndexLinear, dest::SubArray{…}, srcstyle::IndexLinear, src::CuArray{…})
    @ Base ./abstractarray.jl:1088
  [7] copyto!
    @ ./abstractarray.jl:1068 [inlined]
  [8] phiv_timestep!(U::CuArray{…}, ts::Vector{…}, A::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, B::CuArray{…}; tau::Float64, m::Int64, tol::Float32, opnorm::typeof(LinearAlgebra.opnorm), iop::Int64, correct::Bool, caches::Tuple{…}, adaptive::Bool, delta::Float64, ishermitian::Bool, gamma::Float64, NA::Int64, verbose::Bool)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:170
  [9] #expv_timestep!#39
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:54 [inlined]
 [10] expv_timestep!
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:51 [inlined]
 [11] #expv_timestep!#38
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:48 [inlined]
 [12] expv_timestep!
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:46 [inlined]
 [13] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{…}, cache::OrdinaryDiffEq.LinearExponentialCache{…}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:805
 [14] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:790 [inlined]
 [15] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{…})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:553
 [16] __solve(::ODEProblem{…}, ::LinearExponential; kwargs::@Kwargs{})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:7
 [17] __solve
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:1 [inlined]
 [18] #solve_call#44
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:612 [inlined]
 [19] solve_call
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:569 [inlined]
 [20] #solve_up#53
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1092 [inlined]
 [21] solve_up
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1078 [inlined]
 [22] solve(prob::ODEProblem{…}, args::LinearExponential; sensealg::Nothing, u0::Nothing, p::Nothing, wrap::Val{…}, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1015
 [23] solve(prob::ODEProblem{…}, args::LinearExponential)
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1005
 [24] top-level scope
    @ ~/CN/test_linear-exponential_solver.jl:12
Some type information was truncated. Use `show(err)` to see complete types.

ChrisRackauckas commented 1 week ago

Okay, so it looks like the root of the issue is that phiv! needs to support this better.

ChrisRackauckas commented 1 week ago

Let's narrow this down to something that's just ExponentialUtilities.jl? There are some basic expv tests:

https://github.com/SciML/ExponentialUtilities.jl/blob/master/test/gpu/gputests.jl#L40-L57

so something must be missed by the tests.
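One possible way to narrow it down is sketched below. It is untested here and needs a CUDA-capable GPU. It calls ExponentialUtilities.jl directly with the same matrix type as the original report; `attempt` is a hypothetical helper introduced only for this sketch, and which of the calls actually fail is not asserted.

```julia
using SparseArrays, LinearAlgebra
using CUDA
using ExponentialUtilities

B = cu(sprand(15, 15, 0.2))      # sparse matrix on the GPU
A = B * B'
b = cu(ones(15))

# Hypothetical helper: run one call and report whether it throws.
function attempt(f, label)
    try
        f()
        println(label, ": ok")
    catch err
        println(label, ": failed with ", typeof(err))
    end
end

# Krylov path with a fixed subspace (the path krylov=:simple goes through).
attempt("expv") do
    expv(1.0f0, A, b)
end

# Adaptive Krylov time stepping (the path krylov=:adaptive goes through).
attempt("expv_timestep") do
    expv_timestep(1.0f0, A, b)
end

# Full matrix exponential with the generic method seen in the first stacktrace.
attempt("exponential!") do
    exponential!(copy(A), ExpMethodGeneric())
end
```

Whichever calls fail here without OrdinaryDiffEq in the loop would be candidates for a test in ExponentialUtilities.jl itself.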

mcarmesin commented 5 days ago

At least for krylov=:simple, the problem is that the V component of the KrylovSubspace is not constructed on the GPU. We have to alter the function alg_cache(alg::LinearExponential, …) so that it constructs the KrylovSubspace correctly; see my PR https://github.com/SciML/OrdinaryDiffEq.jl/pull/2538.