LinearExponential() solver broken for CUDA sparse matrices

mcarmesin commented 1 week ago

Describe the bug 🐞

LinearExponential() fails if the ODEProblem is built from a MatrixOperator that wraps a CuSparseMatrixCSC.

Expected behavior

The ODE should be solved successfully. Therefore, the solver shouldn't compute the full matrix exponential of the given matrix operator.
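As a rough illustration of what avoiding the full matrix exponential means, the sketch below applies exp(t*M) to a vector with expv from ExponentialUtilities.jl, which builds a Krylov subspace and needs only matrix-vector products with M. It runs on the CPU with a small tridiagonal test matrix chosen here for reproducibility (an assumption for illustration, not the matrix from this report). The dense exponential is computed only as a reference.

```julia
using LinearAlgebra, SparseArrays
using ExponentialUtilities  # provides expv

n = 15
# Illustrative sparse symmetric matrix (1D Laplacian stencil); an assumption,
# not the A*A' matrix used in the report.
M = spdiagm(-1 => fill(1.0, n - 1), 0 => fill(-2.0, n), 1 => fill(1.0, n - 1))
u0 = ones(n)
t = 1.0

# Krylov approximation of exp(t*M)*u0; M is never converted to a dense matrix.
w_krylov = expv(t, M, u0)

# Reference only: dense matrix exponential, the step that should be avoided
# for large or GPU-resident sparse matrices.
w_dense = exp(t * Matrix(M)) * u0

rel_err = norm(w_krylov - w_dense) / norm(w_dense)
println("relative error of the Krylov result: ", rel_err)
```

Whether the same call works unchanged for a CuSparseMatrixCSC is exactly what this issue is about, so the sketch makes no claim about the GPU case.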

Minimal Reproducible Example 👇

using SparseArrays
using OrdinaryDiffEq
using SciMLOperators
using CUDA

A = cu(sprand(15, 15, 0.2))
M = MatrixOperator(A * A')

u0 = cu(ones(15))

prob = ODEProblem(M, u0, (0.0, 1.0))

solve(prob, LinearExponential())

Error & Stacktrace ⚠️

The solver computes the matrix exponential directly, which fails for a CuSparseMatrixCSC. See my bug report at ExponentialUtilities.jl.

ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex(A::CuArray{Int32, 1, CUDA.DeviceMemory}, I::Int64)
    @ GPUArrays ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50
  [6] getindex(A::CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, i0::Int64, i1::Int64)
    @ CUDA.CUSPARSE ~/.julia/packages/CUDA/2kjXI/lib/cusparse/array.jl:387
  [7] _getindex
    @ ./abstractarray.jl:1341 [inlined]
  [8] getindex
    @ ./abstractarray.jl:1291 [inlined]
  [9] exp_generic_core!(y1::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y2::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y3::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, x::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, ::Val{…})
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:192
 [10] exp_generic!(y1::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y2::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, y3::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, x::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, s::Int64, ::Val{…})
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:170
 [11] exp_generic_mutable
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:166 [inlined]
 [12] exponential!(x::CUDA.CUSPARSE.CuSparseMatrixCSC{Float64, Int32}, method::ExpMethodGeneric{Val{13}()}, cache::Nothing)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/exp_generic.jl:136
 [13] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{…}, cache::OrdinaryDiffEq.LinearExponentialCache{…}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:797
 [14] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:790 [inlined]
 [15] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{…})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:553
 [16] __solve(::ODEProblem{…}, ::LinearExponential; kwargs::@Kwargs{})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:7
 [17] __solve
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:1 [inlined]
 [18] #solve_call#44
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:612 [inlined]
 [19] solve_call
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:569 [inlined]
 [20] #solve_up#53
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1092 [inlined]
 [21] solve_up
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1078 [inlined]
 [22] solve(prob::ODEProblem{…}, args::LinearExponential; sensealg::Nothing, u0::Nothing, p::Nothing, wrap::Val{…}, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1015
 [23] solve(prob::ODEProblem{…}, args::LinearExponential)
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1005
 [24] top-level scope
    @ ~/CN/test_linear-exponential_solver.jl:13
Some type information was truncated. Use `show(err)` to see complete types.

Environment (please complete the following information):

Output of using Pkg; Pkg.status()

  [47edcb42] ADTypes v1.9.0
  [79e6a3ab] Adapt v4.1.1
  [7d9fca2a] Arpack v0.5.4
  [6e4b80f9] BenchmarkTools v1.5.0
  [052768ef] CUDA v5.5.2
  [071ae1c0] DiffEqGPU v3.4.1
⌃ [0c46a032] DifferentialEquations v7.13.0
  [5b8099bc] DomainSets v0.7.14
  [d4d017d3] ExponentialUtilities v1.26.1
⌃ [f6369f11] ForwardDiff v0.10.37
⌅ [46192b85] GPUArraysCore v0.1.6
  [40713840] IncompleteLU v0.2.1
  [7a12625a] LinearMaps v3.11.3
  [7ed4a6bd] LinearSolve v2.36.2
  [94925ecb] MethodOfLines v0.11.6
⌃ [961ee093] ModelingToolkit v9.49.0
⌃ [1dea7af3] OrdinaryDiffEq v6.80.1
  [94395366] ParallelStencil v0.13.6
  [f0f68f2c] PlotlyJS v0.18.15
  [91a5bcdd] Plots v1.40.8
  [33c8b6b6] ProgressLogging v0.1.4
  [295af30f] Revise v3.6.2
⌃ [0bca4576] SciMLBase v2.58.1
  [c0aeaf25] SciMLOperators v0.3.12
  [05bca326] SimpleDiffEq v1.11.1
  [ce78b400] SimpleUnPack v1.1.0
  [9f842d2f] SparseConnectivityTracer v0.6.8
⌃ [0c5d862f] Symbolics v6.17.0
  [2f01184e] SparseArrays v1.10.0

Output of using Pkg; Pkg.status(; mode = PKGMODE_MANIFEST)

Pkg.status(; mode = PKGMODE_MANIFEST)
Status `/lustre/home/ul/ul_student/ul_s_mcarme/CN/Manifest.toml`
  [47edcb42] ADTypes v1.9.0
  [621f4979] AbstractFFTs v1.5.0
  [1520ce14] AbstractTrees v0.4.5
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.1.1
  [66dad0bd] AliasTables v1.1.3
  [ec485272] ArnoldiMethod v0.4.0
  [7d9fca2a] Arpack v0.5.4
⌃ [4fba245c] ArrayInterface v7.16.0
  [4c555306] ArrayLayouts v1.10.4
  [bf4720bc] AssetRegistry v0.1.0
  [a9b6321e] Atomix v0.1.0
  [13072b0f] AxisAlgorithms v1.1.0
  [ab4f0b2a] BFloat16s v0.5.0
  [aae01518] BandedMatrices v1.7.5
  [6e4b80f9] BenchmarkTools v1.5.0
  [e2ed5e7c] Bijections v0.1.9
  [d1d4a3ce] BitFlags v0.1.9
  [62783981] BitTwiddlingConvenienceFunctions v0.1.6
  [ad839575] Blink v0.12.9
  [8e7c35d0] BlockArrays v1.1.1
⌃ [764a87c0] BoundaryValueDiffEq v5.9.1
  [fa961155] CEnum v0.5.0
  [2a0fbf3d] CPUSummary v0.2.6
  [00ebfdb7] CSTParser v3.4.3
  [052768ef] CUDA v5.5.2
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
⌅ [d35fcfd7] CellArrays v0.2.2
  [d360d2e6] ChainRulesCore v1.25.0
  [fb6a15b2] CloseOpenIntervals v0.1.13
  [da1fd8a2] CodeTracking v1.3.6
  [944b1d66] CodecZlib v0.7.6
⌃ [35d6a980] ColorSchemes v3.27.0
⌅ [3da002f7] ColorTypes v0.11.5
⌅ [c3611d14] ColorVectorSpace v0.10.0
⌅ [5ae59095] Colors v0.12.11
  [861a8166] Combinatorics v1.0.2
  [a80b9123] CommonMark v0.8.15
  [38540f10] CommonSolve v0.2.4
  [bbf7d656] CommonSubexpressions v0.3.1
  [f70d9fcc] CommonWorldInvalidations v1.0.0
  [34da2185] Compat v4.16.0
  [b152e2b5] CompositeTypes v0.1.4
  [a33af91c] CompositionsBase v0.1.2
  [2569d6c7] ConcreteStructs v0.2.3
  [f0e56b4a] ConcurrentUtilities v2.4.2
  [187b0558] ConstructionBase v1.5.8
  [d38c429a] Contour v0.6.3
  [adafc99b] CpuId v0.3.1
  [a8cc5b0e] Crayons v4.1.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.7.0
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [bcd4f6db] DelayDiffEq v5.48.1
  [8bb1440f] DelimitedFiles v1.9.1
  [2b5f629d] DiffEqBase v6.158.3
⌅ [459566f4] DiffEqCallbacks v3.8.0
  [071ae1c0] DiffEqGPU v3.4.1
  [77a26b50] DiffEqNoiseProcess v5.23.0
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
⌃ [0c46a032] DifferentialEquations v7.13.0
⌃ [a0c0ee7d] DifferentiationInterface v0.6.18
  [8d63f2c5] DispatchDoctor v0.4.17
  [b4f34e82] Distances v0.10.12
⌃ [31c24e10] Distributions v0.25.112
  [ffbed154] DocStringExtensions v0.9.3
  [5b8099bc] DomainSets v0.7.14
  [7c1d4256] DynamicPolynomials v0.6.0
⌃ [06fc5a27] DynamicQuantities v1.2.0
  [4e289a0a] EnumX v1.0.4
  [f151be2c] EnzymeCore v0.8.5
  [460bff9d] ExceptionUnwrapping v0.1.10
  [d4d017d3] ExponentialUtilities v1.26.1
  [e2ba6199] ExprTools v0.1.10
⌅ [6b7a57c9] Expronicon v0.8.5
  [c87230d0] FFMPEG v0.4.2
  [9d29842c] FastAlmostBandedMatrices v0.1.4
  [7034ab61] FastBroadcast v0.3.5
  [9aa1b823] FastClosures v0.3.2
  [29a986be] FastLapackInterface v2.0.4
  [1a297f60] FillArrays v1.13.0
  [64ca27bc] FindFirstFunctions v1.4.1
  [6a86dc24] FiniteDiff v2.26.0
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
⌃ [f6369f11] ForwardDiff v0.10.37
  [069b7b12] FunctionWrappers v1.1.3
  [77dc65aa] FunctionWrappersWrappers v0.1.3
  [de31a74c] FunctionalCollections v0.5.0
⌅ [d9f16b24] Functors v0.4.12
⌅ [0c68f7d7] GPUArrays v10.3.1
⌅ [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.27.8
  [28b8d3ca] GR v0.73.8
  [c145ed77] GenericSchur v0.5.4
  [c27321d9] Glob v1.3.1
  [86223c79] Graphs v1.12.0
  [42e2da0e] Grisu v1.0.2
⌃ [cd3eb016] HTTP v1.10.9
  [9fb69e20] Hiccup v0.2.2
  [3e5b6fbb] HostCPUFeatures v0.1.17
⌃ [34004b35] HypergeometricFunctions v0.3.24
  [615f187c] IfElse v0.1.1
  [40713840] IncompleteLU v0.2.1
  [d25df0c9] Inflate v0.1.5
  [842dd82b] InlineStrings v1.4.2
  [18e54dd8] IntegerMathUtils v0.1.2
  [a98d9a8b] Interpolations v0.15.1
  [8197267c] IntervalSets v0.7.10
  [3587e190] InverseFunctions v0.1.17
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [82899510] IteratorInterfaceExtensions v1.0.0
  [1019f520] JLFzf v0.1.8
  [692b3bcd] JLLWrappers v1.6.1
  [97c1335a] JSExpr v0.5.4
  [682c06a0] JSON v0.21.4
  [98e50ef6] JuliaFormatter v1.0.62
  [aa1ae85d] JuliaInterpreter v0.9.36
  [ccbc3e58] JumpProcesses v9.14.0
  [ef3ab10e] KLU v0.6.0
  [63c18a36] KernelAbstractions v0.9.29
  [ba0b0d4f] Krylov v0.9.8
  [929cbde3] LLVM v9.1.3
  [8b046642] LLVMLoopInfo v1.0.0
  [b964fa9f] LaTeXStrings v1.4.0
  [23fbe1c1] Latexify v0.16.5
  [10f19ff3] LayoutPointers v0.1.17
  [50d2b5c4] Lazy v0.15.1
  [5078a376] LazyArrays v2.2.1
  [2d8b4e74] LevyArea v1.0.0
  [87fe0de2] LineSearch v0.1.4
  [d3d80556] LineSearches v7.3.0
  [7a12625a] LinearMaps v3.11.3
  [7ed4a6bd] LinearSolve v2.36.2
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.1.0
  [bdcacae8] LoopVectorization v0.12.171
  [6f1432cf] LoweredCodeUtils v3.0.5
  [d8e11817] MLStyle v0.4.17
  [1914dd2f] MacroTools v0.5.13
  [d125e4d3] ManualMemory v0.1.8
  [a3b82374] MatrixFactorizations v3.0.1
  [bb5d69b7] MaybeInplace v0.1.4
  [739be429] MbedTLS v1.1.9
  [442fdcdd] Measures v0.3.2
  [94925ecb] MethodOfLines v0.11.6
  [e1d29d7a] Missings v1.2.0
⌃ [961ee093] ModelingToolkit v9.49.0
  [46d2c3a1] MuladdMacro v0.2.4
  [102ac46a] MultivariatePolynomials v0.5.7
  [ffc61752] Mustache v1.0.20
  [d8a4904e] MutableArithmetics v1.5.2
  [a975b10e] Mux v1.0.2
  [d41bc354] NLSolversBase v7.8.3
  [2774e3e8] NLsolve v4.5.1
  [5da4648a] NVTX v0.3.4
  [77ba4419] NaNMath v1.0.2
⌅ [8913a72c] NonlinearSolve v3.15.1
  [510215fc] Observables v0.5.5
  [6fe1bfb0] OffsetArrays v1.14.1
  [4d8831e6] OpenSSL v1.4.3
  [429524aa] Optim v1.9.4
  [bac558e1] OrderedCollections v1.6.3
⌃ [1dea7af3] OrdinaryDiffEq v6.80.1
⌃ [a7812802] PDEBase v0.1.15
  [90014a1f] PDMats v0.11.31
  [65ce6f38] PackageExtensionCompat v1.0.2
  [94395366] ParallelStencil v0.13.6
  [d96e819e] Parameters v0.12.3
  [69de0a69] Parsers v2.8.1
  [fa939f87] Pidfile v1.3.0
  [b98c9c47] Pipe v1.3.0
  [ccf2f8ad] PlotThemes v3.3.0
⌃ [995b91a9] PlotUtils v1.4.2
  [a03496cd] PlotlyBase v0.8.19
  [f0f68f2c] PlotlyJS v0.18.15
  [f2990250] PlotlyKaleido v2.2.5
  [91a5bcdd] Plots v1.40.8
  [e409e4f3] PoissonRandom v0.4.4
  [f517fe37] Polyester v0.7.16
  [1d0040c9] PolyesterWeave v0.2.2
  [2dfb63ee] PooledArrays v1.4.3
  [85a6dd25] PositiveFactorizations v0.2.4
  [d236fae5] PreallocationTools v0.4.24
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.4.0
  [27ebfcd6] Primes v0.5.6
  [33c8b6b6] ProgressLogging v0.1.4
  [43287f4e] PtrArrays v1.2.1
  [1fd47b50] QuadGK v2.11.1
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [c84ed2f1] Ratios v0.4.5
  [3cdcf5f2] RecipesBase v1.3.4
  [01d81517] RecipesPipeline v0.6.12
⌃ [731186ca] RecursiveArrayTools v3.27.2
  [f2c3362d] RecursiveFactorization v0.2.23
  [189a3867] Reexport v1.2.2
  [05181044] RelocatableFolders v1.0.1
  [ae029012] Requires v1.3.0
  [ae5879a3] ResettableStacks v1.1.1
  [295af30f] Revise v3.6.2
  [79098fc4] Rmath v0.8.0
  [7e49a35a] RuntimeGeneratedFunctions v0.5.13
  [94e857df] SIMDTypes v0.1.0
  [476501e8] SLEEFPirates v0.6.43
⌃ [0bca4576] SciMLBase v2.58.1
⌃ [19f34311] SciMLJacobianOperators v0.1.0
  [c0aeaf25] SciMLOperators v0.3.12
  [53ae85a6] SciMLStructures v1.5.0
  [6c6a2e73] Scratch v1.2.1
⌃ [91c51154] SentinelArrays v1.4.6
  [efcf1570] Setfield v1.1.1
  [992d4aef] Showoff v1.0.3
  [777ac1f9] SimpleBufferStream v1.2.0
  [05bca326] SimpleDiffEq v1.11.1
⌅ [727e6d20] SimpleNonlinearSolve v1.12.3
  [699a6c99] SimpleTraits v0.9.4
  [ce78b400] SimpleUnPack v1.1.0
  [a2af1166] SortingAlgorithms v1.2.1
  [9f842d2f] SparseConnectivityTracer v0.6.8
  [47a9eef4] SparseDiffTools v2.23.0
⌃ [0a514795] SparseMatrixColorings v0.4.8
  [e56a9233] Sparspak v0.3.9
  [276daf66] SpecialFunctions v2.4.0
  [860ef19b] StableRNGs v1.0.2
  [aedffcd0] Static v1.1.1
  [0d7ed370] StaticArrayInterface v1.8.0
  [90137ffa] StaticArrays v1.9.8
  [1e83bf80] StaticArraysCore v1.4.3
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [4c63d2b9] StatsFuns v1.3.2
  [9672c7b4] SteadyStateDiffEq v2.4.1
⌃ [789caeaf] StochasticDiffEq v6.65.1
  [7792a7ef] StrideArraysCore v0.5.7
  [892a3eda] StringManipulation v0.4.0
⌃ [c3572dad] Sundials v4.26.0
  [2efcf032] SymbolicIndexingInterface v0.3.34
  [19f23fe9] SymbolicLimits v0.2.2
  [d1185830] SymbolicUtils v3.7.2
⌃ [0c5d862f] Symbolics v6.17.0
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [62fd8b95] TensorCore v0.1.1
  [8ea1fca8] TermInterface v2.0.0
  [1c621080] TestItems v1.0.0
  [8290d209] ThreadingUtilities v0.5.2
  [a759f4b9] TimerOutputs v0.5.25
  [0796e94c] Tokenize v0.5.29
  [3bb67fe8] TranscodingStreams v0.11.3
  [d5829a12] TriangularSolve v0.2.1
  [410a4b4d] Tricks v0.1.9
  [781d530d] TruncatedStacktraces v1.4.0
  [5c2747f8] URIs v1.5.1
  [3a884ed6] UnPack v1.0.2
  [1cfade01] UnicodeFun v0.4.1
  [1986cc42] Unitful v1.21.0
  [45397f5d] UnitfulLatexify v1.6.4
  [a7c27f48] Unityper v0.1.6
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [41fe7b60] Unzip v0.2.0
⌃ [3d5dd08c] VectorizationBase v0.21.70
  [19fa3120] VertexSafeGraphs v0.2.0
  [0f1e0344] WebIO v0.8.21
  [104b5d7c] WebSockets v1.6.0
  [cc8bc4a8] Widgets v0.6.6
  [efce3f68] WoodburyMatrices v1.0.0
  [700de1a5] ZygoteRules v0.2.5
⌅ [68821587] Arpack_jll v3.5.1+1
  [6e34b625] Bzip2_jll v1.0.8+2
  [4ee394cb] CUDA_Driver_jll v0.10.3+0
  [76a88914] CUDA_Runtime_jll v0.15.3+0
  [83423d85] Cairo_jll v1.18.2+1
  [ee1fde0b] Dbus_jll v1.14.10+0
  [2702e6a9] EpollShim_jll v0.0.20230411+0
  [2e619515] Expat_jll v2.6.2+0
⌅ [b22a6f82] FFMPEG_jll v4.4.4+1
  [a3f928ae] Fontconfig_jll v2.13.96+0
  [d7e528f0] FreeType2_jll v2.13.2+0
  [559328eb] FriBidi_jll v1.0.14+0
  [0656b61e] GLFW_jll v3.4.0+1
  [d2c73de3] GR_jll v0.73.8+0
  [78b55507] Gettext_jll v0.21.0+0
  [7746bdde] Glib_jll v2.80.5+0
  [3b182d85] Graphite2_jll v1.3.14+0
  [2e76f6c2] HarfBuzz_jll v8.3.1+0
  [1d5cc7b8] IntelOpenMP_jll v2024.2.1+0
  [aacddb02] JpegTurbo_jll v3.0.4+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [f7e6163d] Kaleido_jll v0.2.1+0
  [c1c5ebd0] LAME_jll v3.100.2+0
  [88015f11] LERC_jll v4.0.0+0
  [dad2f222] LLVMExtra_jll v0.0.34+0
  [1d63c593] LLVMOpenMP_jll v18.1.7+0
  [dd4b983a] LZO_jll v2.10.2+1
⌅ [e9f186c6] Libffi_jll v3.2.2+1
  [d4300ac3] Libgcrypt_jll v1.11.0+0
  [7e76a0d4] Libglvnd_jll v1.6.0+0
  [7add5ba3] Libgpg_error_jll v1.50.0+0
  [94ce4f54] Libiconv_jll v1.17.0+1
  [4b2f31a3] Libmount_jll v2.40.1+0
  [89763e89] Libtiff_jll v4.7.0+0
  [38a345b3] Libuuid_jll v2.40.1+0
  [856f044c] MKL_jll v2024.2.0+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [e7412a2a] Ogg_jll v1.3.5+1
  [458c3c95] OpenSSL_jll v3.0.15+1
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [91d4177d] Opus_jll v1.3.3+0
  [36c8627f] Pango_jll v1.54.1+0
  [30392449] Pixman_jll v0.43.4+0
  [c0090381] Qt6Base_jll v6.7.1+1
  [629bc702] Qt6Declarative_jll v6.7.1+2
  [ce943373] Qt6ShaderTools_jll v6.7.1+1
  [e99dba38] Qt6Wayland_jll v6.7.1+1
  [f50d1b31] Rmath_jll v0.5.1+0
⌅ [fb77eaff] Sundials_jll v5.2.2+0
  [a44049a8] Vulkan_Loader_jll v1.3.243+0
  [a2964d1f] Wayland_jll v1.21.0+1
  [2381bf8a] Wayland_protocols_jll v1.31.0+0
  [02c8fc9c] XML2_jll v2.13.4+0
  [aed1982a] XSLT_jll v1.1.41+0
  [ffd25f8a] XZ_jll v5.6.3+0
  [f67eecfb] Xorg_libICE_jll v1.1.1+0
  [c834827a] Xorg_libSM_jll v1.2.4+0
  [4f6342f7] Xorg_libX11_jll v1.8.6+0
  [0c0b7dd1] Xorg_libXau_jll v1.0.11+0
  [935fb764] Xorg_libXcursor_jll v1.2.0+4
  [a3789734] Xorg_libXdmcp_jll v1.1.4+0
  [1082639a] Xorg_libXext_jll v1.3.6+0
  [d091e8ba] Xorg_libXfixes_jll v5.0.3+4
  [a51aa0fd] Xorg_libXi_jll v1.7.10+4
  [d1454406] Xorg_libXinerama_jll v1.1.4+4
  [ec84b674] Xorg_libXrandr_jll v1.5.2+4
  [ea2f1a96] Xorg_libXrender_jll v0.9.11+0
  [14d82f49] Xorg_libpthread_stubs_jll v0.1.1+0
  [c7cfdc94] Xorg_libxcb_jll v1.17.0+0
  [cc61e674] Xorg_libxkbfile_jll v1.1.2+0
  [e920d4aa] Xorg_xcb_util_cursor_jll v0.1.4+0
  [12413925] Xorg_xcb_util_image_jll v0.4.0+1
  [2def613f] Xorg_xcb_util_jll v0.4.0+1
  [975044d2] Xorg_xcb_util_keysyms_jll v0.4.0+1
  [0d47668e] Xorg_xcb_util_renderutil_jll v0.3.9+1
  [c22f9ab0] Xorg_xcb_util_wm_jll v0.4.1+1
  [35661453] Xorg_xkbcomp_jll v1.4.6+0
  [33bec58e] Xorg_xkeyboard_config_jll v2.39.0+0
  [c5fb5394] Xorg_xtrans_jll v1.5.0+0
  [3161d3a3] Zstd_jll v1.5.6+1
  [1e29f10c] demumble_jll v1.3.0+0
  [35ca27e7] eudev_jll v3.2.9+0
  [214eeab7] fzf_jll v0.53.0+0
  [1a1c6b14] gperf_jll v3.1.1+0
  [a4ae2306] libaom_jll v3.9.0+0
  [0ac62f75] libass_jll v0.15.2+0
  [1183f4f0] libdecor_jll v0.2.2+0
  [2db6ffa8] libevdev_jll v1.11.0+0
  [f638f0a6] libfdk_aac_jll v2.0.3+0
  [36db933b] libinput_jll v1.18.0+0
  [b53b4c65] libpng_jll v1.6.44+0
  [f27f6e37] libvorbis_jll v1.3.7+2
  [009596ad] mtdev_jll v1.1.6+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
⌅ [1270edf5] x264_jll v2021.5.5+0
⌅ [dfaa095f] x265_jll v3.5.0+0
  [d8fb68d0] xkbcommon_jll v1.4.1+1
  [0dad84c5] ArgTools v1.1.1
  [56f22d72] Artifacts
  [2a0f44e3] Base64
  [ade2ca70] Dates
  [8ba89e20] Distributed
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching
  [9fa8497b] Future
  [b77e0a4c] InteractiveUtils
  [4af54fe1] LazyArtifacts
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2
  [8f399da3] Libdl
  [37e2e46d] LinearAlgebra
  [56ddb016] Logging
  [d6f4376e] Markdown
  [a63ad114] Mmap
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.10.0
  [de0858da] Printf
  [9abbd945] Profile
  [3fa0cd96] REPL
  [9a3f8284] Random
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization
  [1a1011a3] SharedArrays
  [6462fe0b] Sockets
  [2f01184e] SparseArrays v1.10.0
  [10745b16] Statistics v1.10.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test
  [cf7118a7] UUIDs
  [4ec0a83e] Unicode
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] LibCURL_jll v8.4.0+0
  [e37daf67] LibGit2_jll v1.6.4+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.2+1
  [14a3606d] MozillaCACerts_jll v2023.1.10
  [4536629a] OpenBLAS_jll v0.3.23+4
  [05823500] OpenLibm_jll v0.8.1+2
  [efcefdf7] PCRE2_jll v10.42.0+1
  [bea87d4a] SuiteSparse_jll v7.2.1+1
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.11.0+0
  [8e850ede] nghttp2_jll v1.52.0+1
  [3f19e933] p7zip_jll v17.4.0+2

Output of versioninfo()

Julia Version 1.10.5
Commit 6f3fdf7b362 (2024-08-27 14:19 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 96 × Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, cascadelake)
Threads: 48 default, 0 interactive, 24 GC (on 96 virtual cores)

ChrisRackauckas commented 1 week ago

> Therefore, the solver shouldn't compute the full matrix exponential of the given matrix operator.

Did you set krylov=true?
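For reference, as far as the OrdinaryDiffEq documentation describes it, the krylov keyword of LinearExponential takes a Symbol (:off, :simple or :adaptive) rather than a Bool. The CPU-only sketch below runs all three modes on a small dense operator and compares them with the analytic solution. The 2×2 matrix and the variable names are assumptions for illustration, and the behaviour is assumed to match the OrdinaryDiffEq v6 series listed in this report.

```julia
using LinearAlgebra
using OrdinaryDiffEq, SciMLOperators

# Small dense test problem u' = A*u (illustrative, not the GPU case from the report).
A = [2.0 -1.0; -1.0 2.0]
u0 = ones(2)
prob = ODEProblem(MatrixOperator(A), u0, (0.0, 1.0))

# Analytic solution at t = 1 for comparison.
u_exact = exp(1.0 * A) * u0

# Relative error of each krylov mode against the analytic solution.
errs = Dict{Symbol, Float64}()
for k in (:off, :simple, :adaptive)
    sol = solve(prob, LinearExponential(krylov = k))
    errs[k] = norm(sol.u[end] - u_exact) / norm(u_exact)
    println("krylov = :", k, "  relative error = ", errs[k])
end
```

With krylov = :off the solver forms the matrix exponential itself, while the other two modes go through the Krylov routines of ExponentialUtilities.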

mcarmesin commented 1 week ago

solve(prob, LinearExponential(krylov=:simple)) fails with a different error:

ERROR: This object is not a GPU array
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] backend(::Type)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:225
  [3] backend(::Type{SubArray{Float32, 1, Matrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true}})
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:229
  [4] backend(x::SubArray{Float32, 1, Matrix{Float32}, Tuple{Base.Slice{Base.OneTo{Int64}}, Int64}, true})
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:226
  [5] _copyto!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:78 [inlined]
  [6] materialize!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:38 [inlined]
  [7] materialize!
    @ ./broadcast.jl:911 [inlined]
  [8] firststep!(Ks::KrylovSubspace{…}, V::SubArray{…}, H::SubArray{…}, b::CuArray{…})
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:171
  [9] lanczos!(Ks::KrylovSubspace{…}, A::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, b::CuArray{…}; tol::Float64, m::Int64, opnorm::Nothing, init::Int64, t::Float64, mu::Float64, l::Int64)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:331
 [10] lanczos!
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:318 [inlined]
 [11] arnoldi!(Ks::KrylovSubspace{…}, A::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, b::CuArray{…}; tol::Float64, m::Int64, ishermitian::Bool, opnorm::Function, iop::Int64, init::Int64, t::Float64, mu::Float64, l::Int64)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/arnoldi.jl:246
 [12] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{…}, cache::OrdinaryDiffEq.LinearExponentialCache{…}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:801
 [13] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:790 [inlined]
 [14] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{…})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:553
 [15] __solve(::ODEProblem{…}, ::LinearExponential; kwargs::@Kwargs{})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:7
 [16] __solve
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:1 [inlined]
 [17] #solve_call#44
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:612 [inlined]
 [18] solve_call
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:569 [inlined]
 [19] #solve_up#53
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1092 [inlined]
 [20] solve_up
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1078 [inlined]
 [21] solve(prob::ODEProblem{…}, args::LinearExponential; sensealg::Nothing, u0::Nothing, p::Nothing, wrap::Val{…}, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1015
 [22] solve(prob::ODEProblem{…}, args::LinearExponential)
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1005
 [23] top-level scope
    @ ~/CN/test_linear-exponential_solver.jl:12

whereas solve(prob, LinearExponential(krylov=:adaptive)) again yields the scalar indexing error:

ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50 [inlined]
  [6] copyto_unaliased!(deststyle::IndexLinear, dest::SubArray{…}, srcstyle::IndexLinear, src::CuArray{…})
    @ Base ./abstractarray.jl:1088
  [7] copyto!
    @ ./abstractarray.jl:1068 [inlined]
  [8] phiv_timestep!(U::CuArray{…}, ts::Vector{…}, A::CUDA.CUSPARSE.CuSparseMatrixCSC{…}, B::CuArray{…}; tau::Float64, m::Int64, tol::Float32, opnorm::typeof(LinearAlgebra.opnorm), iop::Int64, correct::Bool, caches::Tuple{…}, adaptive::Bool, delta::Float64, ishermitian::Bool, gamma::Float64, NA::Int64, verbose::Bool)
    @ ExponentialUtilities ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:170
  [9] #expv_timestep!#39
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:54 [inlined]
 [10] expv_timestep!
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:51 [inlined]
 [11] #expv_timestep!#38
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:48 [inlined]
 [12] expv_timestep!
    @ ~/.julia/packages/ExponentialUtilities/xLH9y/src/krylov_phiv_adaptive.jl:46 [inlined]
 [13] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{…}, cache::OrdinaryDiffEq.LinearExponentialCache{…}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:805
 [14] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/perform_step/linear_perform_step.jl:790 [inlined]
 [15] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{…})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:553
 [16] __solve(::ODEProblem{…}, ::LinearExponential; kwargs::@Kwargs{})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:7
 [17] __solve
    @ ~/.julia/packages/OrdinaryDiffEq/tAI61/src/solve.jl:1 [inlined]
 [18] #solve_call#44
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:612 [inlined]
 [19] solve_call
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:569 [inlined]
 [20] #solve_up#53
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1092 [inlined]
 [21] solve_up
    @ ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1078 [inlined]
 [22] solve(prob::ODEProblem{…}, args::LinearExponential; sensealg::Nothing, u0::Nothing, p::Nothing, wrap::Val{…}, kwargs::@Kwargs{})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1015
 [23] solve(prob::ODEProblem{…}, args::LinearExponential)
    @ DiffEqBase ~/.julia/packages/DiffEqBase/frOsk/src/solve.jl:1005
 [24] top-level scope
    @ ~/CN/test_linear-exponential_solver.jl:12
Some type information was truncated. Use `show(err)` to see complete types.

ChrisRackauckas commented 1 week ago

Okay, so it looks like the root of the issue is that phiv! needs better support for GPU arrays.

ChrisRackauckas commented 1 week ago

Can we narrow this down to something that involves only ExponentialUtilities.jl? There are some basic expv tests:

https://github.com/SciML/ExponentialUtilities.jl/blob/master/test/gpu/gputests.jl#L40-L57

so something must be missed by the tests.

mcarmesin commented 5 days ago

At least for krylov=:simple, the problem is that the V component of the KrylovSubspace is not constructed on the GPU. We have to change the function alg_cache(alg::LinearExponential, …) so that it constructs the KrylovSubspace correctly for GPU arrays; see my PR https://github.com/SciML/OrdinaryDiffEq.jl/pull/2538.
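To make the point about V concrete: in the krylov=:simple stack trace above, V is a SubArray of a plain Matrix{Float32} while b is a CuArray, so the broadcast in firststep! mixes host and device memory. One generic way to avoid such a mismatch is to allocate the basis with similar, so that it inherits the array type of the state vector. The helper below is hypothetical (its name and layout are assumptions, and it is not the code from the PR or from ExponentialUtilities); it only sketches the idea and is exercised on the CPU.

```julia
# Hypothetical helper, for illustration only: allocate Krylov storage whose
# basis V has the same array family as the vector b.
function krylov_storage_like(b::AbstractVector, m::Integer)
    T = eltype(b)
    n = length(b)
    # similar() follows the array type of b: a Vector gives a Matrix,
    # and a CuArray would give a CuArray.
    V = similar(b, T, (n, m + 1))
    fill!(V, zero(T))
    # In this sketch the small (m+1) x m Hessenberg matrix stays on the host.
    H = zeros(T, m + 1, m)
    return V, H
end

b = ones(Float32, 15)
V, H = krylov_storage_like(b, 10)
println(typeof(V), " ", size(V))
println(typeof(H), " ", size(H))
```

By contrast, a call such as Matrix{T}(undef, n, m + 1) always allocates on the host regardless of where b lives.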

SciML / OrdinaryDiffEq.jl

LinearExponential() solver broken for CUDA sparse matrices #2523