JuliaSmoothOptimizers / Krylov.jl

A Julia Basket of Hand-Picked Krylov Methods

`block_gmres` fails with `LinearOperator` #854

Closed · gdalle closed this issue 3 months ago

gdalle commented 4 months ago

I think this is a LinearOperators.jl issue, right?

julia> using Krylov, LinearOperators

julia> A, B = rand(2, 2), rand(2, 2);

julia> block_gmres(A, B)
([0.041370389388870915 0.25094348322476523; 0.10937029411860183 0.31680627052050375], SimpleStats
 niter: 1
 solved: true
 inconsistent: false
 residuals: []
 Aresiduals: []
 κ₂(A): []
 timer: 160.56μs
 status: solution good enough given atol and rtol
)

julia> block_gmres(LinearOperator(A), B)
ERROR: MethodError: no method matching mul!(::Matrix{…}, ::LinearOperator{…}, ::Matrix{…}, ::Bool, ::Bool)

Closest candidates are:
  mul!(::AbstractMatrix, ::LinearAlgebra.AbstractTriangular, ::AbstractMatrix, ::Number, ::Number)
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/triangular.jl:736
  mul!(::AbstractMatrix, ::Union{LinearAlgebra.Bidiagonal, LinearAlgebra.Diagonal, LinearAlgebra.SymTridiagonal, LinearAlgebra.Tridiagonal}, ::AbstractMatrix, ::Number, ::Number)
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/bidiag.jl:427
  mul!(::AbstractVecOrMat, ::LinearAlgebra.UniformScaling, ::AbstractVecOrMat, ::Number, ::Number)
   @ LinearAlgebra ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/uniformscaling.jl:284
  ...

Stacktrace:
 [1] mul!
   @ ~/.julia/juliaup/julia-1.10.2+0.x64.linux.gnu/share/julia/stdlib/v1.10/LinearAlgebra/src/matmul.jl:237 [inlined]
 [2] block_gmres!(solver::BlockGmresSolver{…}, A::LinearOperator{…}, B::Matrix{…}; M::LinearAlgebra.UniformScaling{…}, N::LinearAlgebra.UniformScaling{…}, ldiv::Bool, restart::Bool, reorthogonalization::Bool, atol::Float64, rtol::Float64, itmax::Int64, timemax::Float64, verbose::Int64, history::Bool, callback::Krylov.var"#29#36", iostream::Core.CoreSTDOUT)
   @ Krylov ~/.julia/packages/Krylov/pv2NF/src/block_gmres.jl:242
 [3] block_gmres!
   @ ~/.julia/packages/Krylov/pv2NF/src/block_gmres.jl:115 [inlined]
 [4] block_gmres(A::LinearOperator{…}, B::Matrix{…}; memory::Int64, M::LinearAlgebra.UniformScaling{…}, N::LinearAlgebra.UniformScaling{…}, ldiv::Bool, restart::Bool, reorthogonalization::Bool, atol::Float64, rtol::Float64, itmax::Int64, timemax::Float64, verbose::Int64, history::Bool, callback::Krylov.var"#29#36", iostream::Core.CoreSTDOUT)
   @ Krylov ~/.julia/packages/Krylov/pv2NF/src/block_gmres.jl:110
 [5] block_gmres(A::LinearOperator{…}, B::Matrix{…})
   @ Krylov ~/.julia/packages/Krylov/pv2NF/src/block_gmres.jl:105
 [6] top-level scope
   @ REPL[7]:1
Some type information was truncated. Use `show(err)` to see complete types.

gdalle commented 4 months ago

It probably slipped through the cracks because block_gmres is not tested, see #842

amontoison commented 4 months ago

@gdalle We never defined linear operator * matrix products in LinearOperators.jl. I will have a look 🙂

gdalle commented 4 months ago

Thanks! Indeed, I had traced it back to https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/issues/322, but I didn't know whether this was a design decision or an oversight.

gdalle commented 4 months ago

Hey @amontoison, gentle ping because I was curious whether you had been able to investigate. I thought it would just require removing some type annotations in the 3-arg and 5-arg mul! of LinearOperators.jl, but there are more involved changes around the storage_type that I am not confident enough to suggest.

amontoison commented 4 months ago

Hi @gdalle! Sorry, I didn't find time to investigate. Is it possible to implement your own operator temporarily? You only need to define mul!, size, and eltype.

I have an example here: https://github.com/JuliaSmoothOptimizers/KrylovPreconditioners.jl/blob/main/ext/CUDA/operators.jl#L1
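For reference, a minimal sketch of such a hand-rolled operator (editorial example, not taken from the thread; the name `MatrixOperator` is hypothetical). It defines only the interface mentioned above, including the 5-arg matrix `mul!` that the stack trace shows is missing:

using LinearAlgebra

# Hypothetical wrapper: any struct works as long as mul!, size, and eltype are defined.
struct MatrixOperator{T, M <: AbstractMatrix{T}}
    A::M
end

Base.size(op::MatrixOperator) = size(op.A)
Base.eltype(op::MatrixOperator{T}) where {T} = T

# 3-arg product for vector right-hand sides.
LinearAlgebra.mul!(y::AbstractVector, op::MatrixOperator, x::AbstractVector) = mul!(y, op.A, x)

# 5-arg product for matrix right-hand sides; this is the method block_gmres calls.
LinearAlgebra.mul!(Y::AbstractMatrix, op::MatrixOperator, X::AbstractMatrix, α::Number, β::Number) =
    mul!(Y, op.A, X, α, β)

In principle, block_gmres(MatrixOperator(A), B) would then dispatch to these methods instead of the missing LinearOperators.jl one.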

dpo commented 4 months ago

> @gdalle We never defined linear operator * matrix products in LinearOperators.jl. I will have a look 🙂

We did define * here: https://github.com/JuliaSmoothOptimizers/LinearOperators.jl/blob/main/src/operations.jl#L132. By convention, we (or I) decided long ago that the result should be an operator. However, nothing prevents us from generalizing

mul!(res::AbstractVector, op::AbstractLinearOperator{T}, v::AbstractVector, α, β)

to matrices.

amontoison commented 4 months ago

In Krylov.jl, we only rely on mul!, so we need a constructor that takes as input a function representing:

mul!(res::AbstractMatrix, op::AbstractLinearOperator{T}, v::AbstractMatrix, α, β)

An issue is that, unlike vectors, matrices can have different numbers of columns, and an operator can be dedicated to only certain matrix sizes. For example, in KrylovPreconditioners.jl, the operators that perform triangular solves (on GPU) only work for a specific number of columns because we preallocate the internal CUSPARSE / rocSPARSE buffer for the associated number of right-hand sides. However, a check on the size of the matrices can be done directly in the function provided by the user.
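A hedged sketch of what such a check could look like for an operator preallocated for a fixed number of right-hand sides (all names are invented, and the plain-Julia buffer stands in for the GPU workspace described above):

using LinearAlgebra

struct FixedBlockOperator{T, M <: AbstractMatrix{T}}
    A::M
    nrhs::Int            # block size the internal workspace was allocated for
    buffer::Matrix{T}    # stand-in for a preallocated CUSPARSE / rocSPARSE buffer
end

FixedBlockOperator(A::AbstractMatrix{T}, nrhs::Integer) where {T} =
    FixedBlockOperator(A, Int(nrhs), Matrix{T}(undef, size(A, 1), nrhs))

Base.size(op::FixedBlockOperator) = size(op.A)
Base.eltype(op::FixedBlockOperator{T}) where {T} = T

function LinearAlgebra.mul!(Y::AbstractMatrix, op::FixedBlockOperator, X::AbstractMatrix,
                            α::Number, β::Number)
    # reject block sizes the operator was not built for
    size(X, 2) == op.nrhs ||
        throw(DimensionMismatch("operator preallocated for $(op.nrhs) right-hand sides, got $(size(X, 2))"))
    mul!(op.buffer, op.A, X)          # A * X into the workspace
    Y .= α .* op.buffer .+ β .* Y     # 5-arg mul! semantics: Y = α*A*X + β*Y
    return Y
end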

Guillaume, can you give more details on your linear operator?

gdalle commented 4 months ago

Sorry for not answering, and thank you for taking an interest ☺️ My linear operator is the JVP/VJP of a vector-to-vector function. It is naturally defined on a single vector, but with AD packages like ForwardDiff and Enzyme we can also apply "batched" versions, which amount to multiplying the Jacobian by several vectors at once, i.e. a matrix. I can definitely roll my own mul!, but won't I lose other things by not using an official LinearOperator? More generally, how useful is block_gmres if it only applies to actual non-lazy matrices?
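For concreteness, a hypothetical sketch of such a lazy Jacobian operator (not gdalle's actual code): it exposes the mul!/size/eltype interface and applies a ForwardDiff-based JVP column by column rather than in a truly batched fashion.

using ForwardDiff, LinearAlgebra

struct JacobianOperator{F, V <: AbstractVector}
    f::F     # vector-to-vector function
    x::V     # point at which the Jacobian is taken
    m::Int   # output dimension, cached so size(op) does not re-evaluate f
end

JacobianOperator(f, x::AbstractVector) = JacobianOperator(f, x, length(f(x)))

Base.size(op::JacobianOperator) = (op.m, length(op.x))
Base.eltype(op::JacobianOperator) = eltype(op.x)

# single JVP via the directional-derivative trick: d/dt f(x + t*v) at t = 0 equals J(x) * v
jvp(op::JacobianOperator, v) =
    ForwardDiff.derivative(t -> op.f(op.x .+ t .* v), zero(eltype(op.x)))

function LinearAlgebra.mul!(Y::AbstractMatrix, op::JacobianOperator, X::AbstractMatrix,
                            α::Number, β::Number)
    for j in axes(X, 2)
        Y[:, j] .= α .* jvp(op, view(X, :, j)) .+ β .* Y[:, j]
    end
    return Y
end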

gdalle commented 3 months ago

I opened a PR in LinearOperators.jl to start working on a fix; I would love to get some help @amontoison @dpo!

gdalle commented 3 months ago

Now that the LinearOperators PR has been merged, this seems to work:

julia> using Krylov, LinearOperators

julia> A, B = rand(2, 2), rand(2, 2);

julia> block_gmres(A, B)
([0.1341553080088699 0.48826584325663686; 0.9880393854740158 0.3756128570051308], SimpleStats
 niter: 1
 solved: true
 inconsistent: false
 residuals: []
 Aresiduals: []
 κ₂(A): []
 timer: 12.10ms
 status: solution good enough given atol and rtol
)

julia> block_gmres(LinearOperator(A), B)
([0.1341553080088699 0.48826584325663686; 0.9880393854740158 0.3756128570051308], SimpleStats
 niter: 1
 solved: true
 inconsistent: false
 residuals: []
 Aresiduals: []
 κ₂(A): []
 timer: 498.99ms
 status: solution good enough given atol and rtol
)

Closing in favor of more specific issues in the future

amontoison commented 3 months ago

Did you test it with a Krylov method that requires A', like lsmr?

gdalle commented 3 months ago

I thought block_gmres was the only one of its kind? https://jso.dev/Krylov.jl/stable/block_krylov/

amontoison commented 3 months ago

Oh, you're right 🥴 I forgot that I haven't implemented a block method that requires A' yet...

gdalle commented 3 months ago

Well, get cracking!