JuliaSmoothOptimizers / Krylov.jl

A Julia Basket of Hand-Picked Krylov Methods

Question for GPU computation: lots of time on vector products #822

Closed yuwenchen95 closed 11 months ago

yuwenchen95 commented 11 months ago

When I try to solve a positive definite linear system using different methods,

using SparseArrays, LinearAlgebra
using Krylov
using CUDA
using StatProfilerHTML

n = 10000
density = 0.005
L = sprand(n, n, density)
A = L' * L + spdiagm(0 => rand(n))  # symmetric positive definite
b = rand(n)
Ag = CUSPARSE.CuSparseMatrixCSR(A)  # copy the matrix to the GPU
bg = CuVector(b)                    # copy the right-hand side to the GPU

msolver = MinresSolver(Ag, bg)
mqsolver = MinresQlpSolver(Ag, bg)
csolver = CgSolver(Ag, bg)
@profilehtml begin
    minres!(msolver, Ag, bg)
    minres_qlp!(mqsolver, Ag, bg)
    cg!(csolver, Ag, bg)
end

I found that a lot of the time is spent in dot operations (specifically in the computation of α in each solver), which is counterintuitive to me: the CPU version spends most of its time in mul!, and the dot-product time is negligible. Is this a genuine difference between running iterative methods on CPUs and GPUs, or is it potentially a bug?

amontoison commented 11 months ago

Dot products are slow on GPUs because they contain a reduction: even if we split the computation of the dot product across threads, at the end we must synchronize all threads / cores to sum the partial results.
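To illustrate the structure (this is just a CPU sketch, not Krylov.jl's or CUDA's actual implementation): the dot product is split into independent chunks, but the final sum cannot start until every partial result is available — that final step is the synchronization point that makes reductions expensive on GPUs.

```julia
using LinearAlgebra

# Sketch of a "split then reduce" dot product. Each chunk could run on
# its own thread or GPU block; the final `sum` is the reduction that
# forces all of them to synchronize.
function chunked_dot(x, y; nchunks = 4)
    n = length(x)
    partials = zeros(eltype(x), nchunks)
    for c in 1:nchunks                      # independent partial sums
        lo = div((c - 1) * n, nchunks) + 1
        hi = div(c * n, nchunks)
        s = zero(eltype(x))
        for i in lo:hi
            s += x[i] * y[i]
        end
        partials[c] = s
    end
    return sum(partials)                    # reduction: needs all partials
end

x = rand(1000); y = rand(1000)
chunked_dot(x, y) ≈ dot(x, y)
```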

mul! is the most expensive operation on the GPU only if the problem is very large (roughly n = 10000 for a dense matrix or n = 100000 for a sparse one).
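On the CPU side, the contrast is easy to reproduce: for a sparse matrix of this size, one mul! does far more floating-point work than one dot, so it dominates the profile. A quick sketch with Base's @elapsed (exact timings are machine-dependent):

```julia
using SparseArrays, LinearAlgebra

n = 10_000
A = sprand(n, n, 0.005) + spdiagm(0 => rand(n))  # ~5e5 nonzeros
x = rand(n)
y = similar(x)

mul!(y, A, x); dot(x, y)         # warm up (JIT compilation)
t_mul = @elapsed mul!(y, A, x)   # sparse matrix-vector product
t_dot = @elapsed dot(x, y)       # a single reduction over n entries
println("mul!: $t_mul s, dot: $t_dot s")
```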

Note that you can also test car and minares. They are new methods dedicated to symmetric (positive definite) systems.
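For reference, both can be tried on the CPU analogue of the system above (a sketch; it assumes a recent Krylov.jl release that exports car and minares, and it checks convergence through the returned stats):

```julia
using SparseArrays, LinearAlgebra, Krylov

n = 500
L = sprand(n, n, 0.005)
A = L' * L + spdiagm(0 => rand(n))   # symmetric positive definite
b = rand(n)

x_car, stats_car = car(A, b)         # CAR: SPD systems
x_ma,  stats_ma  = minares(A, b)     # MinAres: symmetric systems

println(stats_car.solved, " ", stats_ma.solved)
```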