Open MichelJuillard opened 7 months ago
Appreciate the support. Will it support using `MKL.jl` and `Accelerate.jl`?
I will test for MKL. I don't have the hardware to test for Accelerate. @RoyiAvital could you do that?
I will. Let me know when it is ready.
@RoyiAvital Sorry, I got confused with the actual syntax.

1. `factorize!()` returns a tuple with the `sytrf!()` output, not the factorization returned by `BunchKaufman()`.
2. It is therefore still necessary to call `BunchKaufman()` after `factorize!()`.
3. `factorize!(ws, Symmetric(x))` allocates more than `factorize!(ws, 'U', x)`.
4. Example using `factorize!()` or the lower-level direct call to `LAPACK.sytrf!()`:
using FastLapackInterface
using LinearAlgebra

# Approach 1: factorize!() with a preallocated workspace
function loop_1!(vXs, mCs, ws)
    for mC in mCs
        F1 = factorize!(ws, 'U', mC)
        F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))
        # solving linear systems
        for vX in vXs
            ldiv!(F, vX)
        end
    end
end

function approach1(mCs, vXs)
    ws = Workspace(LAPACK.sytrf!, mCs[1])
    # warm-up run so that the @time call below excludes compilation
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    loop_1!(vXs_1, mCs_1, ws)
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    @time loop_1!(vXs_1, mCs_1, ws)
end

# Approach 2: direct call to LAPACK.sytrf!() with a BunchKaufmanWs workspace
function loop_2!(vXs, mCs, ws)
    for mC in mCs
        A, ipiv, info = LAPACK.sytrf!(ws, 'U', mC)
        F = BunchKaufman(mC, ipiv, 'U', true, false, BLAS.BlasInt(0))
        # solving linear systems
        for vX in vXs
            ldiv!(F, vX)
        end
    end
end

function approach2(mCs, vXs)
    ws = BunchKaufmanWs(mCs[1])
    # warm-up run so that the @time call below excludes compilation
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    loop_2!(vXs_1, mCs_1, ws)
    mCs_1 = deepcopy(mCs)
    vXs_1 = deepcopy(vXs)
    @time loop_2!(vXs_1, mCs_1, ws)
end

order = 100
iterations = 10

mCs = []
vXs = []
for i = 1:iterations
    x = randn(order, order)
    mC = hermitianpart!(x).data  # symmetric test matrix
    push!(mCs, mC)
    push!(vXs, randn(order))
end

approach1(mCs, vXs)
approach2(mCs, vXs)
5. It works with MKL (but seems slower than OpenBLAS). Could you please try it with Accelerate?
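The point about `sytrf!()` returning a tuple that still has to be wrapped in `BunchKaufman()` can be illustrated with the standard library alone. The following is a minimal sketch, not the FastLapackInterface code path: it uses the stdlib method `LAPACK.sytrf!(uplo, A)` (which allocates its own pivot and work arrays) instead of the workspace-based methods discussed above, and the 3×3 matrix and right-hand side are arbitrary illustrative values.

```julia
using LinearAlgebra

# Small symmetric, indefinite, nonsingular test matrix (arbitrary values).
A = [4.0  1.0  2.0;
     1.0 -3.0  0.5;
     2.0  0.5  2.0]
b = [1.0, 2.0, 3.0]

# The stdlib LAPACK wrapper overwrites its argument and returns a plain tuple,
# not a factorization object.
LD, ipiv, info = LAPACK.sytrf!('U', copy(A))

# Wrap the raw output so that ldiv! can use it.
# Arguments: factors, pivots, uplo, symmetric, rook, info.
F = BunchKaufman(LD, ipiv, 'U', true, false, info)

# Solve A * x = b in place on a copy of b.
x = ldiv!(F, copy(b))

# Cross-check against the high-level API.
x_ref = bunchkaufman(Symmetric(A, :U)) \ b
```

The manual `BunchKaufman(...)` call mirrors the one in the loops above; only the source of the pivots differs.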
Do these lines allocate?
F1 = factorize!(ws, 'U', mC)
F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))
ldiv!(F, vX)
If not, this is perfect.
I will test on `Accelerate.jl` and report, no problem.
It still allocates, for a reason that I don't understand, but very little. The amount doesn't depend on the size of the matrix.
I assume `F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))` is the allocating line, right?
`F1 = factorize!(ws, 'U', mC)` allocates 64 bytes per iteration.
`F = BunchKaufman(mC, F1[2], 'U', true, false, BLAS.BlasInt(0))` allocates 48 bytes per iteration.
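For reference, one way per-step allocation figures like these could be obtained is with `@allocated` after a warm-up call. The sketch below is an assumption about methodology, not the measurement actually used in this thread, and it does not reproduce the 64 and 48 byte figures: it uses stdlib stand-ins rather than the FastLapackInterface workspace methods, and `allocated_after_warmup` and `demo` are hypothetical helper names introduced only for illustration.

```julia
using LinearAlgebra

# Hypothetical helper (not part of any package): call `f` once so that
# compilation is excluded, then report the bytes allocated by a second call.
function allocated_after_warmup(f)
    f()
    return @allocated f()
end

# Everything runs inside a function so that the closures capture local
# variables rather than untyped globals.
function demo()
    A = [4.0 1.0 2.0; 1.0 -3.0 0.5; 2.0 0.5 2.0]
    b = [1.0, 2.0, 3.0]

    LD, ipiv, info = LAPACK.sytrf!('U', copy(A))
    F = BunchKaufman(LD, ipiv, 'U', true, false, info)
    x = copy(b)

    # Stdlib sytrf! allocates pivots and work arrays; the copy of A is counted too.
    bytes_factor = allocated_after_warmup(() -> LAPACK.sytrf!('U', copy(A)))
    # Construction of the wrapper object only.
    bytes_wrap = allocated_after_warmup(() -> BunchKaufman(LD, ipiv, 'U', true, false, info))
    # In-place solve; x is overwritten on every call, which is fine for counting bytes.
    bytes_solve = allocated_after_warmup(() -> ldiv!(F, x))

    return (factor = bytes_factor, wrap = bytes_wrap, solve = bytes_solve)
end

result = demo()
println(result)
```

Measuring each step in its own closure keeps the contributions separate. The exact numbers depend on the Julia version and on how the surrounding code is typed.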
`factorize()` method is missing for BunchKaufman (see https://discourse.julialang.org/t/ann-fastlapackinterface-jl-v1-0-0-non-allocating-lapack-factorizations/83354/10?u=micheljuillard)