Closed sethaxen closed 3 years ago
As an example, consider the following two implementations of `invquad`:
# what PDMats currently uses
invquad(a::PDMat, x::AbstractVector) = sum(abs2, a.chol.L \ x)
# what I propose
function invquad(a::PDMat, x::AbstractVector)
    chol = a.chol
    return sum(abs2, (chol.uplo === 'L' ? chol.L : transpose(chol.U)) \ x)
end
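Why both versions compute the same quantity can be sketched as follows, assuming a real positive definite $A$ with lower Cholesky factor $L$ and upper factor $U$ (so $L = U^\top$):

```latex
A = L L^\top
\quad\Longrightarrow\quad
x^\top A^{-1} x
  = x^\top L^{-\top} L^{-1} x
  = \lVert L^{-1} x \rVert_2^2
  = \lVert (U^\top)^{-1} x \rVert_2^2 .
```

So solving against `chol.L` or against `transpose(chol.U)` gives the same result; the only difference is whether the factor has to be copied first.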
Now let's benchmark for a 100x100 matrix:
julia> using PDMats, LinearAlgebra, BenchmarkTools
julia> A = PDMat(exp(Symmetric(randn(100, 100))));
julia> x = randn(100);
julia> @btime invquad($A, $x)
The results are:
17.435 μs (4 allocations: 79.09 KiB) # current version
2.441 μs (3 allocations: 928 bytes) # new version
The methods specialized on `PDMat` frequently access the `U` and `L` properties of the stored Cholesky factor `chol`. `Cholesky` has a `uplo` field that determines whether it stores the upper or lower Cholesky factor. If `chol.uplo == 'U'`, then `chol.U` gives an `UpperTriangular` view of the stored factor, while `chol.L` allocates a transposed copy of this matrix. The opposite is true when `chol.uplo == 'L'`: `chol.L` is a view and `chol.U` allocates a copy. To avoid unnecessary copies, the `PDMat` methods should check `uplo` to decide whether to call, e.g., `chol.U` or `chol.L'`.
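One way to centralize the `uplo` check is a pair of small helpers. The sketch below is only illustrative: the names `lower_factor` and `upper_factor` are hypothetical and are not part of PDMats or LinearAlgebra. It uses only the standard library and checks that the result does not depend on which triangle the factorization stores.

```julia
using LinearAlgebra

# Hypothetical helpers (illustrative names, not an existing API).
# Each returns the requested factor without copying: either the stored
# triangle directly, or a lazy adjoint of the stored one.
lower_factor(c::Cholesky) = c.uplo === 'L' ? c.L : c.U'
upper_factor(c::Cholesky) = c.uplo === 'U' ? c.U : c.L'

# Inverse quadratic form x' * inv(A) * x computed from the factorization.
invquad_sketch(c::Cholesky, x::AbstractVector) = sum(abs2, lower_factor(c) \ x)

B = randn(5, 5)
M = B * B' + 5I          # a well-conditioned positive definite matrix
x = randn(5)

for uplo in (:U, :L)
    # cholesky keeps the triangle selected by the Symmetric wrapper
    c = cholesky(Symmetric(M, uplo))
    @assert lower_factor(c) ≈ c.L
    @assert upper_factor(c) ≈ c.U
    @assert invquad_sketch(c, x) ≈ dot(x, M \ x)
end
```

Because the branch returns one of two concrete wrapper types, the return type is a small union; whether that matters in practice would need to be checked with benchmarks like the one above.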