JuliaStats / PDMats.jl

Uniform Interface for positive definite matrices of various structures
Other
104 stars 43 forks source link

Simplify `cholesky` for `ScalMat` and `PDiagMat` #182

Closed devmotion closed 11 months ago

devmotion commented 11 months ago

We don't have to go through cholesky(::Diagonal) in LinearAlgebra (e.g. https://github.com/JuliaLang/julia/blob/28d9f730c1927297fc0cfc415c842968aa1a3a71/stdlib/LinearAlgebra/src/diagonal.jl#L868) when computing cholesky(::ScalMat) or cholesky(::PDiagMat) but we can construct the Cholesky object directly without any intermediate steps.

This improves performance significantly:

master

julia> using PDMats, BenchmarkTools, LinearAlgebra

julia> A = ScalMat(10, 3.2);

julia> @btime cholesky($A);
  47.733 ns (2 allocations: 288 bytes)

julia> A = ScalMat(100, 3.2);

julia> @btime cholesky($A);
  160.778 ns (2 allocations: 1.75 KiB)

julia> A = ScalMat(1_000, 3.2);

julia> @btime cholesky($A);
  1.458 μs (2 allocations: 15.88 KiB)

julia> A = PDiagMat(fill(3.2, 10));

julia> @btime cholesky($A);
  27.261 ns (1 allocation: 144 bytes)

julia> A = PDiagMat(fill(3.2, 100));

julia> @btime cholesky($A);
  116.997 ns (1 allocation: 896 bytes)

julia> A = PDiagMat(fill(3.2, 1_000));

julia> @btime cholesky($A);
  1.250 μs (1 allocation: 7.94 KiB)

Note also that cholesky(::PDiagMat) is faster than cholesky(::ScalMat) due to the reduced number of allocations - the only difference between both code paths is that for the ScalMat the diagonal fill(a.value, a.dim) is only constructed inside of the cholesky call whereas it already exists for the PDiagMat.

This PR

julia> using PDMats, BenchmarkTools, LinearAlgebra

julia> A = ScalMat(10, 3.2);

julia> @btime cholesky($A);
  16.450 ns (1 allocation: 144 bytes)

julia> A = ScalMat(100, 3.2);

julia> @btime cholesky($A);
  29.816 ns (1 allocation: 896 bytes)

julia> A = ScalMat(1_000, 3.2);

julia> @btime cholesky($A);
  164.808 ns (1 allocation: 7.94 KiB)

julia> A = PDiagMat(fill(3.2, 10));

julia> @btime cholesky($A);
  20.227 ns (1 allocation: 144 bytes)

julia> A = PDiagMat(fill(3.2, 100));

julia> @btime cholesky($A);
  74.040 ns (1 allocation: 896 bytes)

julia> A = PDiagMat(fill(3.2, 1_000));

julia> @btime cholesky($A);
  691.478 ns (1 allocation: 7.94 KiB)

Note that with this change, as expected, cholesky(::ScalMat) will be faster than the corresponding cholesky(::PDiagMat) since it only requires a single sqrt computation.

chriselrod commented 11 months ago

Can you use FillArrays.jl to avoid the allocations? That is, return a Cholesky(Diagonal(Fill(sqrt(sm.value), sm.dim)), :U, 0)?

cholesky(::ScalMat) should really be O(1) in runtime cost.

devmotion commented 11 months ago

Since PDMats is used by so mamy packages, I wanted to avoid introducing new dependencies and thereby increasing loading and compilation times. But maybe we should add FillArrays at least - I'm already convinced that it should definitely become a weak dependency and I'm actually preparing a PR right now 😅

chriselrod commented 11 months ago

I think the compilation time increase would be fairly negligible, other than load times.

> julia -e '@time using FillArrays'
  0.084586 seconds (114.33 k allocations: 7.589 MiB, 4.38% compilation time)
> julia -e '@time using PDMats'
  0.023690 seconds (34.43 k allocations: 2.185 MiB)
> julia --startup=no -e '@time using PDMats, FillArrays'
  0.100786 seconds (132.25 k allocations: 8.625 MiB, 3.80% compilation time)

FillArrays.jl only has stdlib dependencies, so it's a bit surprising that its load time is approaching 4x that of PDMats' on my computer.

This might be related to the fact that using PDMats has 0 invalidations, while using FillArrays results in many.

devmotion commented 11 months ago

I couldn't find an open issue regarding the invalidations of FillArrays, maybe it would be worthwhile to open an issue and try to reduce them? Generally, I would feel much better about adding a FillArrays dependency if it would at most double the loading time.

devmotion commented 11 months ago

Based on the timings in https://github.com/JuliaStats/PDMats.jl/pull/182/#issuecomment-1736061820, I think we shouldn't switch to FillArrays in this PR. The PR is already an improvement over the status quo, and we could track the remaining possible optimizations and discussion about FillArrays in a separate issue. What do you think @chriselrod?

codecov-commenter commented 11 months ago

Codecov Report

All modified lines are covered by tests :white_check_mark:

Files Coverage Δ
src/pdiagmat.jl 97.39% <100.00%> (ø)
src/scalmat.jl 97.33% <100.00%> (ø)

:loudspeaker: Thoughts on this report? Let us know!.