SKopecz / PositiveIntegrators.jl

A Julia library of positivity-preserving time integration methods
https://skopecz.github.io/PositiveIntegrators.jl/
MIT License
13 stars 4 forks source link

Performance: Optimize linear solve interface #59

Closed ranocha closed 5 months ago

ranocha commented 7 months ago

Right now, MPE uses

https://github.com/SKopecz/PositiveIntegrators.jl/blob/b821ff4664f1d2ccf983aefe9082b4140bd61269/src/mprk.jl#L329-L330

while MPRK22 still uses a simple backslash, e.g.,

https://github.com/SKopecz/PositiveIntegrators.jl/blob/b821ff4664f1d2ccf983aefe9082b4140bd61269/src/mprk.jl#L538

https://github.com/SKopecz/PositiveIntegrators.jl/blob/b821ff4664f1d2ccf983aefe9082b4140bd61269/src/mprk.jl#L548

We should use a unified interface. When doing so, we should consider the following aspects:

For example, init can be quite slow (since it sets up a lot of unnecessary caches):

julia> using BenchmarkTools, LinearSolve, LinearAlgebra

julia> A, b = let n = 100
           Tridiagonal(rand(n - 1), 1 .+ rand(n), rand(n - 1)), rand(n)
       end;

julia> @benchmark \($A, $b)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.062 μs … 348.083 μs  ┊ GC (min … max): 0.00% … 97.92%
 Time  (median):     3.260 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.566 μs ±   5.281 μs  ┊ GC (mean ± σ):  3.57% ±  2.71%

julia> prob = LinearProblem(A, b)

julia> alg = LinearSolve.defaultalg(prob.A, prob.b, OperatorAssumptions(issquare(prob.A)))

julia> @benchmark \($A, $b)
BenchmarkTools.Trial: 10000 samples with 8 evaluations.
 Range (min … max):  3.057 μs … 185.594 μs  ┊ GC (min … max): 0.00% … 96.90%
 Time  (median):     3.281 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.548 μs ±   4.232 μs  ┊ GC (mean ± σ):  3.26% ±  2.72%

   ▂█▇▄▅▁                                                      
  ▂███████▅▄▃▃▃▃▃▃▃▃▃▄▃▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  3.06 μs         Histogram: frequency by time        5.36 μs <

 Memory estimate: 5.34 KiB, allocs estimate: 8.

julia> @benchmark init($prob, $alg)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  386.333 μs …   7.192 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     411.500 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   430.208 μs ± 139.476 μs  ┊ GC (mean ± σ):  1.86% ± 5.96%

  ▄▇█▅▂▁                                                        ▂
  ████████▆▅▅▅▅▄▅▅▄▅▅▆▆▅▅▅▅▅▃▄▆▆▅▅▆▆▅▄▅▅▃▄▅▅▃▅▅▃▅▅▃▄▃▄▃▃▅▃▄▄▄▁▄ █
  386 μs        Histogram: log(frequency) by time       1.05 ms <

 Memory estimate: 373.34 KiB, allocs estimate: 108.

julia> @benchmark solve!($(init(prob, alg)))
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.629 μs …   5.946 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.642 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.676 μs ± 218.897 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▄ ▁                                                        ▁
  ████▄▁▁▁▁▁▁▃▁▃▁▄▃▁▃▁▁▁▁▁▁▁▄▁▃▁▁▃▇▅▅▆▆▆▇▆▄▆▆▁▆▄▅▃▁▁▅▃▄▃▄▃▅▄▄ █
  1.63 μs      Histogram: log(frequency) by time      2.82 μs <

 Memory estimate: 48 bytes, allocs estimate: 1.
SKopecz commented 5 months ago

Is this still an issue @ranocha ?

I think we only need to decide if we want to use the function dolinsolve or another implementation.

julia> using BenchmarkTools, LinearSolve, LinearAlgebra

julia> A, b = let n = 100
                  Tridiagonal(rand(n - 1), 1 .+ rand(n), rand(n - 1)), rand(n)
              end;

julia> @benchmark \($A, $b)
BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.080 μs … 211.003 μs  ┊ GC (min … max): 0.00% … 97.21%
 Time  (median):     2.248 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.441 μs ±   2.962 μs  ┊ GC (mean ± σ):  2.37% ±  2.70%

    ▁▆█▇▅▃▂▁▁▂▂▁▁          ▁                             ▁    ▂
  ▆▇███████████████▆▆▇██████▇▇▇▅▄▁▄▆▅▇▇███▆▇▆▆▇▇█▇▇████▇▇██▇▇ █
  2.08 μs      Histogram: log(frequency) by time      4.08 μs <

 Memory estimate: 5.34 KiB, allocs estimate: 8.

julia> prob = LinearProblem(A, b);

julia> alg = LUFactorization()
LUFactorization{RowMaximum}(RowMaximum())

julia> @benchmark init($prob, $alg)
BenchmarkTools.Trial: 10000 samples with 162 evaluations.
 Range (min … max):  635.994 ns …   5.842 μs  ┊ GC (min … max): 0.00% … 58.72%
 Time  (median):     667.778 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   723.696 ns ± 347.267 ns  ┊ GC (mean ± σ):  6.31% ± 10.43%

  █▅▂▁                                                          ▁
  ██████▇▇▆▅▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▁▁▁▁▁▁▁▁▁▁▁▄▄▅▁▄▁▃▁▁▁▁▁▁▁▁▁▁▆██ █
  636 ns        Histogram: log(frequency) by time       3.15 μs <

 Memory estimate: 5.12 KiB, allocs estimate: 13.

julia> @benchmark solve!($(init(prob, alg)))
BenchmarkTools.Trial: 10000 samples with 21 evaluations.
 Range (min … max):  958.952 ns …  2.026 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     960.381 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   991.783 ns ± 98.342 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █       ▁       ▁  ▂   ▁    ▁   ▁    ▁    ▁          ▁       ▁
  █▃▇▇▅█████▄▃█▅▄▄█▄▁█▇▃▁█▅▃▁▁█▁▄▁█▄▁▁▁█▄▁▁▃█▁▁▁▁█▃▁▁▁▁█▁▃▁▁▁▇ █
  959 ns        Histogram: log(frequency) by time      1.46 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.
ranocha commented 5 months ago

Is this still an issue @ranocha ?

  • I cannot confirm that \ is currently faster for Tridiagonal matrices, see below.
  • The overhead of LinearSolve.jl is due to LinearSolve.defaultalg. Since we now explicitly set the default linear solver, this is no longer an issue.

Yeah, it's basically fixed with https://github.com/SKopecz/PositiveIntegrators.jl/pull/60.

I think we only need to decide if we want to use the function dolinsolve or another implementation.

I agree, that's the last remaining decision. dolinsolve sets the preconditioners (requiring us to add the ugly hacks for them since we don't use them right now) and does a bunch of additional checks that we don't need for PositiveIntegrators.jl. I'm fine either way, though. What's your take on this - shall we just close this issue and continue using dolinsolve?

SKopecz commented 5 months ago

OK, let's stick with dolinsolve for now.