SciML / NonlinearSolve.jl

High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.
https://docs.sciml.ai/NonlinearSolve/stable/
MIT License
238 stars 42 forks source link

Add advanced tutorials #213

Closed ErikQQY closed 1 year ago

ErikQQY commented 1 year ago

What is the correct way of adding a preconditioner for linear solvers in NonlinearSolve.jl? I wrote this tutorial based on solving large stiff ODE in diffeq docs. I think the preconditioners should be the same since we have a common LinearSolve.jl interface, but the preconditioners failed when I passed them to solve.

MWE:

using NonlinearSolve, LinearAlgebra, SparseArrays, LinearSolve

const N = 32
const xyd_brusselator = range(0, stop = 1, length = N)
brusselator_f(x, y) = (((x - 0.3)^2 + (y - 0.6)^2) <= 0.1^2) * 5.0
limit(a, N) = a == N + 1 ? 1 : a == 0 ? N : a
function brusselator_2d_loop(du, u, p)
    A, B, alpha, dx = p
    alpha = alpha / dx^2
    @inbounds for I in CartesianIndices((N, N))
        i, j = Tuple(I)
        x, y = xyd_brusselator[I[1]], xyd_brusselator[I[2]]
        ip1, im1, jp1, jm1 = limit(i + 1, N), limit(i - 1, N), limit(j + 1, N),
        limit(j - 1, N)
        du[i, j, 1] = alpha * (u[im1, j, 1] + u[ip1, j, 1] + u[i, jp1, 1] + u[i, jm1, 1] -
                       4u[i, j, 1]) +
                      B + u[i, j, 1]^2 * u[i, j, 2] - (A + 1) * u[i, j, 1] +
                      brusselator_f(x, y)
        du[i, j, 2] = alpha * (u[im1, j, 2] + u[ip1, j, 2] + u[i, jp1, 2] + u[i, jm1, 2] -
                       4u[i, j, 2]) +
                      A * u[i, j, 1] - u[i, j, 1]^2 * u[i, j, 2]
    end
end
p = (3.4, 1.0, 10.0, step(xyd_brusselator))

function init_brusselator_2d(xyd)
    N = length(xyd)
    u = zeros(N, N, 2)
    for I in CartesianIndices((N, N))
        x = xyd[I[1]]
        y = xyd[I[2]]
        u[I, 1] = 22 * (y * (1 - y))^(3 / 2)
        u[I, 2] = 27 * (x * (1 - x))^(3 / 2)
    end
    u
end
u0 = init_brusselator_2d(xyd_brusselator)
prob_brusselator_2d = NonlinearProblem(brusselator_2d_loop, u0, p)
using Symbolics
du0 = copy(u0)
jac_sparsity = Symbolics.jacobian_sparsity((du, u) -> brusselator_2d_loop(du, u, p),
    du0, u0)
ff = NonlinearFunction(brusselator_2d_loop; jac_prototype = float.(jac_sparsity))
prob_brusselator_2d_sparse = NonlinearProblem(ff, u0, p)
using IncompleteLU
function incompletelu(W, du, u, p, t, newW, Plprev, Prprev, solverdata)
    if newW === nothing || newW
        Pl = ilu(convert(AbstractMatrix, W), τ = 50.0)
    else
        Pl = Plprev
    end
    Pl, nothing
end
solve(prob_brusselator_2d_sparse,
    NewtonRaphson(linsolve = KrylovJL_GMRES(), precs = incompletelu,
        concrete_jac = true));
codecov[bot] commented 1 year ago

Codecov Report

Merging #213 (d1910b4) into master (5c89c59) will increase coverage by 1.08%. Report is 2 commits behind head on master. The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #213      +/-   ##
==========================================
+ Coverage   19.86%   20.95%   +1.08%     
==========================================
  Files           8        8              
  Lines         735      735              
==========================================
+ Hits          146      154       +8     
+ Misses        589      581       -8     

see 2 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

avik-pal commented 1 year ago

@ErikQQY let's rebase this!

ChrisRackauckas commented 1 year ago

https://github.com/SciML/NonlinearSolve.jl/actions/runs/6422207412/job/17438236170?pr=213#step:5:29

Looks like concrete_jac is ignored.

avik-pal commented 1 year ago

Looks like concrete_jac is ignored.

I will look into this

avik-pal commented 1 year ago
using NonlinearSolve, LinearAlgebra, SparseArrays, LinearSolve, Symbolics, IncompleteLU

const N = 32
const xyd_brusselator = range(0, stop=1, length=N)

brusselator_f(x, y) = (((x - 0.3)^2 + (y - 0.6)^2) <= 0.1^2) * 5.0
limit(a, N) = ifelse(a == N + 1, 1, ifelse(a == 0, N, a))

function brusselator_2d_loop(du, u, p)
    A, B, alpha, dx = p
    alpha = alpha / dx^2
    @inbounds for I in CartesianIndices((N, N))
        i, j = Tuple(I)
        x, y = xyd_brusselator[I[1]], xyd_brusselator[I[2]]
        ip1, im1, jp1, jm1 = limit(i + 1, N), limit(i - 1, N), limit(j + 1, N),
        limit(j - 1, N)
        du[i, j, 1] = alpha * (u[im1, j, 1] + u[ip1, j, 1] + u[i, jp1, 1] + u[i, jm1, 1] -
                               4u[i, j, 1]) +
                      B + u[i, j, 1]^2 * u[i, j, 2] - (A + 1) * u[i, j, 1] +
                      brusselator_f(x, y)
        du[i, j, 2] = alpha * (u[im1, j, 2] + u[ip1, j, 2] + u[i, jp1, 2] + u[i, jm1, 2] -
                               4u[i, j, 2]) +
                      A * u[i, j, 1] - u[i, j, 1]^2 * u[i, j, 2]
    end
end

p = (3.4, 1.0, 10.0, step(xyd_brusselator))

function init_brusselator_2d(xyd)
    N = length(xyd)
    u = zeros(N, N, 2)
    for I in CartesianIndices((N, N))
        x = xyd[I[1]]
        y = xyd[I[2]]
        u[I, 1] = 22 * (y * (1 - y))^(3 / 2)
        u[I, 2] = 27 * (x * (1 - x))^(3 / 2)
    end
    u
end

u0 = init_brusselator_2d(xyd_brusselator)
prob_brusselator_2d = NonlinearProblem(brusselator_2d_loop, u0, p)

ff = NonlinearFunction(brusselator_2d_loop)

prob_brusselator_2d_sparse = NonlinearProblem(ff, u0, p)

# Default no sparsity
@benchmark solve($prob_brusselator_2d, $NewtonRaphson())
# BenchmarkTools.Trial: 22 samples with 1 evaluation.
#  Range (min … max):  217.598 ms … 293.832 ms  ┊ GC (min … max): 0.00% … 6.07%
#  Time  (median):     227.871 ms               ┊ GC (median):    0.00%
#  Time  (mean ± σ):   233.451 ms ±  17.872 ms  ┊ GC (mean ± σ):  1.34% ± 1.85%

#   █   ▃▃█  ▃
#   █▁▁▁███▇▇█▇▁▇▇▁▁▇▁▁▁▁▁▇▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
#   218 ms           Histogram: frequency by time          294 ms <

#  Memory estimate: 32.72 MiB, allocs estimate: 5190.

# Use symbolics for sparsity
@benchmark solve($prob_brusselator_2d_sparse,
    $NewtonRaphson(; autodiff=AutoSparseForwardDiff()))
# BenchmarkTools.Trial: 14 samples with 1 evaluation.
# Range (min … max):  321.055 ms … 407.954 ms  ┊ GC (min … max): 0.00% … 9.52%
# Time  (median):     377.523 ms               ┊ GC (median):    1.97%
# Time  (mean ± σ):   370.549 ms ±  28.095 ms  ┊ GC (mean ± σ):  4.71% ± 4.92%

#     ▁     ▁    █                    ▁   ▁  ▁▁  ▁  ▁▁      ▁▁    ▁
#     █▁▁▁▁▁█▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁█▁▁██▁▁█▁▁██▁▁▁▁▁▁██▁▁▁▁█ ▁
#     321 ms           Histogram: frequency by time          408 ms <

# Memory estimate: 118.91 MiB, allocs estimate: 1698721.

# Internally switch to non-sparse AD for JacVec
@benchmark solve($prob_brusselator_2d_sparse,
    $NewtonRaphson(; linsolve=KrylovJL_GMRES(), autodiff=AutoSparseForwardDiff()))
# BenchmarkTools.Trial: 206 samples with 1 evaluation.
# Range (min … max):  22.760 ms … 31.461 ms  ┊ GC (min … max): 0.00% … 20.72%
# Time  (median):     23.727 ms              ┊ GC (median):    0.00%
# Time  (mean ± σ):   24.290 ms ±  1.559 ms  ┊ GC (mean ± σ):  2.00% ±  4.52%

#     ▁     ▄█▇▂
#     █▇▇▁▇▇████▇▄▆▇▄▁▄▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▄▇███▆▁▁▁▁▁▁▁▁▁▁▁▁▄▁▁▁▄ ▆
#     22.8 ms      Histogram: log(frequency) by time        30 ms <

# Memory estimate: 7.58 MiB, allocs estimate: 8952.

# Using Preconditioner requiring concrete Jacobian
function incompletelu(W, du, u, p, t, newW, Plprev, Prprev, solverdata)
    if newW === nothing || newW
        Pl = ilu(W, τ=50.0)
    else
        Pl = Plprev
    end
    return Pl, nothing
end

@benchmark solve($prob_brusselator_2d_sparse,
    $NewtonRaphson(linsolve=KrylovJL_GMRES(), precs=incompletelu, concrete_jac=true,
        autodiff=AutoSparseForwardDiff()))

# BenchmarkTools.Trial: 17 samples with 1 evaluation.
# Range (min … max):  256.001 ms … 365.592 ms  ┊ GC (min … max): 0.00% … 22.01%
# Time  (median):     303.444 ms               ┊ GC (median):    0.00%
# Time  (mean ± σ):   303.785 ms ±  38.993 ms  ┊ GC (mean ± σ):  7.23% ±  8.32%

#     ▁  █▁ ▁  ▁ ▁     ▁        █  ▁            ▁▁       ▁    ▁ ▁ ▁
#     █▁▁██▁█▁▁█▁█▁▁▁▁▁█▁▁▁▁▁▁▁▁█▁▁█▁▁▁▁▁▁▁▁▁▁▁▁██▁▁▁▁▁▁▁█▁▁▁▁█▁█▁█ ▁
#     256 ms           Histogram: frequency by time          366 ms <

# Memory estimate: 123.53 MiB, allocs estimate: 1707967.

Needs #229

avik-pal commented 1 year ago

Profiling the results, it seems all the time is spent in computing the sparsity pattern in Symbolics

ChrisRackauckas commented 1 year ago

Well then for now we should show how to compute the sparsity pattern ahead of time as an optimization.

avik-pal commented 1 year ago

# How much time is spent in the sparsity detection?
du0 = copy(u0)
jac_sparsity = Symbolics.jacobian_sparsity((du, u) -> brusselator_2d_loop(du, u, p),
    du0, u0)
ff = NonlinearFunction(brusselator_2d_loop; jac_prototype=float.(jac_sparsity))
prob_brusselator_2d_sparse = NonlinearProblem(ff, u0, p)

@benchmark solve($prob_brusselator_2d_sparse,
    $NewtonRaphson(linsolve=KrylovJL_GMRES(), precs=incompletelu, concrete_jac=true,
        autodiff=AutoSparseForwardDiff()))
# BenchmarkTools.Trial: 273 samples with 1 evaluation.
# Range (min … max):  16.504 ms … 33.019 ms  ┊ GC (min … max): 0.00% … 46.31%
# Time  (median):     17.419 ms              ┊ GC (median):    0.00%
# Time  (mean ± σ):   18.309 ms ±  2.084 ms  ┊ GC (mean ± σ):  3.99% ±  7.24%

#     ▂ █▁▃▁                                                      
#     ▇██████▇▅▃▂▁▃▂▂▃▂▃▃▅▃▃▂▅▅▄▄▄▃▄▃▂▃▂▁▁▁▁▁▁▂▁▁▂▁▁▁▁▁▂▁▂▁▁▁▂▁▁▂ ▃
#     16.5 ms         Histogram: frequency by time        25.8 ms <

# Memory estimate: 20.87 MiB, allocs estimate: 31283.

colorvec = ArrayInterface.matrix_colors(jac_sparsity)

ff = NonlinearFunction(brusselator_2d_loop; jac_prototype=float.(jac_sparsity), colorvec)
prob_brusselator_2d_sparse = NonlinearProblem(ff, u0, p)

@benchmark solve($prob_brusselator_2d_sparse,
    $NewtonRaphson(linsolve=KrylovJL_GMRES(), precs=incompletelu, concrete_jac=true,
        autodiff=AutoSparseForwardDiff()))
# BenchmarkTools.Trial: 323 samples with 1 evaluation.
# Range (min … max):  13.083 ms … 33.518 ms  ┊ GC (min … max): 0.00% … 55.24%
# Time  (median):     15.584 ms              ┊ GC (median):    0.00%
# Time  (mean ± σ):   15.488 ms ±  2.421 ms  ┊ GC (mean ± σ):  4.03% ±  9.14%

#     ▃▄▂       █▅█▁                                               
#     ███▇▇▄▃▃▃▇████▄▄▄▃▃▃▂▃▁▁▁▁▁▁▁▂▁▁▁▂▂▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▃▁▁▂▂▁▂ ▃
#     13.1 ms         Histogram: frequency by time        27.1 ms <

# Memory estimate: 16.09 MiB, allocs estimate: 559.
ChrisRackauckas commented 1 year ago

Yes so we should document that in the performance tutorial until it's better