JuliaNLSolvers / NLsolve.jl

Julia solvers for systems of nonlinear equations and mixed complementarity problems

Anderson instability example #273

Open niklasschmitz opened 3 years ago

niklasschmitz commented 3 years ago

The following example shows the Anderson method converging on a high-dimensional nonlinear problem, but failing to converge on its linearized adjoint problem (in the example, we solve for the derivative d obj(p) / d p via implicit differentiation). For reference, gmres is included.

```julia
using NLsolve
using SparseArrays
using LinearAlgebra
using Random
Random.seed!(1234)

const N = 10000
const nonlin = 0.1
const A = spdiagm(0 => fill(10.0, N), 1 => fill(-1.0, N-1), -1 => fill(-1.0, N-1))
const p0 = randn(N)
h(x, p) = A*x + nonlin*x.^2 - p
solve_x(p) = nlsolve(x -> h(x, p), zeros(N), method=:anderson, m=10, show_trace=true).zero
obj(p) = sum(solve_x(p))

obj(p0)
```

```julia
using IterativeSolvers
B = (A + Diagonal(2*nonlin*solve_x(p0)))'
y = ones(N)
g1, _ = gmres(B, y, log=true, verbose=true, abstol=1e-13, reltol=0)
g2 = nlsolve(v -> (B*v - y), zeros(N), method=:anderson, m=10, show_trace=true).zero

@show sum(abs, g1 - g2) / N
# N =   100: 8.880254906418195e-11
# N =  1000: 4.746847543507516e-10
# N = 10000: nlsolve error
```
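For context, this is the implicit-differentiation derivation that the adjoint solve above corresponds to (my reconstruction from the code, not stated explicitly in the original report; here ε denotes `nonlin` and 𝟙 a vector of ones):

```latex
% Implicit differentiation of h(x(p), p) = 0, with \varepsilon = nonlin:
h(x, p) = A x + \varepsilon\, x \odot x - p = 0
\quad\Longrightarrow\quad
J := \frac{\partial h}{\partial x} = A + 2\varepsilon \operatorname{Diag}(x),
\qquad
\frac{\partial h}{\partial p} = -I.

% Differentiating obj(p) = \mathbb{1}^\top x(p):
\frac{dx}{dp} = -J^{-1}\,\frac{\partial h}{\partial p} = J^{-1},
\qquad
\nabla_p \mathrm{obj} = \left(\mathbb{1}^\top J^{-1}\right)^{\!\top} = J^{-\top} \mathbb{1}.
```

So the adjoint system is Jᵀ g = 𝟙, which matches `B = (A + Diagonal(2*nonlin*solve_x(p0)))'` and `y = ones(N)` in the code; since A is symmetric here, B is too.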

For smaller problems, N=100 and N=1000, both methods agree, whereas at N=10000 only gmres converges. Here are the traces for gmres

```
=== gmres ===
rest    iter    resnorm
  1       1     3.11e-01
  1       2     4.06e-02
  1       3     4.35e-03
  1       4     4.48e-04
  1       5     4.52e-05
  1       6     4.58e-06
  1       7     4.61e-07
  1       8     4.67e-08
  1       9     4.79e-09
  1      10     4.87e-10
  1      11     4.91e-11
  1      12     4.99e-12
  1      13     5.08e-13
  1      14     5.37e-14
```

and nlsolve

```
Iter     f(x) inf-norm    Step 2-norm 
------   --------------   --------------
     1     1.000000e+00              NaN
     2     1.000012e+01     8.100692e+05
     3     1.375946e+00     1.130588e+01
     4     1.403715e-01     1.938613e-01
     5     1.432037e-02     2.241785e-03
     6     1.466135e-03     2.377179e-05
     7     1.456896e-04     2.428532e-07
     8     1.480540e-05     2.489175e-09
     9     1.483146e-06     2.519266e-11
    10     1.496069e-07     2.578480e-13
    11     2.751325e-06     1.441309e-10
    12     3.135931e-05     1.550156e-08
    13     3.751740e-04     2.281951e-06
    14     4.195186e-04     2.857294e-06
    15     4.204133e-04     2.870207e-06
    16     2.892968e-04     1.523972e-06
    17     3.773098e-04     2.306723e-06
    18     3.612664e-04     2.127565e-06
    19     8.344509e+03     2.021603e+09
    20     1.033198e+05     3.202527e+11
    21     1.278560e+06     5.117236e+13
    22     1.582129e+07     8.232996e+15
    23     1.582744e+07     8.239587e+15
# off to Infinity
ERROR: LoadError: During the resolution of the non-linear system, the evaluation of the following equation(s) resulted in a non-finite number: # all indices from 1..10000
```

Pkg status:

```
[42fd0dbc] IterativeSolvers v0.9.1
[2774e3e8] NLsolve v4.5.1
[90137ffa] StaticArrays v1.2.3
```

cc @antoine-levitt

antoine-levitt commented 3 years ago

(extracted from https://github.com/JuliaNLSolvers/NLsolve.jl/issues/205)

antoine-levitt commented 3 years ago

This very, very contrived example is well conditioned as a linear problem (symmetric matrix with eigenvalues continuously filling the interval [8, 12]). It is also a very bad and unrealistic example of Anderson usage, because the unaccelerated iteration is monotonically divergent (rather than oscillatingly divergent, which is the usual case of application of Anderson). In particular, switching the sign of the linear objective function in the final nlsolve fixes this issue. Still, it would be nice if Anderson could cope better with pathological cases (although that is possibly a hard problem).
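To illustrate the monotone-vs-oscillating distinction numerically: the unaccelerated iteration for f(v) = 0 (with unit damping) is v ← v + f(v). With f(v) = B*v − y the iteration matrix is I + B, whose spectrum lies in [9, 13] (monotone divergence), while the sign-flipped f(v) = −(B*v − y) gives I − B, with spectrum in [−11, −7] (oscillating divergence, the regime Anderson handles well). A small NumPy sketch of this spectral picture, using a random symmetric stand-in for B rather than the actual N = 10000 operator from the issue:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small symmetric stand-in for B with eigenvalues spread across [8, 12],
# mimicking the spectrum described above (NOT the issue's actual matrix).
n = 50
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
B = Q @ np.diag(np.linspace(8.0, 12.0, n)) @ Q.T

# Unaccelerated iteration v <- v + f(v):
#   f(v) =  B v - y   -> iteration matrix I + B, eigenvalues in [ 9, 13]
#   f(v) = -(B v - y) -> iteration matrix I - B, eigenvalues in [-11, -7]
ev_plus = np.linalg.eigvalsh(np.eye(n) + B)
ev_minus = np.linalg.eigvalsh(np.eye(n) - B)

print(ev_plus.min(), ev_plus.max())    # all > 1: errors grow with fixed sign
print(ev_minus.min(), ev_minus.max())  # all < -1: errors grow but alternate sign
```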

antoine-levitt commented 3 years ago

Looks like what's going on is that Anderson goes along smoothly as long as it trusts its own extrapolation. As the iterate gets closer to the true solution, regularization starts kicking in and makes Anderson trust the extrapolation less; it then degrades to the unaccelerated iteration, which diverges. This is a good test case for improving the regularization heuristics.
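For experimentation, here is a minimal, deliberately unregularized Anderson sketch in Python (my own toy implementation, not NLsolve's code): the `lstsq` call below is exactly the step a regularization heuristic would damp, and damping it pushes the update back toward the plain v ← v + f(v) iteration that diverges in this example.

```python
import numpy as np

def anderson(f, x0, m=5, iters=100, tol=1e-8):
    """Toy Anderson acceleration for f(x) = 0, accelerating the
    fixed-point map g(x) = x + f(x). No regularization or safeguards."""
    x = np.asarray(x0, dtype=float).copy()
    X, F = [], []                       # histories of iterates and residuals
    for k in range(iters):
        fx = f(x)
        if np.linalg.norm(fx, np.inf) < tol:
            return x, k                 # converged after k residual evaluations
        X.append(x.copy()); F.append(fx.copy())
        X, F = X[-(m + 1):], F[-(m + 1):]   # keep a window of m + 1 entries
        if len(F) > 1:
            # Least-squares extrapolation: minimize ||fx - dF @ gamma|| over
            # the history of residual differences (the step NLsolve regularizes).
            dF = np.column_stack([F[i + 1] - F[i] for i in range(len(F) - 1)])
            dX = np.column_stack([X[i + 1] - X[i] for i in range(len(X) - 1)])
            gamma, *_ = np.linalg.lstsq(dF, fx, rcond=None)
            x = x + fx - (dX + dF) @ gamma
        else:
            x = x + fx                  # plain unaccelerated first step
    return x, iters

# Demo on a small sign-flipped analogue of the issue's linear adjoint problem:
rng = np.random.default_rng(0)
n = 20
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
B = Q @ np.diag(np.linspace(8.0, 12.0, n)) @ Q.T   # SPD, spectrum in [8, 12]
y = np.ones(n)
x, k = anderson(lambda v: y - B @ v, np.zeros(n))  # oscillatingly divergent base map
print(k, np.linalg.norm(B @ x - y, np.inf))
```

On this small well-conditioned analogue, the unregularized least-squares step converges quickly; the issue's failure mode appears when regularization pulls the update toward the raw divergent iteration.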

longemen3000 commented 2 years ago

There is a variant of the Anderson method that claims to be more stable: https://stanford.edu/~boyd/papers/pdf/scs_2.0_v_global.pdf

antoine-levitt commented 2 years ago

I think this goes in the direction of being more careful and regularizing more, which here would be counterproductive.