SciML / NonlinearSolve.jl

High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.
https://docs.sciml.ai/NonlinearSolve/stable/
MIT License
238 stars 42 forks source link

TrustRegion bugs #210

Closed FHoltorf closed 1 year ago

FHoltorf commented 1 year ago

I had a look at https://github.com/SciML/NonlinearSolve.jl/issues/142. Indeed there were some mistakes in how the Gauss-Newton step and Cauchy point were calculated. This is still WIP, however, since TrustRegion is still not up to par with nlsolve's implementation after these changes.

I suspect a large issue for the poor performance is the default TR adaptation and step acceptance/rejection routine; for instance the current default is rather conservative with respect to which steps are being accepted. Lowering the step_threshold default to 1e-4 from 0.1 (and after correcting the GN/Cauchy step computations) the performance on the test set is already quite a bit better.

Before

nr  Problem                                            n     NewtonRaphson       TrustRegion         LevenbergMarquardt  nlsolve
1   Generalized Rosenbrock function                    10    3.9833e-311         4.9193              4.9193              0.0
2   Powell singular function                           4     7.0435e-16          7.941               0.021633            7.0217e-16
3   Powell badly scaled function                       2     0.0                 26897.0             2.1983e-14          0.0
4   Wood function                                      4     4.1832e-14          19081.0             8550.5              8.2157e-15
5   Helical valley function                            3     6.2775e-30          50.0                50.0                4.3667e-26
6   Watson function                                    2     18860.0             3004.4              129.15              114.89
7   Chebyquad function                                 2     5.5511e-17          0.075044            0.34127             9.4369e-16
8   Brown almost linear function                       10    0.0                 0.01533             16.53               4.4409e-16
9   Discrete boundary value function                   10    5.4548e-16          0.0047026           6.5869e-15          2.7756e-16
10  Discrete integral equation function                10    2.7807e-15          1.9166e-15          9.3334e-15          5.5511e-17
11  Trigonometric function                             10    1.4753e-15          2.2808e-15          0.084117            6.8695e-16
12  Variably dimensioned function                      10    0.0                 2.8636e105          1.376e-12           0.0
13  Broyden tridiagonal function                       10    1.2803e-15          2.7487e-15          1.1633e-13          6.6613e-16
14  Broyden banded function                            10    1.4002e-15          2.1729e-15          1.6865e-13          5.5511e-16
15  Hammarling 2 by 2 matrix square root problem       4     2.2204e-16          0.32355             0.91487             2.2204e-16
16  Hammarling 3 by 3 matrix square root problem       9     2.2204e-16          0.30203             1.0506              2.2204e-16
17  Dennis and Schnabel 2 by 2 example                 2     0.0                 4.4408e-16          17.262              0.0
18  Sample problem 18                                  2     0.0                 6.2862e-9           2.1951              8.291e-17
19  Sample problem 19                                  2     9.5092e-16          3.4505e-7           76.367              7.7402e-16
20  Scalar problem f(x) = x(x - 5)^2                   1     0.0                 1.8919e-16          2.5564e-18          0.0
21  Freudenstein-Roth function                         2     0.0                 20.012              20.012              4.949
22  Boggs function                                     2     0.0                 1.6031              2.0                 3.9431e-17
23  Chandrasekhar function                             10    6.6613e-16          9.9301e-16          1.0906e-13          2.2204e-16

After

nr  Problem                                            n     NewtonRaphson       TrustRegion         LevenbergMarquardt  nlsolve             
1   Generalized Rosenbrock function                    10    4.0179e-314         0.0050578           4.9193              0.0                 
2   Powell singular function                           4     7.0435e-16          2.1234e-6           0.021633            7.0217e-16          
3   Powell badly scaled function                       2     0.0                 0.0                 9.9944e-15          0.0                 
4   Wood function                                      4     4.1138e-14          4287.4              8550.5              8.2157e-15          
5   Helical valley function                            3     6.2775000000000006e-301.5158e-18          50.0                4.3667e-26          
6   Watson function                                    2     18979.0             28248.0             129.15              114.89              
7   Chebyquad function                                 2     5.5511e-17          1.6653e-16          0.34127             9.4369e-16          
8   Brown almost linear function                       10    0.0                 1.1102e-16          16.53               4.4409e-16          
9   Discrete boundary value function                   10    5.4548e-16          6.9388e-17          7.2775e-15          2.7756e-16          
10  Discrete integral equation function                10    2.7807e-15          3.4694e-17          4.8191e-15          5.5511e-17          
11  Trigonometric function                             10    1.4753e-15          6.2384e-16          0.084117            8.3961e-16          
12  Variably dimensioned function                      10    0.0                 0.0                 1.3104e-12          0.0                 
13  Broyden tridiagonal function                       10    1.2803e-15          1.1376e-15          1.1957e-13          6.6613e-16          
14  Broyden banded function                            10    1.4002e-15          1.3824e-15          1.665e-13           5.5511e-16          
15  Hammarling 2 by 2 matrix square root problem       4     2.2204e-16          0.017726            0.91487             2.2204e-16          
16  Hammarling 3 by 3 matrix square root problem       9     2.2204e-16          0.0093482           1.0506              2.2204e-16          
17  Dennis and Schnabel 2 by 2 example                 2     0.0                 0.0                 17.262              0.0                 
18  Sample problem 18                                  2     0.0                 1.413e-15           2.1951              7.658e-18           
19  Sample problem 19                                  2     9.5092e-16          0.00013496          76.367              7.7402e-16          
20  Scalar problem f(x) = x(x - 5)^2                   1     0.0                 1.9702e-24          2.5564e-18          0.0                 
21  Freudenstein-Roth function                         2     0.0                 20.012              20.012              4.949               
22  Boggs function                                     2     0.0                 7.3505e-17          2.0                 3.9431e-17          
23  Chandrasekhar function                             10    6.6613e-16          7.6918e-16          1.2718e-13          2.2204e-16 

Will investigate further.

codecov[bot] commented 1 year ago

Codecov Report

Merging #210 (f5c66c4) into master (d1e0762) will decrease coverage by 0.96%. The diff coverage is 92.70%.

@@            Coverage Diff             @@
##           master     #210      +/-   ##
==========================================
- Coverage   94.94%   93.98%   -0.96%     
==========================================
  Files           8        8              
  Lines         732      782      +50     
==========================================
+ Hits          695      735      +40     
- Misses         37       47      +10     
Files Coverage Δ
src/trustRegion.jl 95.91% <92.70%> (-2.97%) :arrow_down:

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

FHoltorf commented 1 year ago

rebased and now formally correct.

Last things to do:

FHoltorf commented 1 year ago

I think I have caught all bugs now.

When using the NLsolve trust region scheme, the performance is comparable to NLsolve.jl (see below). There are slight differences which are due to different convergence criteria, I believe. The other trust region schemes perform similarly now (usually a little worse on this test set but not to a meaningful enough point where I would suspect more bugs).

nr  Problem                                            n     NewtonRaphson       TrustRegion         LevenbergMarquardt  nlsolve             
1   Generalized Rosenbrock function                    10    NaN                 0.0                 4.9193              0.0                 
2   Powell singular function                           4     7.0435e-16          7.0435e-16          0.021633            7.0217e-16          
3   Powell badly scaled function                       2     0.0                 0.0                 9.9944e-15          0.0                 
4   Wood function                                      4     4.1138e-14          1.2921e-14          8550.5              1.1546e-14          
5   Helical valley function                            3     6.2775000000000006e-301.0408e-16          50.0                1.0408e-16          
6   Watson function                                    2     NaN                 74.676              NaN                 114.89              
7   Chebyquad function                                 2     5.5511e-17          5.5511e-17          0.34127             5.5511e-17          
8   Brown almost linear function                       10    0.0                 8.8817e-16          16.53               8.8818e-16          
9   Discrete boundary value function                   10    5.4548e-16          5.4548e-16          7.2775e-15          2.7756e-16          
10  Discrete integral equation function                10    2.7807e-15          2.7807e-15          4.8191e-15          5.5511e-17          
11  Trigonometric function                             10    1.4753e-15          2.6733e-15          0.084117            8.4655e-16          
12  Variably dimensioned function                      10    0.0                 0.0                 1.3104e-12          0.0                 
13  Broyden tridiagonal function                       10    1.2803e-15          1.2803e-15          1.1957e-13          8.8818e-16          
14  Broyden banded function                            10    1.4002e-15          1.4002e-15          1.665e-13           8.8818e-16          
15  Hammarling 2 by 2 matrix square root problem       4     2.2204e-16          6.8564e-8           0.91487             6.853e-8            
16  Hammarling 3 by 3 matrix square root problem       9     2.2204e-16          3.7631e-29          1.0506              3.7632000000000004e-29
17  Dennis and Schnabel 2 by 2 example                 2     0.0                 0.0                 17.262              0.0                 
18  Sample problem 18                                  2     0.0                 0.0                 2.1951              0.0                 
19  Sample problem 19                                  2     9.5092e-16          9.5092e-16          76.367              6.7241e-16          
20  Scalar problem f(x) = x(x - 5)^2                   1     0.0                 0.0                 2.5564e-18          0.0                 
21  Freudenstein-Roth function                         2     0.0                 6.9988              20.012              4.949               
22  Boggs function                                     2     0.0                 1.077e-16           2.0                 1.077e-16           
23  Chandrasekhar function                             10    6.6613e-16          6.6613e-16          1.2718e-13          4.4409e-16    

Just for reference, here is the before

nr  Problem                                            n     NewtonRaphson       TrustRegion         LevenbergMarquardt  nlsolve             
1   Generalized Rosenbrock function                    10    NaN                 4.9193              4.9193              0.0                 
2   Powell singular function                           4     7.0435e-16          7.941               0.021633            7.0217e-16          
3   Powell badly scaled function                       2     0.0                 4111.8              9.9944e-15          0.0                 
4   Wood function                                      4     4.1138e-14          2985.5              8550.5              1.1546e-14          
5   Helical valley function                            3     6.2775000000000006e-3050.0                50.0                1.0408e-16          
6   Watson function                                    2     NaN                 6491.3              129.15              114.89              
7   Chebyquad function                                 2     5.5511e-17          0.075044            0.34127             5.5511e-17          
8   Brown almost linear function                       10    0.0                 0.0                 16.53               8.8818e-16          
9   Discrete boundary value function                   10    5.4548e-16          0.0046978           7.2775e-15          2.7756e-16          
10  Discrete integral equation function                10    2.7807e-15          8.1217e-17          4.8191e-15          5.5511e-17          
11  Trigonometric function                             10    1.4753e-15          3.1524e-15          0.084117            8.4655e-16          
12  Variably dimensioned function                      10    0.0                 0.0                 1.3104e-12          0.0                 
13  Broyden tridiagonal function                       10    1.2803e-15          1.143e-15           1.1957e-13          8.8818e-16          
14  Broyden banded function                            10    1.4002e-15          1.4002e-15          1.665e-13           8.8818e-16          
15  Hammarling 2 by 2 matrix square root problem       4     2.2204e-16          0.32355             0.91487             6.853e-8            
16  Hammarling 3 by 3 matrix square root problem       9     2.2204e-16          0.30203             1.0506              3.7632000000000004e-29
17  Dennis and Schnabel 2 by 2 example                 2     0.0                 4.4408e-16          17.262              0.0                 
18  Sample problem 18                                  2     0.0                 5.5995e-18          2.1951              0.0                 
19  Sample problem 19                                  2     9.5092e-16          1.0542e-6           76.367              6.7241e-16          
20  Scalar problem f(x) = x(x - 5)^2                   1     0.0                 1.9702e-24          2.5564e-18          0.0                 
21  Freudenstein-Roth function                         2     0.0                 20.012              20.012              4.949               
22  Boggs function                                     2     0.0                 1.6031              2.0                 1.077e-16           
23  Chandrasekhar function                             10    6.6613e-16          4.965e-16           1.2718e-13          4.4409e-16
ChrisRackauckas commented 1 year ago

Is the full 23 function thing a test already? If not, can we make it a regression test please?

FHoltorf commented 1 year ago

I don't think they are. It would hit a lot (if not all) of the code though which would be nice. I am wondering though: How do we distinguish failure from success since we can't expect all methods to converge to a small residual on all problems?

avik-pal commented 1 year ago

What about we use the list you have above and mark our solvers that currently fail as test_broken (I know it is not exactly correct since some of them are expected to not converge and aren't exactly broken)? That way, we won't have a regression in the future of the currently working versions