SciML / NonlinearSolve.jl

High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.
https://docs.sciml.ai/NonlinearSolve/stable/
MIT License

What is a Nonlinear Solver? And How to easily build Newer Ones #345

Closed: avik-pal closed this 6 months ago

avik-pal commented 7 months ago

Goals

How close is this to being ready?

Once tests pass and the downstream PRs are merged, we should be good to go. I will follow this PR up with:

  1. SingleFactorize<....>, which can use the iterator interface and recompute_jacobian to recompute the Jacobian only when needed.
  2. Multi-step methods.
  3. Changing the default AD backend to PolyesterForwardDiff, similar to SimpleNonlinearSolve (see the sketch after this list).
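A minimal sketch of what the third item would look like from the user side, assuming the ADTypes.jl backend names and that PolyesterForwardDiff.jl is loaded (this is an illustration, not the new default shipped in this PR):

```julia
using NonlinearSolve, ADTypes, PolyesterForwardDiff

# Toy problem: find u with u.^2 == p.
f(u, p) = u .* u .- p
prob = NonlinearProblem(f, [1.0, 1.0], 2.0)

# Explicitly request the PolyesterForwardDiff backend for the Jacobian.
sol = solve(prob, NewtonRaphson(; autodiff = AutoPolyesterForwardDiff()))
```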

Documentation

Target Issues

Modular Internals

codecov[bot] commented 6 months ago

Codecov Report

Attention: 310 lines in your changes are missing coverage. Please review.

Comparison is base (b661754) 83.79% compared to head (1d7d202) 86.23%.

| Files | Patch % | Lines |
|---|---|---|
| src/core/approximate_jacobian.jl | 77.27% | 35 Missing :warning: |
| src/globalization/trust_region.jl | 84.50% | 31 Missing :warning: |
| src/globalization/line_search.jl | 83.94% | 22 Missing :warning: |
| src/utils.jl | 70.83% | 21 Missing :warning: |
| src/internal/approximate_initialization.jl | 85.27% | 19 Missing :warning: |
| src/descent/damped_newton.jl | 88.02% | 17 Missing :warning: |
| src/algorithms/lbroyden.jl | 78.37% | 16 Missing :warning: |
| src/internal/operators.jl | 86.11% | 15 Missing :warning: |
| src/algorithms/levenberg_marquardt.jl | 80.82% | 14 Missing :warning: |
| src/descent/geodesic_acceleration.jl | 71.42% | 14 Missing :warning: |
| ... and 21 more | | |
Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master     #345      +/-   ##
==========================================
+ Coverage   83.79%   86.23%   +2.44%
==========================================
  Files          28       44      +16
  Lines        2179     2609     +430
==========================================
+ Hits         1826     2250     +424
- Misses        353      359       +6
```


termi-official commented 6 months ago

I may have a special case to consider here, and I am writing it in this thread instead of a separate issue because of the title of the PR. If I should move this to an issue instead, I am happy to do so.

When discretizing PDEs we often end up with a nonlinear system of equations plus constraints (coming from, e.g., Dirichlet boundary conditions). For Newton methods we know how to enforce these constraints efficiently: we can rewrite the linear system at each iteration. In Ferrite.jl a typical Newton-Raphson implementation looks like this:

```julia
dh = ...   # Holds info about the finite element problem
un = zeros(ndofs(dh))
u  = zeros(ndofs(dh))
Δu = zeros(ndofs(dh))
...        # define remaining variables (K, g, newton_itr, NEWTON_TOL, NEWTON_MAXITER, ...)
dbcs = ... # Holds Dirichlet boundary condition information
apply!(un, dbcs) # Apply Dirichlet conditions to the solution vector
while true
    newton_itr += 1
    # Construct the current guess
    u .= un .+ Δu
    # Compute residual and tangent for the current guess
    assemble_global!(K, g, u, ...)
    # Apply boundary conditions to the linear system
    apply_zero!(K, g, dbcs)
    # Compute the residual norm and compare with the tolerance
    normg = norm(g)
    if normg < NEWTON_TOL
        break
    elseif newton_itr > NEWTON_MAXITER
        error("Reached maximum Newton iterations, aborting")
    end

    # Compute the increment
    ΔΔu = K \ g
    # Zero the increment at the constrained degrees of freedom
    apply_zero!(ΔΔu, dbcs)
    Δu .-= ΔΔu
end
```

(See https://ferrite-fem.github.io/Ferrite.jl/dev/tutorials/hyperelasticity/ for the full example.)

I am confident that other frameworks come with similar mechanisms. Basically, the difference to a "normal" Newton-Raphson is the apply! before entering the iteration, which e.g. sets the boundary conditions in the solution vector, and the two apply_zero! calls: the first e.g. eliminates the boundary conditions from the linear system, and the second removes numerical inaccuracies introduced by the solver from the increment. If it is not clear: the increment at the constrained entries must be zero, because the solution there is already enforced to be correct by the first call to apply!. For e.g. gradient-based solvers we need similar hooks.

With this in mind, I would like to ask you to consider this case, and possibly for recommendations on how this integration can be accomplished from the user side when using NonlinearSolve.jl. I am happy to help here and to answer any questions that come up.

My first idea was to provide a custom linear solver. This custom solver just wraps the actual linear solver and adds the apply_zero! calls in the solve function (and when the linear system is regenerated). However, in this case it is not clear how to cleanly introduce the first apply! before going into the iteration.
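To make the idea concrete, here is a rough sketch of such a wrapper, with a hypothetical `solve_linear` entry point standing in for the actual LinearSolve.jl algorithm interface (`apply_zero!` and `dbcs` are the Ferrite.jl objects from the snippet above):

```julia
# Hypothetical wrapper type: delegates to an inner linear solver and adds the
# Dirichlet constraint handling around the solve.
struct ConstrainedLinSolve{S, C}
    inner::S   # the actual linear solver (LU, Krylov, ...)
    dbcs::C    # Dirichlet constraint handler
end

# Assumes a matching `solve_linear` method exists for the inner solver.
function solve_linear(ls::ConstrainedLinSolve, K, g)
    apply_zero!(K, g, ls.dbcs)          # eliminate constrained rows/columns
    ΔΔu = solve_linear(ls.inner, K, g)
    apply_zero!(ΔΔu, ls.dbcs)           # scrub numerical noise at constrained dofs
    return ΔΔu
end
```

This covers the two `apply_zero!` calls, but the open question stated above remains: the initial `apply!` on the solution vector still has no clean hook before the iteration starts.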

avik-pal commented 6 months ago

One way might be to provide preamble / postamble functions that take the cache as input. I think that might solve your problem, @termi-official? Then you can construct the GeneralizedFirstOr.... method with NewtonDescent to get your NewtonRaphson.

The only concern is that with custom functions, inputs to step! like recompute_jacobian might have to be marked as incompatible -- unless we rework how the Jacobian or fu is stored / cached.

@ChrisRackauckas do you have particular thoughts on this?

termi-official commented 6 months ago

So, if I understand correctly, the current recommendation would be to define a custom linear solver wrapper, as I proposed, and then dispatch the preamble on whether the cache holds this specific linear solver wrapper. I think this makes sense.

Indeed, one thing to consider here is that the Dirichlet conditions might themselves be time-dependent, which can invalidate the Jacobian (depending on the reuse strategy). I am not sure how to deal with this properly.

avik-pal commented 6 months ago

> if I understand correctly, the current recommendation would be to define a custom linear solver wrapper, as I proposed

I meant having the preamble function in the GeneralizedFirstOrder... algorithm itself. So you would create your Newton's method as:

```julia
descent = NewtonDescent(; linsolve, precs)
GeneralizedFirstOrderAlgorithm(; concrete_jac, name = :MyAmazingNewtonRaphson,
    linesearch, descent, jacobian_ad = autodiff, preamble = <....>)
```
avik-pal commented 6 months ago

That said, I am not adding any new functionality in this PR anyway. Most of what is being added fills in what was incomplete or inconsistent with the docs. @termi-official, can you open a separate issue for this and link this PR? I will open a follow-up PR for it.

ChrisRackauckas commented 6 months ago

> I am confident that other frameworks come with similar mechanisms. Basically, the difference to a "normal" Newton-Raphson is the apply! before entering the iteration, which e.g. sets the boundary conditions in the solution vector, and the two apply_zero! calls: the first e.g. eliminates the boundary conditions from the linear system, and the second removes numerical inaccuracies introduced by the solver from the increment. If it is not clear: the increment at the constrained entries must be zero, because the solution there is already enforced to be correct by the first call to apply!. For e.g. gradient-based solvers we need similar hooks.

Those are just nonlinear preconditioners, with pre and post preconditioning.
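A conceptual sketch of what "pre and post preconditioning" means here, with hypothetical hook names (this is not a NonlinearSolve.jl API, just the shape of the idea):

```julia
using LinearAlgebra

# Hypothetical driver: `pre!` runs once before iterating (e.g. apply!(u, dbcs)),
# `post!` runs on every increment (e.g. apply_zero!(du, dbcs)); `newton_step`
# is the caller-supplied inner solve, e.g. u -> J(u) \ f(u).
function preconditioned_newton(newton_step, u, pre!, post!; maxiters = 100, abstol = 1e-8)
    pre!(u)
    for _ in 1:maxiters
        du = newton_step(u)
        post!(du)
        u .-= du
        norm(du) < abstol && break
    end
    return u
end
```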

avik-pal commented 6 months ago

Once https://github.com/SciML/DiffEqBase.jl/pull/994 is merged, this should be good to go.

I separated out the Windows build since it is stalling for some reason and I don't have a Windows machine locally for testing.

avik-pal commented 6 months ago

Let's go! Everything is working :tada:

avik-pal commented 6 months ago

(Figure: battery_problem_work_precision, a work-precision diagram for the battery problem.)

Things that were not working before are also working now.

ChrisRackauckas commented 6 months ago

A bunch of small comments and issues to open, otherwise good to go!

avik-pal commented 6 months ago

The DAE failure is real: https://github.com/SciML/NonlinearSolve.jl/actions/runs/7543078285/job/20533293590?pr=345#step:6:1759. The cache is initialized with a Vector, but later it receives a SubArray.

ChrisRackauckas commented 6 months ago

> The DAE failure is real: https://github.com/SciML/NonlinearSolve.jl/actions/runs/7543078285/job/20533293590?pr=345#step:6:1759. The cache is initialized with a Vector, but later it receives a SubArray.

We should just make the cache via a view too.
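A small illustration of the type mismatch and of the "cache via a view" idea (toy code, not the actual NonlinearSolve.jl cache):

```julia
u = view(zeros(10), 1:5)            # the input arrives as a SubArray

# A buffer allocated with `similar` is a plain Vector, so its type differs
# from typeof(u):
buf_vec = similar(u)                # Vector{Float64}

# Allocating the buffer as a view of a fresh parent array keeps the stored
# type aligned with the input:
buf_view = view(zeros(length(u)), 1:length(u))
typeof(buf_view) == typeof(u)       # true for this index type
```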

ChrisRackauckas commented 6 months ago

Make an issue in OrdinaryDiffEq with the DAE one, I can take a look at that later today.