Closed koehlerson closed 1 year ago
The way I was thinking to reduce to a single number for more complicated convergence criteria would be to set the nlsolver's tolerance to e.g. 1.0, and then the user should define
calculate_convergence_criterion(problem) = max(energynorm(problem)/problem.energytol, residualnorm(problem)/problem.residualtol, incrementnorm(problem)/problem.incrementtol, ...)
I had it return a bool first, but that makes it harder for the solver to (a) log the convergence and (b) tune the tolerance.
The latter could be done by having check_convergence_criterion(problem, tol)
, but the former would require the user to keep track? (a) is of course not a required feature, but I often find that it is nice to have.
I see your points with logging the tolerance, however, we can introduce a function that tracks, besides the residualnorm, any other metric the user wants by returning them. This can be nicely incorporated with ProgressMeter.jl
as soon as I get to including it.
What I don't like about doing the calculate_convergence_criterion(problem)
is that it mixes up fields of structs. I think the tolerance should be a field of the solver, which it anyhow needs to be if you set it to e.g. 1.0. Then there is indeed a field tol
which is (more or less) everytime 1.0 and is used for comparison. Then, you have also the "real" tolerances in the problem
. However, could also be just my personal taste.
I look forward to ProgressMeter.jl
! Just had a quick look and it looks really nice. Definitely nice to include such logging there!
I think the tolerance should be a field of the solver
I agree that the structs get a bit mixed up. To me, it isn't fully clear whether the tolerance belongs to the solver or to the problem, but I'm also gravitating towards the solver owning it. How would you do that if it only returns a bool? Would it then be check_convergence_criterion(problem, solver.tol)
?
Maybe I misunderstand - I count one extra tol
, why would you have a "tolerance [that] should be a field of the solver" in addition to tol
- wouldn't that be the same?
I realize that for what I wrote, a better way of specifying it is to say that the problem should have scaling factors. Basically,
calculate_convergence_criterion(problem) = max(energynorm(problem)/problem.energy_scaling, ...)
and then the tolerance field should be relative, e.g. 1.e-6 (as it defaults to now) (in my opinion a tolerance, if belonging to the solver, should always be compared to a relative value, even though I'm often too lazy to do so:))
Using two different functions is of course an option, but I think that it is nice to have to specify as few functions as possible.
Maybe I misunderstand - I count one extra tol, why would you have a "tolerance [that] should be a field of the solver" in addition to tol - wouldn't that be the same?
Let my try to explain the issue I see here. Having a single tolerance is just the most basic convergence criterion. However, most solvers utilize more elaborate criteria, like for example a combination of relative and absolute tolerances as convergence criteria, which may even be rescaled on information from the previous solve (see e.g. Eisenstat-Walker's Inexact Newton). Technically speaking we do not even have the hard requirement that the convergence criterion has to be expressible as a number (or vector of numbers) - it holds just true if your convergence criterion is some sort of distance function. Think about topological convergence criteria like for example continuity (not an expert on this tho). So in general we have check_convergence_criterion(problem, solver)
.
I feel that we are pulling in two different directions here, also including the discussion on getsystemmatrix
(#5). I'm thinking of simplifying for a very basic user, and I have a feeling that you guys are making sure we have a general API that can take all special cases for advanced users and which we won't run into issues with as more methods are added.
I think this is great! We can get the best of both worlds! How about separating the API into a basic and advanced? My suggested definition/separation would be that the basic API should never require the user to write functions that accept a solver, while the advanced API would do exactly this. For some advanced methods, it is then OK that only the advanced API is available.
For the current discussions, this could be something like
check_convergence_criteria(problem, solver)
getsystemmatrix(problem,solver)
calculate_convergence_measure(problem)
getjacobian(problem)
, getdescentpreconditioner(problem)
Internally, we always use the advanced API by defining the default functions, e.g.
function check_convergence_criteria(problem, solver)
r = calculate_convergence_measure(problem)
update_state!(solver, r)
return r < gettolerance(solver)
end
getsystemmatrix(problem, ::FerriteSolver{<:Newton}) = getjacobian(problem) # Actually with traits
getsystemmatrix(problem, ::FerriteSolver{<:Descent}) = getdescentpreconditioner(problem)
check_convergence_criterion(problem, solver)->Bool
can now be specified by the user so I think this can be closed.
https://github.com/KnutAM/FerriteSolvers.jl/blob/main/src/nlsolvers.jl#L51
I think it won't be possible to reduce things to a plain scalar in
calculate_convergence_criterion
since different metrics will have different threshhold values in terms of convergence, e.g., the change of increment (\Delta \Delta d) may be checked by a smaller tolerance than the residual one. The same holds true when thinking about residualnorm and energynorm for example. In general, it is not possible to use the smallest values since you may experience a small value of the residualnorm and a higher value for the energynorm but the latter is maybe already "converged" in the user's sense.So @termi-official and me suggest to define instead of
calculate_convergence_criterion
acheck_convergence_criterion
that returns a bool