odlgroup / odl

Operator Discretization Library https://odlgroup.github.io/odl/
Mozilla Public License 2.0
366 stars 105 forks source link

Classes for termination rules #51

Closed adler-j closed 8 years ago

adler-j commented 8 years ago

Copied from odl-solvers issue 4

Likewise step length rules, it will be equally useful to define a common interface for termination rules of optimization schemes, provided that there is some common structure.

aringh commented 8 years ago

A normal way to determine termination is to look at how close one are to fulfil the optimality conditions (For example the necessary KKT conditions). For unconstrained problems, this is simply how close to 0 the gradient is.

Maybe one could do somethings similar as in #50, where the termination rule keeps track of what it needs in order to stop, and simply returns some kind of scalar value which the algorithm can use to determine if it should terminate (in an unconstrained case, think of returning the mean or the worst case component of the gradient. If this is close to 0, we stop).

kohr-h commented 8 years ago

Or keep track of the step length and stop when they fall below some threshold? Or use some discrepancy principle in the case of operator equations? The possibilities seem to be quite numerous also here.

aringh commented 8 years ago

I was talking to @adler-j about this the other day, also related to what @kohr-h wrote in #76. Maybe the most common interface for termination rules is simply to answer the question: "should I stop?", by simply returning True or Flase? In this way one could write termination rules on discrepancy principles for Landweber, or check size of the gradient for simple unconstrained optimization problems.

kohr-h commented 8 years ago

Yes, absolutely. I don't think there is any discussion on what a termination rule should return. My question or suggestion was more concerned with the input side. Some unification will be helpful there.

adler-j commented 8 years ago

We discussed having some kind of dictionary as input, with well specified names of what things should be called. The class can then check for the things it needs, perhaps residual or current_function_value, and if they don't exist we throw an error (duck-typing style).

kohr-h commented 8 years ago

I think dictionaries with standardized key names are a good and lightweight way to start. One thing we have to think about is how we specify how many of some quantity we need to store and how we update the state dictionary. For example, methods which use approximate Hessians usually need the previous and current gradients, while others may only require the current one. Storing everything is not a solution, of course, so somehow a method has to communicate with its step length and termination rules and "negotiate" the minimal amount of stored arrays.

adler-j commented 8 years ago

How are we doing here. Do we honestly think this will happen at some point, and wouldn't it be better to simply open a new issue if that is the case?

kohr-h commented 8 years ago

I don't see anyone doing it, and it's not really high prio currently. Agree to closing and re-opening when it becomes relevant again.