JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Nelder-Mead gradient converged? #239

Closed · multidis closed this 8 years ago

multidis commented 8 years ago

In the past, Nelder-Mead returned f_converged around here. Currently, Nelder-Mead reports gradient convergence in this test, for instance.

I could have missed something in the recent API-change commits, but it seems confusing that a gradient-free method exits with the gradient-convergence flag set. Was the change intentional, or is it a bug? @pkofod

This also relates to the show method.

pkofod commented 8 years ago

It is actually intended; what is not intended is that it is not in the README! See how it is documented on master: http://www.juliaopt.org/Optim.jl/latest/algo/nelder_mead/

Why are we doing it like this? Well, it is basically because there are three different convergence criteria: f_tol, g_tol, and x_tol. I figured it would be easiest to simply use g_tol for the main convergence measure in Nelder-Mead. The problem is that if we use f_tol, it will get the same default as the other solvers, and that is a bit too small, I think (1e-32). Additionally, I don't really think using f_tol would make that much more sense. I also want to be able to add an f_tol- and x_tol-based stopping criterion to NM, much like in Matlab.

The convergence measure in NelderMead is not the gradient, but it does, in a sense, measure how much the function varies around the centroid. That's why I didn't think it would be too bad to use g_tol, as long as it was documented - which it unfortunately isn't in the current stable release.
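For concreteness, here is a minimal sketch of the user-facing behaviour. It is written against the present-day Optim.jl API (Optim.Options and the g_converged/f_converged accessors), which postdates parts of this thread, so the option-passing may look different in older releases:

```julia
using Optim

# Rosenbrock, a standard test function
f(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2

# g_tol controls the simplex-based convergence measure in NelderMead
res = optimize(f, [0.0, 0.0], NelderMead(), Optim.Options(g_tol = 1e-8))

Optim.g_converged(res)  # true: the g_tol-based criterion triggered
Optim.f_converged(res)  # false, even though the run converged
```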

Edit: Of course, we might eventually have a slightly different API for gradient-free methods; however, this is not really a top priority.

pkofod commented 8 years ago

Also, if you're using NelderMead a lot, you may want to be aware of some of the changes in #220: the initial step size, the constructor, and the values of the four parameters are all going to change. It shouldn't really break anything unless you're setting the parameters manually.
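To illustrate what "setting the parameters manually" means, here is a sketch using the keyword constructor as it looks in current Optim.jl; the names FixedParameters, AdaptiveParameters, and AffineSimplexer are today's API and may not match the pre-#220 constructor:

```julia
using Optim

# Explicit reflection/expansion/contraction/shrink coefficients,
# instead of the (dimension-adaptive) defaults
nm = NelderMead(parameters = Optim.FixedParameters(α = 1.0, β = 2.0, γ = 0.5, δ = 0.5),
                initial_simplex = Optim.AffineSimplexer())

f(x) = sum(abs2, x)
optimize(f, [1.0, 1.0], nm)
```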

multidis commented 8 years ago

I guess one can think of the convergence check as resembling the gradient, although the connection is less than straightforward. The f_tol default is of course too small for Nelder-Mead; I'd imagine anyone using the method would change it.

A README note would indeed help people remember this aspect (I had a number of tests fail because they expected f_converged to be true). Thank you for clarifying.
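A side note for anyone whose tests fail the same way: current Optim.jl has a method-agnostic Optim.converged accessor that is true when any of the criteria triggered, which avoids hard-coding which flag a given solver sets. A hypothetical test could read:

```julia
using Optim, Test

f(x) = sum(abs2, x .- 1.0)
res = optimize(f, zeros(2), NelderMead())

# Don't assume which flag NelderMead sets; just require convergence.
@test Optim.converged(res)
```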

pkofod commented 8 years ago

> I guess one can think of the convergence check as resembling the gradient, although the connection is less than straightforward. The f_tol default is of course too small for Nelder-Mead; I'd imagine anyone using the method would change it.
>
> A README note would indeed help people remember this aspect (I had a number of tests fail because they expected f_converged to be true). Thank you for clarifying.

Yes, this is quite clearly a mistake on my part; it should have been documented properly. At least it is in the master docs.

Also, thanks for reporting this. It's great to get feedback to see where things might go wrong/be unclear in the wild!