JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Nelder-Mead uses (default) gradient stopping rule? #88

Closed: BenConnault closed this issue 9 years ago

BenConnault commented 9 years ago
Optim.optimize(x -> x[1]^2 + x[2]^2, [1.0, 1.0])
Results of Optimization Algorithm
 * Algorithm: Nelder-Mead
 * Starting Point: [1.0,1.0]
 * Minimum: [0.16666666666666666,0.16666666666666666]
 * Value of Function at Minimum: 0.055556
 * Iterations: 4
 * Convergence: true
   * |x - x'| < NaN: false
   * |f(x) - f(x')| / |f(x)| < 1.0e-08: true
   * |g(x)| < NaN: false
   * Exceeded Maximum Number of Iterations: false
 * Objective Function Calls: 9
 * Gradient Call: 0

And worse:

Optim.optimize(x -> x[1]^2 + x[2]^2, [1.0, 1.0], grtol = 1e-14)
Results of Optimization Algorithm
 * Algorithm: Nelder-Mead
 * Starting Point: [1.0,1.0]
 * Minimum: [0.16666666666666666,0.16666666666666666]
 * Value of Function at Minimum: 0.055556
 * Iterations: 4
 * Convergence: true
   * |x - x'| < NaN: false
   * |f(x) - f(x')| / |f(x)| < 1.0e-08: true
   * |g(x)| < NaN: false
   * Exceeded Maximum Number of Iterations: false
 * Objective Function Calls: 9
 * Gradient Call: 0
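
In the meantime, restarting Nelder-Mead from the point the first run stopped at may make further progress. A sketch of that workaround, assuming the result object exposes the final point as minimum (matching the * Minimum: line in the printout above):

# Workaround sketch: restart from the stalled point. The field name
# `minimum` is an assumption based on the printout above, not a
# documented guarantee.
f(x) = x[1]^2 + x[2]^2
res1 = Optim.optimize(f, [1.0, 1.0])
res2 = Optim.optimize(f, res1.minimum)   # continue from the stalled point
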
johnmyleswhite commented 9 years ago

All functions currently share a convergence function. That could be changed, or we could just make grtol impossible to reach.
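
For concreteness, "impossible to reach" falls out of NaN semantics: in Julia, any ordered comparison against NaN is false, so a NaN tolerance can never trigger. This matches the |g(x)| < NaN: false lines in the printouts above. A minimal sketch:

# Any ordered comparison with NaN is false in Julia, so a
# gradient-norm check with grtol = NaN never fires.
grtol = NaN
gradient_norm = 0.0        # even an exactly zero gradient norm...
gradient_norm < grtol      # ...compares false against NaN
# => false
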

BenConnault commented 9 years ago

Oops, my mistake: I mistook grtol for ftol. It sounds perfectly reasonable to share a convergence function with grtol set to NaN for Nelder-Mead, as it is now.

However:

Optim.optimize(x -> x[1]^2 + x[2]^2, [1.0, 1.0], ftol = 1e-20)
Results of Optimization Algorithm
 * Algorithm: Nelder-Mead
 * Starting Point: [1.0,1.0]
 * Minimum: [0.16666666666666666,0.16666666666666666]
 * Value of Function at Minimum: 0.055556
 * Iterations: 4
 * Convergence: true
   * |x - x'| < NaN: false
   * |f(x) - f(x')| / |f(x)| < 1.0e-20: true
   * |g(x)| < NaN: false
   * Exceeded Maximum Number of Iterations: false
 * Objective Function Calls: 9
 * Gradient Call: 0
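
For what it's worth, here is how I understand the check can pass even at 1e-20 (a sketch modeled on the printed rule, not on the actual implementation): once the simplex stalls and two consecutive best function values are bitwise equal, the relative change is exactly zero, which is below any positive tolerance.

# Sketch of the stopping rule as printed: |f(x) - f(x')| / |f(x)| < ftol.
f_x_prev = 0.055556   # previous best function value (stalled)
f_x      = 0.055556   # current best function value, bitwise equal
abs(f_x - f_x_prev) / abs(f_x) < 1e-20
# => true, because the numerator is exactly 0.0
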
johnmyleswhite commented 9 years ago

Nelder-Mead's performance is somewhat erratic. This case seems unusually bad, but I suspect the problem is that your initial point is an exceptionally bad one for this method. An equivalent problem, translated so that the search starts at [0.0, 0.0], gives the right answer:

julia> Optim.optimize(x->(x[1] - 1)^2 + (x[2] - 1)^2,[0., 0.])
Results of Optimization Algorithm
 * Algorithm: Nelder-Mead
 * Starting Point: [0.0,0.0]
 * Minimum: [1.0000185603003118,1.000024398605831]
 * Value of Function at Minimum: 0.000000
 * Iterations: 28
 * Convergence: true
   * |x - x'| < NaN: false
   * |f(x) - f(x')| / |f(x)| < 1.0e-08: true
   * |g(x)| < NaN: false
   * Exceeded Maximum Number of Iterations: false
 * Objective Function Calls: 58
 * Gradient Call: 0

It's also possible that the way we translate user initial points into real initial points is problematic.
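
To make that concrete: one common recipe (the one MATLAB's fminsearch uses; whether our code does the same is exactly what would need checking) perturbs each coordinate of the user's point, with a special case for zero coordinates, so [1.0, 1.0] and [0.0, 0.0] produce rather different simplices:

# Sketch of a common initial-simplex construction (fminsearch-style);
# not a claim about Optim.jl's actual code.
function initial_simplex(x0::Vector{Float64})
    simplex = [copy(x0)]
    for i in 1:length(x0)
        v = copy(x0)
        # perturb coordinate i by 5%, or by a small absolute step if it is zero
        v[i] = v[i] != 0.0 ? 1.05 * v[i] : 0.00025
        push!(simplex, v)
    end
    return simplex
end

initial_simplex([1.0, 1.0])   # vertices at [1,1], [1.05,1], [1,1.05]
initial_simplex([0.0, 0.0])   # vertices at [0,0], [0.00025,0], [0,0.00025]
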

johnmyleswhite commented 9 years ago

Also, xtol for Nelder-Mead seems a little tricky to define, since Nelder-Mead doesn't maintain a single point but an entire simplex, and many different simplices can have the same center.
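
A toy illustration of the ambiguity: two very different simplices can share the same centroid, so a |x - x'| rule on a single representative point can't tell a tight simplex from a sprawling one. (centroid here is an illustrative helper, not an Optim.jl internal.)

# Two different simplices with exactly the same centroid.
centroid(s) = sum(s) / length(s)

small = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
large = [[-9.0, -9.0], [12.0, -9.0], [0.0, 21.0]]

centroid(small) == centroid(large)   # true: both centroids are [1.0, 1.0]

This is why implementations often track the simplex's size instead, for example its diameter or the spread of the function values at its vertices.
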

BenConnault commented 9 years ago

I see. Perfect, thanks.