JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Experience with SimulatedAnnealing? #173

Closed pkofod closed 6 years ago

pkofod commented 8 years ago

I've been running some benchmarks on all our methods, and find that SImmulatedAnnealing is doing very poorly. Does anyone have experience with using the method as implemented here? I'm kind of suspecting that a bug has been introduced at some point. I will post code when I have something presentable, but many well-known test problems have been tested, and it often runs for the maximum number of iterations.

Edit: I should try to increase the number of iterations... Maybe 1000 is just not enough for the method to converge?

johnmyleswhite commented 8 years ago

Simulated annealing almost never works well in practice, but it would be good to know if there's a bug in our implementation. Perhaps contrast a run of our method with R's implementation in the optim function?

Top culprits are: (a) the neighbor function and (b) the annealing schedule function.
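To make the comparison concrete: here is a generic SA loop in plain Python (a textbook sketch, not Optim.jl's implementation) where both culprits are pluggable arguments, so you can swap them independently when debugging:

```python
import math
import random

def simulated_annealing(f, x0, neighbor, schedule, iters=10_000, seed=0):
    """Textbook SA sketch. `neighbor(x, rng)` proposes a candidate and
    `schedule(t)` maps the iteration count to a temperature -- the two
    components singled out above. Helper names are hypothetical."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for t in range(1, iters + 1):
        T = schedule(t)
        cand = neighbor(x, rng)
        fcand = f(cand)
        # Metropolis rule: always accept improvements,
        # accept worse points with probability exp(-delta/T)
        if fcand <= fx or rng.random() < math.exp(-(fcand - fx) / T):
            x, fx = cand, fcand
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

# Illustrative choices: Gaussian steps and a slow logarithmic cooling schedule
neighbor = lambda x, rng: x + rng.gauss(0.0, 0.5)
schedule = lambda t: 1.0 / math.log(t + 1)
best, fbest = simulated_annealing(lambda x: (x - 3.0) ** 2, 10.0,
                                  neighbor, schedule)
```

A too-narrow neighbor or a too-fast schedule will stall this loop on exactly the kinds of test problems described above, which is why those two are the first things to check.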

multidis commented 8 years ago

Simulated annealing almost never works well in practice

As with any global heuristic, it depends on how one runs it. Generalized (C. Tsallis, Physica A 233 (1996) 395-406) simulated annealing based on a generalization of Gibbs statistics can work well. Some examples can be found in the GenSA R-package, Xiang et al. R Journal 5/1 (2013) 13-28 where it outperforms differential evolution. And there is an exact random number generator (RNG) for that distribution (T. Schanze, Comp. Phys. Comm. 175 (2006) 708–712).
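For context on the "generalized" part: the Tsallis scheme replaces the Metropolis exponential with a q-exponential acceptance probability (alongside a q-Gaussian visiting distribution). A rough Python sketch of just the acceptance rule, with an illustrative q value rather than GenSA's defaults, which recovers the classical exp(-ΔE/T) as q → 1:

```python
import math

def tsallis_acceptance(delta_e, T, qa=-5.0):
    """Generalized (Tsallis) acceptance probability for an uphill move.
    Reduces to the classical Metropolis rule exp(-delta_e/T) in the
    limit qa -> 1. The qa value here is illustrative only."""
    if delta_e <= 0:
        return 1.0
    if abs(qa - 1.0) < 1e-12:
        return math.exp(-delta_e / T)       # classical limit
    base = 1.0 + (qa - 1.0) * delta_e / T
    if base <= 0.0:                         # outside the q-exponential's support
        return 0.0
    return base ** (-1.0 / (qa - 1.0))
```

For qa < 1 the support truncates, so sufficiently bad uphill moves are rejected outright instead of merely being unlikely; that, plus the heavy-tailed visiting distribution, is what changes the method's search behavior.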

I started an implementation based on the Optim.jl "simple" simulated annealing function a year ago but never had time to complete it. I thought it would be good to have at least one global method in Optim.jl that is conceptually different from the DE implementations in other Julia packages. I would be glad to share the existing code, including the RNG with tests, if anyone is interested.

pkofod commented 8 years ago

I would love to see it, and compare with our current implementation. I will also return with a bit more specific examples when I get the time to write it up.

Edit: is it the Black box package you're talking about?

multidis commented 8 years ago

Sure, I'll collect what I have in the next few days.

pkofod commented 8 years ago

Okay, I didn't have much time this evening, so I cannot make a complete report, but I might have been a bit too harsh on SimulatedAnnealing (correct spelling this time...). Generally, it's in the middle of the field, so to speak. Accelerated gradient descent is almost always closest to the minimizer (and consequently has the smallest "minimum error"), and is quite fast as well. Nelder Mead is quite often in the second place.

I am really interested in making a good overview of the current performance and quality of the package, so I will return with numbers etc. I just want to make sure it's somewhat thorough before posting it. Maybe I simply chose those test functions that were easiest to type, and that might favor Nelder Mead and Acc. GD in some way. These preliminary results are also with numerical gradients, and I am curious to see if analytical gradients can help not only performance but also precision.

I think I am digressing from my own issue. I will report back in a different "How well does our package perform?"-issue.

maxlarsen commented 8 years ago

May be of interest: someone posted simulated annealing code in julia-users https://groups.google.com/forum/#!searchin/julia-users/annealing/julia-users/s-yqaXalTN4/1cT5nZ9fCgAJ

johnmyleswhite commented 8 years ago

I'm always pretty uncomfortable opening a file of code that's been posted without an explicit license declaration first. Since the code is in an Octave package that seems to have GPL code in it, I'm especially wary.

pkofod commented 8 years ago

Why is that? Is it GPL that has the "you cannot implement this if you've seen it already" clause?

StefanKarpinski commented 8 years ago

If you've seen a GPL implementation, there's some chance that someone could argue in court that your code is a derived work of that GPL implementation and therefore should be under the GPL as well. It's probably more paranoid than necessary, but it's best to be careful about these things.

pkofod commented 8 years ago

Ok, thanks for the clarification Stefan! Can the person (Michael Creel in this case) publish it with another license, or do they also restrict themselves from publishing it under an MIT license for example?

johnmyleswhite commented 8 years ago

If Michael Creel is the sole copyright holder of the code, they can relicense it. If N other people have worked on the code, then all N must consent to the weakened license.

pkofod commented 8 years ago

Okay. So I guess this means that if his GPL'ed implementation is based on a GPL'ed Fortran program by William Goffe, Gary Ferrier, and John Rogers, we would need permission from all four of them? The original Fortran version doesn't seem to carry any license information whatsoever, though.

StefanKarpinski commented 8 years ago

No license means you can't use it at all so that's unfortunately not helpful.

pkofod commented 8 years ago

Michael Creel, who wrote the GPL'ed code, is currently asking for permission from the authors of the code he based his version on. It would be cool to see substantial improvements to a method in Optim.jl. It's been a while.

Edit: The three authors (Creel, Goffe, Tsionas) are OK with a rewrite/revision being licensed under the MIT license. I think Creel is going to post his code with the MIT license (in julia-users?), and then we'll see if the algorithm is fit for Optim.jl.

multidis commented 8 years ago

Regarding my earlier comment on Generalized Simulated Annealing: if anyone is interested, here is the code: https://github.com/multidis/SATsallis.jl

Sampling from the q-Gaussian distribution uses Distributions.jl and is tested against the original paper (please see the references in the README).
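For readers unfamiliar with the distribution: for 1 < q < 3 a q-Gaussian is, up to a q-dependent scale factor, a Student-t with ν = (3 − q)/(q − 1) degrees of freedom, which yields a simple exact sampler. A Python sketch under that assumption (not the SATsallis.jl code, which builds on Distributions.jl):

```python
import math
import random

def q_gaussian(q, rng):
    """Exact sampler (up to scale) for a q-Gaussian with 1 < q < 3, via
    its equivalence to a Student-t with nu = (3-q)/(q-1) degrees of
    freedom. Hypothetical helper, not the SATsallis.jl implementation."""
    assert 1.0 < q < 3.0
    nu = (3.0 - q) / (q - 1.0)
    z = rng.gauss(0.0, 1.0)
    # chi-square with nu degrees of freedom is Gamma(nu/2, scale=2)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)
    return z / math.sqrt(chi2 / nu)  # Student-t variate

rng = random.Random(42)
samples = [q_gaussian(1.5, rng) for _ in range(50_000)]
```

With q = 1.5 this is a t-distribution with 3 degrees of freedom: symmetric around zero, but with much heavier tails than a Gaussian, which is what lets the generalized annealer make occasional long jumps out of local minima.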