andim / noisyopt

Python library for optimizing noisy functions.
http://noisyopt.readthedocs.io/
MIT License

JOSS review: how to choose deltatol and niter? #9

Closed nirum closed 7 years ago

nirum commented 7 years ago

I was playing around with the stochastic quadratic function and noticed that if deltatol is smaller than the variance of the added noise in the objective, the algorithm never converges, since even at the optimum the function values fluctuate too much. In general we don't know the variance of our objective, so how should one go about choosing the deltatol parameter (or, analogously, the niter parameter for minimizeSPSA)? If it is too big, the algorithm exits prematurely; if it is too small, it never returns. I don't know if you have tips on how to get around this problem, but adding some discussion of the importance of deltatol to the documentation might be helpful.

andim commented 7 years ago

You are right that the algorithm does not converge if the function differences at distances of deltatol are smaller than the stochasticity. This problem is resolved if errorcontrol=True, but optimization still becomes exceedingly expensive if the variation of the function at the target accuracy is much smaller than the stochasticity. I added a comment about this to the docstring, see 0eba0e0a03c671b48d10d2015dbcbc134fab3878.

deltatol is the target pattern size, i.e. it determines how precisely the location of the optimum is determined. It should be chosen as large as the accuracy you need allows. I have added a description of the termination criteria to the docstring for more clarity, see 9599387b55d5067306d903f4dec37f8b25054284.
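
To get a feeling for the cost mentioned above, here is a rough sketch that does not use noisyopt. It estimates the noise level from repeated evaluations at a fixed point and then asks how many evaluations per point are needed to resolve the change of the function over a step of size deltatol. The test objective `noisy_quadratic`, the helpers `estimate_noise_std` and `evaluations_to_resolve`, and the factor `z` are all made up for illustration, and the estimate is a back-of-the-envelope calculation, not the criterion the library actually uses.

```python
import math
import random
import statistics

def noisy_quadratic(x, sigma=0.1, rng=random):
    """Hypothetical test objective: x**2 plus Gaussian noise of std sigma."""
    return x * x + rng.gauss(0.0, sigma)

def estimate_noise_std(func, x, nsamples=200):
    """Estimate the noise standard deviation from repeated evaluations at x."""
    values = [func(x) for _ in range(nsamples)]
    return statistics.stdev(values)

def evaluations_to_resolve(signal, noise_std, z=2.0):
    """Rough number of evaluations per point so that `signal` is z standard
    errors of the difference between two sample means."""
    # Var(mean_a - mean_b) = 2 * noise_std**2 / n, solved for n
    return max(1, math.ceil(2.0 * (z * noise_std / signal) ** 2))

rng = random.Random(0)  # seeded so the run is reproducible
sigma_hat = estimate_noise_std(lambda x: noisy_quadratic(x, rng=rng), x=0.0)

for deltatol in (1.0, 0.1, 0.01):
    # Assumption for this toy: near the optimum x**2 changes by deltatol**2
    # over one step of size deltatol.
    signal = deltatol ** 2
    print(deltatol, evaluations_to_resolve(signal, sigma_hat))
```

For this toy objective the required number of evaluations grows like deltatol**-4, which is one way to see why deltatol should not be chosen smaller than necessary.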

For the SPSA algorithm there is some discussion of suitable parameter choices in the literature (see http://www.jhuapl.edu/SPSA/PDF-SPSA/Spall_Implementation_of_the_Simultaneous.PDF), which I have now referenced in the docstring, see d3ce9bc12cf5ccaa2d906e249d3fc26be70fda42.
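
For readers who have not seen SPSA, the following is a minimal, self-contained sketch of the basic iteration with decaying gain sequences. It is not the noisyopt implementation, and `spsa_minimize` and its argument names are chosen here for illustration only. The defaults alpha=0.602 and gamma=0.101, a stability constant A of about 10% of the iterations, and a value of c close to the noise standard deviation are the rules of thumb commonly quoted from the reference above; treat them as starting points rather than guarantees.

```python
import random

def spsa_minimize(func, x0, niter=500, a=1.0, c=0.1,
                  alpha=0.602, gamma=0.101, A=None, seed=0):
    """Minimal SPSA sketch for illustration (not the noisyopt code)."""
    rng = random.Random(seed)
    if A is None:
        A = 0.1 * niter  # stability constant, about 10% of the iterations
    x = list(x0)
    for k in range(niter):
        ak = a / (k + 1 + A) ** alpha  # step-size gain, decays with k
        ck = c / (k + 1) ** gamma      # perturbation size, decays slowly
        # Perturb all coordinates at once with random signs
        delta = [rng.choice((-1.0, 1.0)) for _ in x]
        xplus = [xi + ck * di for xi, di in zip(x, delta)]
        xminus = [xi - ck * di for xi, di in zip(x, delta)]
        # Two function evaluations give a gradient estimate in every coordinate
        diff = func(xplus) - func(xminus)
        x = [xi - ak * diff / (2.0 * ck * di) for xi, di in zip(x, delta)]
    return x

noise = random.Random(1)  # seeded noise source for a reproducible run

def noisy_sphere(x):
    """Hypothetical objective: sum of squares plus Gaussian noise of std 0.1."""
    return sum(xi * xi for xi in x) + noise.gauss(0.0, 0.1)

xopt = spsa_minimize(noisy_sphere, [1.0, -0.5])
print(xopt)
```

With a fixed niter the loop always returns, but how close the result is to the optimum depends on the gains and on the noise level, so niter plays a role similar to deltatol in trading accuracy against cost.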

nirum commented 7 years ago

Ah, interesting. Thanks for the references!