Open kalekundert opened 2 years ago
Here's an implementation of the probabilistic bisection algorithm (PBA):
https://gist.github.com/fgregg/54ee54b90b8a4fb938fc9b008ec95bd6
This algorithm was first described by Horstein in 1963 and is the focus of [Waeber2013]. It doesn't seem to be mentioned by [Pasupathy2011]. Note however that PBA (and the implementation linked above) requires that the probability of a correct response from the function (i.e. in terms of being on the right side of the root) is fixed and known. I don't think this information is available to me, though.
Noisyopt contains a simple stochastic bisection algorithm. It basically just evaluates the function a pre-determined number of times at each point, and uses the average to decide which direction to continue bisecting. This doesn't seem to account for uncertainty in a very sophisticated way, e.g. [Waeber2013] building up a posterior distribution of the location of the root.
I spent time today reading about three kinds of algorithms. Here are my takeaways:
I really value knowing how accurate the root is, so I'm most drawn to PBA. But none of these algorithms seem to have implementations available in python. I might be able to request implementations from the authors, but it would probably be best to just wait and see what becomes available over time.
After implementing this project, I realized that I have a root-finding problem and not an optimization problem. Briefly, my goal is for a given quantile of the atom displacement distribution to be equal to a given value. I expect the quantile to increase monotonically as the restraint weight decreases, and to at some point cross the desired value.
Currently I'm using a stochastic optimization algorithm (scikit-optimize). To turn the zero into a minimum, I take the absolute value of the objective function. However, I think this is a bad idea for a few reasons:
I looked into this briefly. This stack overflow post has two relevant references. The first is a review of stochastic root-finding algorithms [Pasupathy2011]. The second is a dissertation (which appears to be published later [Frazier2016]) on one particular algorithm [Waeber2013]. I don't think I'm going to implement either of these algorithms now, because I already have a reasonable result and I need to move on, but in the future I think it would be good to use an algorithm designed for the specific task I have.
Note also that I have a 1D root-finding problem, which is definitely simpler than higher-dimensional versions of the problem. Ideally, I'd find an algorithm tuned to the 1D case.