Thanks for this, it appears to be nearly as effective as the Bayesian optimization method I was using for my problem, yet with much less overhead. I had a few "theoretical" questions before I start trying to look deeper and expand on this:
Do you see any issue with limiting some parameters to integer values?
You say: "Simple does possess a hard requirement of needing to sample dim+1 corner points before optimization can proceed,"
-- To do Bayesian optimization you could warm-start using a small random search and/or hand-picked values. Would it make sense to incorporate a step like this into Simple at some point of the process? If so, when? What about other schemes to avoid local minima, like checking a random parameter set every n steps?
Rounding off some of the continuous design variables shouldn't cause too much of an issue, though if one of your discretized variables only has a small range, such as 0 to 3, you might start to see some strange sample choices. This is because the exploration-exploitation tradeoff calculation is based on the amount of enclosed content within each simplex. By rounding off values on small ranges you make the amount of enclosed content a less accurate representation of how much that region of space has already been explored. For larger ranges, the discretization error will simply be handled as noise.
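One way to do this rounding without touching the optimizer itself is to wrap the objective function so that selected dimensions are snapped to integers before evaluation. This is a minimal sketch, not part of Simple's API; the function and parameter names here are hypothetical:

```python
def make_discrete(objective, integer_dims):
    """Wrap a continuous-domain objective so that the dimensions listed in
    integer_dims (a set of indices) are rounded to the nearest integer
    before the underlying objective is evaluated. The optimizer still
    searches a continuous space; only the evaluation is discretized."""
    def wrapped(params):
        snapped = [round(p) if i in integer_dims else p
                   for i, p in enumerate(params)]
        return objective(snapped)
    return wrapped

# Example: dimension 0 is integer-valued, dimension 1 stays continuous.
f = lambda p: p[0] + p[1]
g = make_discrete(f, {0})
print(g([1.6, 0.25]))  # evaluates f([2, 0.25]) -> 2.25
```

As noted above, this works best when the rounded dimensions span a large range, so the rounding error behaves like evaluation noise rather than distorting the enclosed-content estimate.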
While I may be wrong, as far as I am aware the main reason why Bayesian optimization usually begins with random sampling is to help improve conditioning, rather than for reasons of optimization itself. As far as incorporating specified values and differently shaped domains goes, this should be possible by computing the non-overlapping triangulation of these points at startup. I left this feature out of the first proof-of-concept release for Simple because finding Delaunay triangulations scales very poorly with high dimensionality.
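To sketch the startup-triangulation idea: SciPy's `scipy.spatial.Delaunay` can compute a non-overlapping triangulation of hand-picked start points, which could in principle seed a simplex-based optimizer's initial partition. This is an illustration only, not Simple's implementation; the specific points are made up, and in high dimensions the simplex count explodes, which is the scaling problem mentioned above:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical hand-picked 2-D start points: the four corners of the
# unit square plus one interior point the user already believes is good.
points = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
    [0.4, 0.6],
])

# Delaunay returns a set of non-overlapping simplices covering the
# convex hull of the points; each row of .simplices lists the vertex
# indices of one simplex.
tri = Delaunay(points)
print(len(tri.simplices))  # the interior point splits the square into 4 triangles
```

Each resulting simplex could then carry its own enclosed-content estimate, exactly as Simple's ordinary corner-sampled simplices do.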
Simple is guaranteed to converge to the global optimum given enough samples. However, if the algorithm is spending too much time pursuing local optima, you can inform Simple to alter its search strategy by giving its exploration preference hyperparameter a slightly higher value. This will cause the algorithm to "hedge its bets" more, and place a greater emphasis on global exploration. It's possible to monotonically increase the value of this hyperparameter while optimization is already in progress without breaking anything, and future versions will support this.
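Since the value only ever needs to increase, one way to picture the planned mid-run adjustment is a monotone schedule that ramps the exploration preference up over the course of optimization. The function below is a hypothetical sketch, not part of Simple's current API, and the start, end, and ramp values are made-up placeholders:

```python
def exploration_schedule(step, start=0.15, end=0.50, ramp_steps=200):
    """Return a monotonically non-decreasing exploration preference:
    linearly ramp from `start` to `end` over `ramp_steps` optimization
    steps, then hold at `end`. Because the value never decreases,
    regions explored under an earlier, greedier setting are never
    re-rated as more attractive than they were."""
    t = min(step / ramp_steps, 1.0)
    return start + t * (end - start)

# Example: feed the scheduled value to the optimizer each iteration.
for step in range(0, 300, 100):
    print(step, exploration_schedule(step))
```

A schedule like this would shift the algorithm toward hedging its bets as the run progresses, spending early samples exploiting promising simplices and later samples on global coverage.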