SandorAlbert opened this issue 4 weeks ago
Hi @SandorAlbert,
Thank you very much for bringing this up. It is a very interesting observation. `expected_minimum` works by drawing n random samples in the search space. From each of these n random starting points, a `minimize` call (from `scipy.optimize`) searches for a local minimum by repeatedly sampling the surrogate model. The best of these n local minima is presented to the user. Is this the optimal way? No, it carries no guarantee of finding the global optimum (min or max). But it is the pragmatic approach that avoids super-expensive searches of the surrogate model (which is, after all, just a model).
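To make that concrete, here is a minimal, self-contained sketch of the multi-start idea, with a toy Gaussian-process surrogate standing in for the real one. Everything here (the data, the bounds, the helper names) is illustrative and not the library's actual internals:

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

# Toy 1-D data with a wide, flat valley around x = 0.
X = rng.uniform(-5, 5, size=(30, 1))
y = np.tanh(np.abs(X.ravel())) - 1.0
surrogate = GaussianProcessRegressor().fit(X, y)

def surrogate_mean(x):
    # Mean prediction of the surrogate at a single point.
    return surrogate.predict(np.atleast_2d(x))[0]

# n random starting points, one cheap local search per start.
n_random_starts = 20
starts = rng.uniform(-5, 5, size=(n_random_starts, 1))
local_minima = [minimize(surrogate_mean, x0, bounds=[(-5, 5)]) for x0 in starts]

# The best of the n local minima is what gets reported to the user.
best = min(local_minima, key=lambda r: r.fun)
print(best.x, best.fun)
```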
So back to your issue. My intuition suggests that, in a very flat area of the X-to-y mapping, small differences in the starting points of these n random searches might have big implications. For sure, I'll go back and look further into the case, but for starters, I would be curious about the behavior if you increase n. This is done with the keyword `n_random_starts` in `expected_minimum()` and with `expected_minimum_samples` in the plotting call (I know it's a bit clunky). Perhaps even try to increase it dramatically. The current default value (20) is chosen to balance speed and precision for functions with more of a valley-shaped minimum, as opposed to your case, which looks like a large, flat, river-bed type of minimum.
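For reference, the two keywords would be used roughly like this (a sketch only: the import paths are my assumption, so adjust them to your installation, and `result` is the result object from your finished run):

```python
from ProcessOptimizer import expected_minimum      # assumed import path
from ProcessOptimizer.plots import plot_objective  # assumed import path

# More random starts when querying the surrogate directly ...
x_min, y_min = expected_minimum(result, n_random_starts=200, random_state=42)

# ... and when the plotting routine locates the minimum for you.
plot_objective(result, expected_minimum_samples=200)
```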
Additionally (but not as an explanation of your issue), remember that the default visualizations show the model uncertainty only. For fun, you can try to add the observational noise by calling `add_observational_noise()` on the optimizer object before fetching the result object and plotting. (You can remove the observational noise from the optimizer object again with `remove_observational_noise()`.)
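Sketched out, that workflow would look something like this (the two noise methods are as named above; `opt` being your optimizer instance, the `get_result()` accessor, and the import path are assumptions on my part):

```python
from ProcessOptimizer.plots import plot_objective  # assumed import path

opt.add_observational_noise()     # fold observational noise into the model
noisy_result = opt.get_result()   # assumed accessor for the result object
plot_objective(noisy_result)      # plot now includes observational noise

opt.remove_observational_noise()  # back to showing model uncertainty only
```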
Very many words. Sorry.
tl;dr: If you have a large, flat minimum, then sampling behaves more randomly, and tiny differences in the y-value end up being reported as "important". A very large area of your space will return y-values very close to 0, and I do not believe the model can distinguish between them meaningfully. Increase the number of random starts in the call to `expected_minimum`.
Is it possible to get the expected minimum numerically from `.plot_objective()` rather than `.expected_minimum()`? The following code gives me a different expected minimum than what the graph shows:

```python
print(result.x)
```

(`result` is a SciPy `OptimizeResult` object.) This also gives me something a little bit different, but close; in this case it gave me `[29, 30000.0]`.
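To be explicit, the model-based number I am comparing against comes from a call like this (sketch; the import path is assumed):

```python
from ProcessOptimizer import expected_minimum  # assumed import path

# Model-based minimum: close to result.x above, but not identical.
x_min, y_min = expected_minimum(result)
print(x_min, y_min)
```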
This is another problem, but I get entirely different results and plots when re-executing `.plot_objective()` and `.expected_minimum()`. Random seeds, or setting `random_state` in `.expected_minimum()` to something other than `None`, will surely help, but I didn't expect the results to vary this strongly. Maybe I have to tune some hyperparameters and there's some problem with the optimization.
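For example, pinning the seed like this is what I have in mind (sketch; import path assumed):

```python
from ProcessOptimizer import expected_minimum  # assumed import path

# With a fixed random_state, repeated calls should agree exactly.
first = expected_minimum(result, random_state=0)
second = expected_minimum(result, random_state=0)
assert first == second
```

Thanks in advance!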