Stefan-Endres / shgo

Simplicial Homology Global Optimization
https://stefan-endres.github.io/shgo
MIT License

shgo stuck searching at x_min, x_max bounds #33

Open JeffR1992 opened 4 years ago

JeffR1992 commented 4 years ago

I'm trying to use shgo to tune the gains of a control algorithm in a black-box fashion, by iteratively running the controller in a simulation (each simulation takes only about 20 seconds to complete) and computing a cost from the simulation results, which is then used as the objective function f for shgo. However, having left shgo running for the past few hours, I'm not seeing any progress. What I have noticed is the following:

1) shgo seems to be stuck switching the values in the optimization parameter vector x between the bounds x_min and x_max that were given to it, and doesn't seem to have searched anywhere in the interior of the bounds.

2) In the first few minutes of running shgo, it iterates quickly through each simulation run; however, as time goes on it slows down substantially, spending minutes between each iteration. What is shgo doing in the minutes between simulation runs, and is there a way to produce verbose output to investigate further?

Given the strange results above, two likely culprits come to mind. The first is that my optimization vector x has dimension 15, and after reading the shgo documentation it looks like anything above 10 dimensions will be a struggle for shgo to solve quickly (due to symmetries in the system dynamics, I can likely get this number below 10, though). The second is that the objective function (i.e. the cost output from the simulation) can return inf (or sys.float_info.max) if the controller makes the system it is controlling unstable, causing its states to blow up. Can shgo deal with objective functions that return inf?

Stefan-Endres commented 4 years ago

A 15 dimensional problem is too high for the current version of shgo (and for deterministic black-box solvers in general). The usual fix would be to switch to Sobol sampling, because the default method triangulates the initial search space, which requires evaluating the objective at all 2^15 vertices lying on the x_min and x_max bounds (this is also why you only see evaluations at x_min and x_max: the initial vertices are exactly the corners of the search hypercube). However, the QHull library has difficulties computing the triangulation at such high dimensions. I would recommend using a stochastic algorithm (differential evolution, PSO) in your case, although with a 20 second objective function evaluation it will take a very long time to run (you have a very difficult optimisation problem in general(!)). Essentially these algorithms are more tractable for your problem, but at the cost of losing the guarantee of finding the true global minimum.
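For reference, here is a rough sketch of what switching to SciPy's differential_evolution could look like. The objective_function and bounds below are placeholders for your simulation cost and gain bounds (not a tuned setup); workers=-1 spreads the 20 second evaluations over all available cores:

import numpy as np
from scipy.optimize import differential_evolution

def objective_function(x):
    # placeholder for the real simulation-based cost (~20 s per run)
    return float(np.sum(x ** 2))

# one (min, max) pair per controller gain; placeholder values
bounds = [(-10.0, 10.0)] * 15

result = differential_evolution(
    objective_function,
    bounds,
    maxiter=50,    # generations; evaluations per generation ~ popsize * len(bounds)
    popsize=15,
    polish=True,   # local refinement of the best candidate at the end
    workers=-1,    # evaluate candidates in parallel on all available cores
)
print(result.x, result.fun)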

There is a newer version of shgo in which a 15 dimensional problem with symmetries is solvable, but merging the new sampling libraries into this library in a stable commit is still a work in progress.

The second is that the objective function (i.e. the cost output from the simulation) can return inf (or sys.float_info.max) if the controller makes the system it is controlling unstable, causing its states to blow up. Can shgo deal with objective functions that return inf?

This is not an issue at all for shgo: it is proven to converge even when there are discontinuities, although a non-continuous optimisation problem of this size requires more computational resources (nan outputs, constraint violations and other errors are converted to np.inf objects anyway). The parallelization work on the algorithm is complete, so this will be possible soon, but again, implementing the new sampling library with shgo in a stable commit is a WIP. Unfortunately I cannot give a reliable timeline on this work, which is why I would recommend trying other algorithms in the meantime.
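As a side note, if you prefer to handle this explicitly rather than rely on shgo's internal error handling, here is a minimal sketch of a wrapper that maps unstable runs to np.inf (run_simulation is a hypothetical stand-in for your 20 second simulation):

import numpy as np

def run_simulation(x):
    # hypothetical stand-in for the 20 second controller simulation
    raise NotImplementedError

def safe_objective(x):
    try:
        cost = run_simulation(x)
    except Exception:
        return np.inf  # simulation crashed or the states blew up
    if not np.isfinite(cost):
        return np.inf  # also catches nan and +/-inf cost outputs
    return cost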

JeffR1992 commented 4 years ago

Thanks for the speedy reply. Having reduced the number of dimensions of the optimization vector x to 9, shgo runs much faster. However, I'm still unable to get an optimal or even sub-optimal solution out; instead I get the following message after shgo finishes:

     fun: 88543901.75077078
 message: 'Failed to find a feasible minimiser point. Lowest sampling point = 88543901.75077078'
    nfev: 513
     nit: 2
   nlfev: 0
   nlhev: 0
   nljev: 0
 success: False
       x: array([0., 0., 0., 0., 0., 0., 0., 0., 0.])

Reading the shgo documentation, I noticed it mentions convergence guarantees to a global optimum in finite time when using the default simplicial sampling method, so I'm quite surprised that a solution can't be found. Is this simply due to reaching a maximum number of shgo iterations, or could it be something more sinister?

Stefan-Endres commented 4 years ago

Thanks for the speedy reply. Having reduced the number of dimensions of the optimization vector x to 9, shgo runs much faster. However, I'm still unable to get an optimal or even sub-optimal solution out; instead I get the following message after shgo finishes:

Yes, this is a normal termination: it is simply due to the specified iterations running out. You can increase the number of iterations with the iters argument; however, refining past the initial triangulation requires a tremendous amount of computational resources.

The first thing I would do is increase the number of sampling points in the first iteration, depending on the amount of computational resources you have available. The code snippet below uses the 'sobol' sampling method, since the current version of the simplicial method will try to refine the entire search space, which is not tractable at 9 dimensions:

from scipy.optimize import shgo

options = {'disp': True}  # print progress information

result = shgo(
    objective_function,
    bounds,
    n=10000,    # number of sampling points in the first iteration
    iters=1,
    sampling_method='sobol',
    options=options,
)

Reading the shgo documentation, I noticed it mentions convergence guarantees to a global optimum in finite time when using the default simplicial sampling method, so I'm quite surprised that a solution can't be found. Is this simply due to reaching a maximum number of shgo iterations, or could it be something more sinister?

To understand why these are the default settings, note that black box optimisation problems have no stopping criteria by default (not even theoretical ones); without more information, a black box optimisation would always run forever in search of a better point (in the literature's toy problems the lowest objective function value is known). In practical problems the hyperparameters are usually experimented with (the n number of sampling points and, more commonly, iters), similar to how hyperparameter studies are run for stochastic algorithms (such as differential evolution and particle swarm).

Around 50000-300000 sampling points is ideal for multimodal 9D problems; however, since parallelization is not implemented yet, the limiting factor is the time you have to run the problem, which is why I recommend trying 10000 first. The algorithm should run for approximately 4-6 days at 20 seconds per objective function evaluation (the 10000 sampling evaluations alone take roughly 2.3 days, and the local refinement step takes additional time to search the locally convex sub-domains for the minima). If the number of local minima in the result is low, it is likely that you have found the global minimum; if not, the problem is very multimodal and more sampling is recommended.
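On that last point: the result object returned by shgo lists every local minimum it found in res.xl, with the corresponding objective values in res.funl, so you can judge the multimodality directly. A sketch, reusing the placeholder objective_function and bounds from the snippet above:

from scipy.optimize import shgo

res = shgo(objective_function, bounds,
           n=10000, iters=1, sampling_method='sobol',
           options={'disp': True})

print('best point:', res.x, res.fun)
print('local minima found:', len(res.xl))
# few local minima  -> the global minimum was likely found
# many local minima -> the problem is very multimodal; increase n and rerun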

In addition, if you are working with, for example, error functions where the objective function is known to be bounded below by zero, you can add a minimum objective function value as follows:

from scipy import optimize

options = {'f_min': 1e-5,  # replace 1e-5 (or 0.0) with an error value that is "good enough" for your application
           'disp': True,
           'minimize_every_iter': True}

result = optimize.shgo(
    objective_function,
    bounds,
    # callback=my_progress_callback,  # optional: called as callback(xk) after each iteration
    sampling_method='sobol',
    options=options,
)

This should run indefinitely until a good enough point is found.

Note that in most black box literature studies a tolerance of around 2% is also added to the known minima. In most applications the global minimum of a least squares error function is not zero, so this is not an overly helpful criterion.

The other main intended stopping criterion for shgo is minhgrd, which essentially tracks the local minima and stops the algorithm when no new ones are found (as described in the paper, this is how the algorithm computes the homologies of the hyper-surface and stops when its understanding of the geometry of the problem is confident enough to terminate); however, this criterion is still highly experimental.
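If you want to experiment with it anyway, here is a sketch of enabling it through the options dictionary (the exact values are placeholders, given how experimental the criterion is):

from scipy.optimize import shgo

options = {'minhgrd': 1,               # terminate once an iteration adds no new local minima
           'minimize_every_iter': True,
           'disp': True}

res = shgo(objective_function, bounds,
           iters=20,                   # safety cap in case the criterion never triggers
           sampling_method='sobol',
           options=options)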

Both documentation and adequate beta testing are currently still lacking in this project.

Finally, some additional considerations: