scikit-optimize / scikit-optimize

Sequential model-based optimization with a `scipy.optimize` interface
https://scikit-optimize.github.io
BSD 3-Clause "New" or "Revised" License

Do a speed profile #413

Open MechCoder opened 7 years ago

MechCoder commented 7 years ago

We should do a speed profile to identify the time taken in each part, so we can see whether some obvious parts can be sped up.

yngtodd commented 7 years ago

I was interested in this too when looking at gp_minimize for hyperparameter optimization.

Here are some first steps using rkern's line_profiler.

In that directory you will find
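line_profiler gives per-line timings; as a rough, dependency-free sketch of the same workflow, the stdlib cProfile shows per-function time. The loop below is a toy stand-in for gp_minimize's ask/tell cycle, not skopt code:

```python
# Rough stdlib sketch of the profiling workflow. cProfile reports
# per-function (not per-line) timings; `tell_like_step` is a toy
# stand-in for the expensive model-refit step inside `tell`.
import cProfile
import io
import pstats
import random

def tell_like_step(n):
    # stand-in for the costly work done on each `tell`
    return sum(random.random() for _ in range(n))

def toy_optimize(n_calls=20):
    # stand-in for gp_minimize's main loop
    for _ in range(n_calls):
        tell_like_step(10000)

profiler = cProfile.Profile()
profiler.enable()
toy_optimize()
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Swapping `toy_optimize` for a real `gp_minimize` call (and line_profiler for cProfile) gives the per-line breakdown linked above.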

MechCoder commented 7 years ago

Thanks for your benchmarks!

It seems odd that this line (https://github.com/yngtodd/skopt_notes/blob/master/speed_test/profile_results.txt#L754) takes 23% of the time, i.e. 51 seconds, in the `tell` method. Maybe that can be reduced?

betatim commented 7 years ago

What does _gaussian_acquisition use X for? Maybe we can use a linspace instead (which might be quicker). Then we would only need to sample such a large number of points when using "sampling" as the optimizer for the acquisition function. For BFGS we can sample only n_restarts_optimizer points, which is usually much smaller than n_points.
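To make the size difference concrete, here is an illustrative sketch (the names `n_points`, `n_restarts_optimizer` mirror skopt's parameters, but the acquisition function below is a stand-in, not the real GP-based one):

```python
# Sketch of the cost gap: with "sampling" the acquisition is evaluated
# on a large random candidate grid (n_points), while "lbfgs" only needs
# a handful of starting points (n_restarts_optimizer).
import numpy as np

rng = np.random.default_rng(0)
n_points = 10000          # size of the random candidate grid
n_restarts_optimizer = 5  # number of L-BFGS restarts

def acquisition(X):
    # stand-in for the real (GP-based) acquisition function
    return np.sin(X).sum(axis=1)

# "sampling" optimizer: score every point on the full random grid
X_grid = rng.uniform(-2, 2, size=(n_points, 1))
full_scores = acquisition(X_grid)

# "lbfgs" optimizer: in principle a much smaller sample would suffice
X_starts = rng.uniform(-2, 2, size=(n_restarts_optimizer, 1))
start_scores = acquisition(X_starts)

print(full_scores.shape, start_scores.shape)
```

The 10000-vs-5 gap is why skipping the full-grid evaluation when BFGS is used could matter.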

MechCoder commented 7 years ago

The starting points for lbfgs are taken from the rows of X with the optimal acquisition-function values over the random grid.

Apart from that, I'm unsure whether a uniform grid is better than a random grid.
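The selection described above can be sketched as a top-k pick over the grid's acquisition values (illustrative only; variable names are not skopt's internals):

```python
# Illustrative sketch: the n_restarts_optimizer grid points with the
# lowest (best) acquisition values become the L-BFGS starting points.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(10000, 1))  # random candidate grid
acq_values = np.sin(X).ravel()           # stand-in acquisition values

n_restarts_optimizer = 5
best = np.argsort(acq_values)[:n_restarts_optimizer]
x0_candidates = X[best]                  # starting points for L-BFGS
```

This is the coupling betatim is pointing at: picking good starts this way requires scoring the whole grid first.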

betatim commented 7 years ago

The starting points for BFGS are taken from X, but there are usually far fewer starting points than points in X. So we could avoid taking them from X and instead generate only the much smaller number needed.

What I was wondering is why X is passed to _gaussian_acquisition.

To illustrate what I mean: #453
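A rough sketch of that direction (illustrative only, not the content of the PR referenced above): draw the few starts directly and run L-BFGS-B from each, skipping the large grid entirely. The acquisition function here is a stand-in; `scipy.optimize.minimize` with `method="L-BFGS-B"` is scipy's real API.

```python
# Sketch: sample only n_restarts_optimizer starting points directly,
# then run L-BFGS-B from each, instead of ranking a 10000-point grid.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

def acquisition(x):
    # stand-in for the real acquisition function
    return float(np.sin(x).sum())

n_restarts_optimizer = 5
starts = rng.uniform(-2, 2, size=(n_restarts_optimizer, 1))

results = [minimize(acquisition, x0, method="L-BFGS-B", bounds=[(-2, 2)])
           for x0 in starts]
best = min(results, key=lambda r: r.fun)  # best local optimum found
```

The trade-off is that random starts may be worse than grid-ranked starts, so more restarts could be needed for the same quality.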