SheffieldML / GPyOpt

Gaussian Process Optimization using GPy
BSD 3-Clause "New" or "Revised" License
928 stars · 261 forks

Optimization with constraints, LCB and CMA #252

Open AnthonyLarroque opened 5 years ago

AnthonyLarroque commented 5 years ago

Dear GpyOpt developers,

I am currently using GPyOpt version 1.2.5 for external function evaluations and would like to compare the influence of the acquisition function, optimizer, and kernel on my results.

However, when I use constraints with the LCB acquisition function, the constraint I imposed (abs(x[:,0]+x[:,1]+x[:,2]) - 0.1) on a 3-dimensional design space (every design parameter within (-0.1, 0.1)) is not respected, whereas it is with EI or MPI, for example. Would you have any idea why?
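For reference, the constraint I use can be written as a plain NumPy function and checked directly against candidate points (the helper names below are just for illustration, they are not part of GPyOpt):

```python
import numpy as np

def constraint(x):
    # GPyOpt convention: a point is feasible where the expression is <= 0
    return np.abs(x[:, 0] + x[:, 1] + x[:, 2]) - 0.1

def is_feasible(x):
    return constraint(x) <= 0

pts = np.array([
    [0.05, 0.02, 0.01],   # sum = 0.08  -> feasible
    [0.10, 0.10, 0.10],   # sum = 0.30  -> infeasible
])
print(is_feasible(pts))   # [ True False]
```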

Also, when I try to use the CMA optimizer with the constraints mentioned above, I get the following error message: ValueError: Initial standard deviation s (sigma0*stds) are larger than the bounded domain size in variable [0 1 2]. Consider using option 'CMA_stds', if the bounded domain sizes differ significantly.

Could you help me with this error as well?

Finally, I would like to take advantage of this message to ask something else: I noticed that at the end of the optimization process, for the EI and MPI acquisition functions, the function evaluation was performed at what I believe to be a point other than the maximum of the acquisition function (as can be seen here: https://github.com/SheffieldML/GPyOpt/issues/207).

So, is this due to the difficulty the optimizer has in finding the optimum of the acquisition function at the end of the process, because of the shape of the EI acquisition function? I read that Bayesian optimization is only guaranteed to converge if the maximum of the acquisition function is found. So, is there a way to be sure that the algorithm found the maximum of the acquisition function in more than 2 dimensions, since we cannot plot the acquisition function in those cases?

Kind regards, Anthony Larroque

apaleyes commented 5 years ago

Hi @AnthonyLarroque

I am afraid I cannot answer your questions fully, but here goes:

  1. For the constraints issue, we would need to see the code showing how you use LCB and EI; it isn't obvious why this is happening.

  2. For the CMA optimizer, this error comes from the cma package. To be fair, hardly anybody uses this optimizer these days, so this code path might be buggy. Please refer to the cma package for more details on this error: https://github.com/CMA-ES/pycma

  3. Again, can you post a picture/example to make things clearer? The issue you linked has a decent explanation. You can also read this answer on SO

AnthonyLarroque commented 5 years ago

Hi @apaleyes

Thank you very much for your answer, and I am sorry for the slow reply; I have been busy lately with some deadlines.

  1. I investigated the problem further, and I can also observe it with external function evaluations on GPyOpt test functions such as the sixhumpcamel in 2D. As you can see in LCB0_0, the constraint that I set ('x[:,1] -.5') is not respected, whereas it is on the next iterations (LCB0_1).

LCB0_0 LCB0_1

And sometimes it is respected during the whole optimization process (as in LCB2_0 and LCB2_1).

LCB2_0 LCB2_1

I am also sending you the program that I use as a .txt file (LCB_sixhumpcamel.txt) so you can have a look. So far, I have not seen this problem with the EI and MPI acquisition functions. LCB_sixhumpcamel.txt
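For reproducibility, the test function and the constraint in question can be checked with NumPy alone (the function names here are just for illustration):

```python
import numpy as np

def sixhumpcamel(x):
    # Standard six-hump camel test function in 2D
    x1, x2 = x[:, 0], x[:, 1]
    return ((4 - 2.1 * x1**2 + x1**4 / 3) * x1**2
            + x1 * x2 + (-4 + 4 * x2**2) * x2**2)

# The constraint from the report: feasible where x[:,1] - 0.5 <= 0
def feasible(x):
    return x[:, 1] - 0.5 <= 0

pts = np.array([[ 0.0898, -0.7126],   # a global minimiser, x2 < 0.5: feasible
                [-0.0898,  0.7126]])  # the mirror minimiser, x2 > 0.5: infeasible
print(np.round(sixhumpcamel(pts), 4))  # both ≈ -1.0316
print(feasible(pts))                   # [ True False]
```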

  2. Alright. Is lbfgs considered a better choice for optimizing an acquisition function than CMA? Can lbfgs be trapped in local optima of the acquisition function?

  3. I am not really talking about what is explained in those issues. You can see what I am talking about, still with the sixhumpcamel example, in the pictures EI0_45 and EI0_46.

EI0_45 EI0_46

In the first picture, the maximum of the EI acquisition function seems to be located at approximately (0.1, -0.75). However, the black dot on the same figure (which, if I am not mistaken, represents the maximum of the acquisition function found by the optimization algorithm?) is located at roughly (0.5, 0.5). Finally, on the next iteration, the point is evaluated around (0.1, -0.4). If the maximum of the acquisition function had been found, all these points would be the same, wouldn't they? That is at least what I observed with the LCB acquisition function, where at the end of the optimization all three points coincided, as in LCB0_45 and LCB0_46.

LCB0_45 LCB0_46

So my questions are: is the optimization algorithm always able to find the optimum of the EI and MPI acquisition functions? If not, is it due to the singular shape of the EI and MPI acquisition functions at the end of the process? Would there be a way, or at least a hint, to check that the optimum of the acquisition function was found when we cannot plot the acquisition function, such as in high dimensions?
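One rough sanity check I can imagine in higher dimensions is to restart a local optimizer from several random points and see whether the restarts agree on the same optimum. This is only a sketch with a toy acquisition surface of my own, not GPyOpt's internal optimizer:

```python
import numpy as np
from scipy.optimize import minimize

# Toy stand-in for an acquisition surface (the real one comes from the GP);
# we minimize the *negative* acquisition, as maximization is usually recast
# this way internally.
def neg_acquisition(x):
    return -np.exp(-np.sum((x - 0.3) ** 2))

bounds = [(-1.0, 1.0)] * 4   # a 4-D domain, where plotting is not possible
rng = np.random.default_rng(0)

# Multi-start L-BFGS-B from random points; if most restarts land on the
# same optimum, a single-start result is probably not a local trap.
results = [minimize(neg_acquisition, rng.uniform(-1, 1, 4),
                    method='L-BFGS-B', bounds=bounds) for _ in range(20)]
best = min(results, key=lambda r: r.fun)
print(np.round(best.x, 3))   # all coordinates ≈ 0.3
```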

Kind regards, Anthony Larroque

AnthonyLarroque commented 5 years ago

Hello,

I investigated error number 2, about the CMA optimizer, a little more.

It seems that the error comes from line 126 of optimizer.py.

Indeed, sigma0 was fixed at 0.6, whereas the parameters of my domain lie within (-0.1, 0.1). According to the docstring of cma.fmin, this parameter should be about 1/4 of the search domain. So for the moment I changed it to np.mean(uB-lB)/4, and it seems to work. However, I am not sure that np.mean is the best choice, especially when the domains of the design parameters do not have the same bounds.
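To illustrate the concern with hypothetical bounds of different sizes (the arrays lB and uB here are made up for the example), a single scalar sigma0 averages the ranges away, whereas the 'CMA_stds' option mentioned in the error message allows a per-dimension scale:

```python
import numpy as np

# Hypothetical lower/upper bounds for a domain with unequal ranges
lB = np.array([-0.1, -0.1, -1.0])
uB = np.array([ 0.1,  0.1,  1.0])

# Scalar sigma0 as in my patch: mean range / 4 (too large for dims 0 and 1)
sigma0 = np.mean(uB - lB) / 4.0

# Per-dimension standard deviations, which could instead be passed to cma
# via the 'CMA_stds' option when the bounded domain sizes differ
cma_stds = (uB - lB) / 4.0

print(np.round(sigma0, 3))    # ≈ 0.2
print(np.round(cma_stds, 3))  # ≈ [0.05 0.05 0.5 ]
```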

Would you have any answers to my other questions?

Kind regards, Anthony Larroque