Closed leandrobbraga closed 1 year ago
Hi @leandrobbraga,
thanks for your interest in this package and scouring the code! For context, I authored the PR that added the constrained optimization.
The y_max that is being used here is the maximum value of the target function for parameters that fulfil the constraints. Unless your constraint function is non-deterministic (sidenote -- I cannot think of a case where this would be sensible; I don't think a constraint that is fulfilled randomly makes sense) and you have modelled this by using a GP with an additional WhiteKernel to account for noise, the result of constraint.predict for this point should always be p(y_max)=1, since the GP knows that the point fulfils the constraint. In this sense, the constraint fulfilment probability is accounted for through y_max/p(y_max) = y_max.
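A minimal sketch of that claim, using scikit-learn rather than this package's internals (the toy constraint, the point values, and the feasibility threshold are made up for illustration): a GP fitted without a WhiteKernel reproduces its training observations almost exactly, so the posterior probability of constraint fulfilment at an already-observed feasible point is effectively 1.

```python
import numpy as np
from scipy import stats
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy deterministic constraint: feasible where c(x) <= 0.5.
X = np.array([[0.0], [0.5], [1.0], [-1.0], [1.5]])
c = np.sin(3 * X).ravel()  # c(0.0) = 0.0, so X[0] is feasible

# No WhiteKernel: a tiny alpha means the GP treats observations as noise-free.
gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-10)
gp.fit(X, c)

mean, std = gp.predict(X[:1], return_std=True)
# Posterior probability that the constraint holds at the observed point.
p = stats.norm.cdf(0.5, loc=mean, scale=np.maximum(std, 1e-12))
print(p[0])  # effectively 1.0: the GP "knows" this point is feasible
```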
I've thought about it, and I think you are correct: this is not working as it should. By all means, feel free to make a PR and tag me, I will gladly review it.
Thank you for taking the time to reply, I'll prepare a PR soon.
About the first point,

> The y_max that is being used here is the maximum value of the target function for parameters that fulfil the constraints. Unless your constraint function is non-deterministic (sidenote -- I cannot think of a case where this would be sensible; I don't think a constraint that is fulfilled randomly makes sense) and you have modelled this by using a GP with an additional WhiteKernel to account for noise, the result of constraint.predict for this point should always be p(y_max)=1, since the GP knows that the point fulfils the constraint. In this sense, the constraint fulfilment probability is accounted for through y_max/p(y_max) = y_max.
When you look at this snippet:
# Finding argmax of the acquisition function.
suggestion = acq_max(ac=utility_function.utility,
                     gp=self._gp,
                     constraint=self.constraint,
                     y_max=self._space.target.max(),
                     bounds=self._space.bounds,
                     random_state=self._random_state)
The self._space.target here is a simple ndarray containing all the observed y values, feasible and unfeasible alike. My understanding is that the correct behaviour would be to pass the maximum over only the feasible ys, so that the expected improvement is calculated against the best feasible result so far. Otherwise we will try to improve over a y that is not feasible and possibly of much higher value than the feasible ys, penalizing the region of feasible ys. I noticed that this is the case after adding debug prints on this y_max inside the code, and it was indeed an unfeasible y.
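A minimal sketch of what I mean, with made-up arrays standing in for the space's target values and the constraint mask: restrict the maximum to feasible observations before handing it to the acquisition maximizer.

```python
import numpy as np

target = np.array([0.3, 2.5, 1.1, 0.9])        # all observed y values
allowed = np.array([True, False, True, True])  # constraint fulfilment mask

# Max over all points -- what acq_max currently receives.
y_max_all = target.max()  # 2.5, which comes from an infeasible point

# Max over feasible points only -- what EI should try to improve upon.
y_max_feasible = target[allowed].max() if allowed.any() else None

print(y_max_all, y_max_feasible)  # 2.5 1.1
```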
On the other hand we have self.space.max(), which does respect feasibility. There is a section of code specific to the constrained case, which goes over the array of targets, gathering only the ys that fulfil the constraint before picking the highest value.
def max(self):
    """Get maximum target value found and corresponding parameters.

    If there is a constraint present, the maximum value that fulfills the
    constraint is returned."""
    if self._constraint is None:
        try:
            res = {
                'target': self.target.max(),
                'params': dict(
                    zip(self.keys, self.params[self.target.argmax()])
                )
            }
        except ValueError:
            res = {}
        return res
    else:
        allowed = self._constraint.allowed(self._constraint_values)
        if allowed.any():
            # Getting of all points that fulfill the constraints, find the
            # one with the maximum value for the target function.
            sorted = np.argsort(self.target)
            idx = sorted[allowed[sorted]][-1]
            # there must be a better way to do this, right?
            res = {
                'target': self.target[idx],
                'params': dict(
                    zip(self.keys, self.params[idx])
                ),
                'constraint': self._constraint_values[idx]
            }
        else:
            res = {
                'target': None,
                'params': None,
                'constraint': None
            }
        return res
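As an aside on the "there must be a better way" comment in that snippet: a masked argmax gives the same index without the sort. A sketch with made-up data:

```python
import numpy as np

target = np.array([0.3, 2.5, 1.1, 0.9])
allowed = np.array([True, False, True, True])

# Current approach: sort, filter by feasibility, take the last index.
sorted_idx = np.argsort(target)
idx_sort = sorted_idx[allowed[sorted_idx]][-1]

# Equivalent one-liner: infeasible entries are masked out with -inf.
idx_mask = np.argmax(np.where(allowed, target, -np.inf))

print(idx_sort, idx_mask)  # both 2: the best feasible point
```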
Notice that the unconstrained branch uses self.target.max() exactly as in the acq_max call.

Does it make sense? Or am I misunderstanding something?
Wow, good catch! I must've missed that. Sorry for the confusion and thanks for spotting this!
Ok! You can assign this issue to me! I'll author a PR for those issues!
Hello,
I was looking into the constrained optimization code path, noticed a few things, and would like to ask some questions to see whether my understanding is correct.
This is a snippet from the "acq_max" function:
The first thing that I noticed was that the y_max that is passed here is the "unconstrained y_max" and not the "constrained y_max", and from the original paper I can cite this:

The second thing is that there is a "warm up" phase with cheap evaluations. This phase does not consider p_constraint at all, which means those evaluations will get higher y_max values than normal (the same as p_constraint = 1).

Is this a bug? Does this make sense? I'm no expert in the area.
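For reference, the constraint-weighted expected improvement from Gardner et al. (2014) multiplies EI by the fulfilment probability, so the warm-up evaluations should score candidates with the same product as the main optimization, not with bare EI. A hedged sketch (the function name and the numbers are mine, not this package's API):

```python
import numpy as np
from scipy import stats

def constrained_ei(mean, std, y_max_feasible, p_constraint, xi=0.0):
    """EI against the best *feasible* y, scaled by fulfilment probability."""
    std = np.maximum(std, 1e-12)
    z = (mean - y_max_feasible - xi) / std
    ei = (mean - y_max_feasible - xi) * stats.norm.cdf(z) + std * stats.norm.pdf(z)
    return ei * p_constraint

# A point predicted well above y_max but almost surely infeasible scores near 0,
# whereas bare EI would rank it very highly.
val = constrained_ei(mean=3.0, std=0.1, y_max_feasible=1.1, p_constraint=0.01)
print(val)
```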
If it makes sense, I can help with a PR here.
Thank you!