experimental-design / bofire

Experimental design and (multi-objective) Bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License

DoE with NChooseK #341

Open jduerholt opened 7 months ago

jduerholt commented 7 months ago

Hi @Osburg,

I discovered strange behavior when setting up the following DoE including NChooseK constraints:

from bofire.data_models.features.api import ContinuousInput
from bofire.data_models.constraints.api import LinearEqualityConstraint, LinearInequalityConstraint, NChooseKConstraint
from bofire.data_models.domain.api import Domain, Inputs, Constraints
from bofire.utils.doe import get_confounding_matrix
from bofire.data_models.strategies.api import PolytopeSampler
import bofire.strategies.api as strategies
from bofire.strategies.doe.design import find_local_max_ipopt
import matplotlib.pyplot as plt

domain = Domain(
    inputs=Inputs(
        features=[
            ContinuousInput(key="i_1", bounds=(0, 55)),
            ContinuousInput(key="i_2", bounds=(0, 55)),
            ContinuousInput(key="c", bounds=(15, 55)),
            ContinuousInput(key="a", bounds=(5, 40)),
            ContinuousInput(key="p_1", bounds=(0, 5)),
            ContinuousInput(key="p_2", bounds=(0, 5)),
        ]
    ),
    constraints=Constraints(
        constraints=[
            LinearEqualityConstraint(features=["i_1", "i_2", "c", "a", "p_1", "p_2"], coefficients=[1, 1, 1, 1, 1, 1], rhs=100),
            LinearInequalityConstraint(features=["i_1", "i_2"], coefficients=[-1, -1], rhs=-15),
            LinearInequalityConstraint(features=["p_1", "p_2"], coefficients=[-1, -1], rhs=-0.5),
            LinearInequalityConstraint(features=["c", "i_1", "i_2"], coefficients=[1, 1, 1], rhs=92),
            NChooseKConstraint(features=["i_1", "i_2"], min_count=0, max_count=1, none_also_valid=False),
            NChooseKConstraint(features=["p_1", "p_2"], min_count=0, max_count=1, none_also_valid=False),
        ]
    ),
)

candidates = find_local_max_ipopt(domain, "linear")  # optionally: sampling=sampling

fig, axes = plt.subplots(ncols=4, figsize=(12, 4))
for i, combo in enumerate([["i_1", "p_1"],["i_2", "p_2"], ["i_1", "p_2"], ["i_2", "p_1"]]):
    axes[i].scatter(candidates[combo[0]], candidates[combo[1]])
    axes[i].set_xlabel(combo[0])
    axes[i].set_ylabel(combo[1])
plt.show()

What we have here is a mixture design with two groups of two ingredients (the i group with i_1 and i_2, and the p group with p_1 and p_2). Out of each group, exactly one ingredient always has to be used. For this reason, I set the lower bounds of these features to zero and introduced, per group, a linear inequality constraint to enforce the proper lower bound, together with an NChooseK constraint.
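As a quick sanity check of that encoding, one can count the active (nonzero) group members per candidate row; this is plain pandas on the candidates DataFrame from above, with 1e-6 as an arbitrarily chosen numerical tolerance:

import pandas as pd

# number of active (nonzero) members per group for each candidate row
active_i = (candidates[["i_1", "i_2"]] > 1e-6).sum(axis=1)
active_p = (candidates[["p_1", "p_2"]] > 1e-6).sum(axis=1)
print(pd.crosstab(active_i, active_p))  # should show at most one active member per group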

If I now plot the results, I get, for example, this:

image

One can see, for example, that the combination of high i_1 and high p_1 is never tried, which is really strange, because it introduces an artificial correlation between these degrees of freedom.
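A simple way to quantify that artificial correlation, again using only pandas on the candidates from above:

# pairwise correlations between the group members; large off-diagonal entries
# between the i and p features indicate the artificial coupling described above
print(candidates[["i_1", "i_2", "p_1", "p_2"]].corr().round(2))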

In the following, I set up the same design but without the subgroups:

domain = Domain(
    inputs=Inputs(
        features=[
            ContinuousInput(key="i_1", bounds=(15, 55)),
            ContinuousInput(key="c", bounds=(15, 55)),
            ContinuousInput(key="a", bounds=(5, 40)),
            ContinuousInput(key="p_1", bounds=(0.5, 5)),
        ]
    ),
    constraints=Constraints(
        constraints=[
            LinearEqualityConstraint(features=["i_1", "c", "a", "p_1"], coefficients=[1, 1, 1, 1], rhs=100),
            LinearInequalityConstraint(features=["c", "i_1"], coefficients=[1, 1], rhs=92),
        ]
    ),
)

candidates = find_local_max_ipopt(domain, "linear")  # optionally: sampling=sampling

fig, ax = plt.subplots()
ax.scatter(candidates["i_1"], candidates["p_1"])
plt.show()

Here I get the following:

image

One sees that without the NChooseK constraints, high values of both features are combined with each other.

Do you have any idea what causes this strange behavior?

One idea I had was that it is affected by how the starting points for ipopt are generated: currently, we sample initial points using the PolytopeSampler, which already returns candidates that obey the NChooseK constraints. Afterwards, the DoE module puts its bounds randomly on top, without taking into account at which positions the constraints are already fulfilled in the starting points. I think we should change this. What do you think? However, providing initial samples by hand that do not obey the NChooseK constraints does not help either.
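For reference, a sketch of the flow described above (assuming the strategy mapping and ask() follow bofire's usual strategy API, and that the sampling= keyword of find_local_max_ipopt works as hinted at by the commented-out code):

# generate starting points that already obey all constraints, incl. NChooseK
sampler = strategies.map(PolytopeSampler(domain=domain))
sampling = sampler.ask(10)

# hand them to the DoE optimizer; the module then fixes features to zero on top,
# independently of which features are already zero in these starting points
candidates = find_local_max_ipopt(domain, "linear", sampling=sampling)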

Would be really happy to get your input on this!

cc @dlinzner-bcs

jduerholt commented 7 months ago

Ok, I went deeper into the rabbit hole and I think I found the problem. Something is wrong with nchoosek_constraints_as_bounds, which one can see by executing the following:

from bofire.strategies.doe.utils import nchoosek_constraints_as_bounds

bounds = nchoosek_constraints_as_bounds(domain, n_experiments=10)

# bounds holds one (lower, upper) tuple per feature and experiment; a feature
# fixed to zero by an NChooseK constraint gets the bounds (0.0, 0.0)
for j in range(10):
    for i, k in enumerate(domain.inputs.get_keys()):
        b = bounds[j * 6 + i]  # 6 inputs per experiment
        if k in ["i_1", "i_2", "p_1", "p_2"]:
            if b != (0.0, 0.0):
                print(k, b)
    print("------")

You will either get the combinations i_2, p_2 / i_1, p_1 or i_1, p_2 / i_2, p_1, but you will never get all four possible options. The question is why ...

Osburg commented 7 months ago

Hi @jduerholt, I had a look into it and I think you are right: the combinations of variables to be set to zero are initially enumerated in a fixed order. If the number of experiments exceeds the number of possible combinations, the same order of combinations just repeats without reshuffling. Since both NChooseK constraints have the same number of possible combinations, the same set of bounds is repeated over and over again. Reshuffling the combinations after each n_combinations fixes this. I will open a PR in a few minutes. Thanks a lot for pointing out this bug!
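For illustration, a minimal standalone sketch of the reshuffling idea (illustrative only, not the actual bofire implementation):

import itertools
import random

def fixed_zero_combinations(features, max_count, n_experiments):
    # all ways to pick which features get fixed to zero (len(features) - max_count many)
    combos = list(itertools.combinations(features, len(features) - max_count))
    out = []
    while len(out) < n_experiments:
        random.shuffle(combos)  # reshuffle after each full cycle through the combinations
        out.extend(combos)
    return out[:n_experiments]

# without the shuffle, both two-feature groups would cycle in lockstep and always
# produce the same pairings; with it, all four i/p combinations can occur
print(fixed_zero_combinations(["i_1", "i_2"], max_count=1, n_experiments=6))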

Cheers, Aaron

Osburg commented 7 months ago

After the proposed changes, running your code from above gives me something like this:

image

I think this is more like what it should look like, right?

jduerholt commented 7 months ago

Thank you very much @Osburg, I just merged it in.

But we should rethink it in general. What happens currently?

We use the PolytopeSampler to generate a set of starting points. These starting points already obey the NChooseK constraints and all other assigned linear constraints. In the next step, we randomly fix features to zero for the ipopt optimizer, but these features differ from the ones that are already zero due to the PolytopeSampler. From my perspective, we should fix for ipopt exactly those features that are already zero and are part of an NChooseK constraint.
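For illustration, a minimal sketch of that idea (a hypothetical helper, not an existing bofire function; it assumes ContinuousInput exposes its bounds as a tuple):

def bounds_from_sampling(domain, sampling, tol=1e-8):
    # collect all features that occur in some NChooseK constraint
    nck_features = set()
    for c in domain.constraints.get(NChooseKConstraint).constraints:
        nck_features.update(c.features)
    bounds = []
    for _, row in sampling.iterrows():
        for key in domain.inputs.get_keys():
            if key in nck_features and abs(row[key]) < tol:
                bounds.append((0.0, 0.0))  # keep the already-zero feature fixed at zero
            else:
                bounds.append(tuple(domain.inputs.get_by_key(key).bounds))
    return bounds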

What do you think?

jduerholt commented 7 months ago

The other option is to ignore the NChooseK constraints in the initial sampling; I do not know how that affects the performance of ipopt. For SLSQP, it is good to have all the constraints fulfilled at the starting points ... But what we do currently is heavily inconsistent in my opinion.

Osburg commented 7 months ago

In principle, I agree that the sampling and the NChooseK constraint handling in doe do not quite harmonize. But I am not sure that fixing the nonzero variables based on the initial guess actually is an improvement. I guess we cannot make sure that the nonzero variables of the initial guess follow a uniform (or some other desirable) distribution. Couldn't it happen that - in the most extreme case - all points in the initial set of experiments have the same nonzero variables? We would then be stuck with these for the entire optimization process. The advantage of the current approach is that we make sure that all possible combinations of nonzero variables are populated somewhat uniformly (tbh, I never even tried to do the math to check whether a uniform distribution of the nonzero variables is optimal, but intuitively it seems like a good idea). What do you think about this point? Do you see a way to avoid this problem?

jduerholt commented 7 months ago

Hi @Osburg,

Hmm, I have the impression that the way I generate the samples that obey the NChooseK constraints follows the same distribution that you use for assigning the fixed zeros:

https://github.com/experimental-design/bofire/blob/64f624082263a606fa226e3b494fc07c10b7d603/bofire/strategies/samplers/sampler.py#L55

Maybe you can just test it to see how it actually behaves; the edge case you described above should not be possible. In general, I prefer to initialize an optimizer with samples that already fulfill the constraints (but this is just a gut feeling).
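One quick way to test it empirically (reusing the strategy mapping from above; 1e-6 is again an arbitrary tolerance):

# how often is each group member active across many sampled starting points?
samples = strategies.map(PolytopeSampler(domain=domain)).ask(100)
print((samples[["i_1", "i_2", "p_1", "p_2"]] > 1e-6).mean())  # activation frequencies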

After you have checked it out (only if you have time), we could then discuss the path forward. Would that be okay for you?

Best,

Johannes