experimental-design / bofire

Experimental design and (multi-objective) Bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License

Scaling in DoE #343

Closed: jduerholt closed this issue 6 days ago

jduerholt commented 7 months ago

Hi @Osburg,

I continued playing with the DoE module, and one thing caught my attention. I wanted a fully-quadratic model in a D-optimal design and therefore expected, for every degree of freedom, values somewhere in the middle between the lower and upper bound, not only values at the bounds. In this setup, one degree of freedom had a much smaller range between its lower and upper bound than the others; for this one, the design proposed only points at the bounds and nothing in between.

Then I increased the range of the bounds to values comparable to the others; in this case, I also got values close to the center point, as expected and intended. I think the reason for this (to me undesired) behavior is that the degree of freedom with the much smaller range has, for numerical reasons, less impact on the D-value than the other degrees of freedom.

From my point of view, every degree of freedom should have the same impact, so here https://github.com/experimental-design/bofire/blob/64f624082263a606fa226e3b494fc07c10b7d603/bofire/strategies/doe/objective.py#L78 we should scale all degrees of freedom to unit range (or [-1, 1], as is often the convention in classical DoE) before sending them into the model. What do you think? The actual degrees of freedom passed to ipopt would still be the unscaled ones, so that the constraints can still be adhered to.
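What I mean by scaling is just the standard affine map of each degree of freedom onto [-1, 1]; a minimal sketch in plain numpy (not the bofire API, and ipopt would still see the unscaled variables):

```python
# Sketch (plain numpy, not bofire code): affinely map each degree of
# freedom from its [lower, upper] bounds onto [-1, 1] before building
# the model matrix, as is conventional in classical DoE.
import numpy as np

def scale_to_unit_range(X, lower, upper):
    """Map the columns of X from [lower, upper] to [-1, 1]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return 2.0 * (X - lower) / (upper - lower) - 1.0

# two features with very different ranges, e.g. b and c from below
X = np.array([[0.006, 15.0],
              [0.010, 55.0],
              [0.008, 35.0]])
X_scaled = scale_to_unit_range(X, lower=[0.006, 15.0], upper=[0.010, 55.0])
print(X_scaled)  # every column now spans [-1, 1], regardless of the raw range
```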

If you want, I can generate an example for you tomorrow.

Best,

Johannes

cc: @dlinzner-bcs

Osburg commented 7 months ago

Hi @jduerholt :)

I am not sure I completely understand: does this happen for numerical reasons, in the sense that in certain cases a larger/smaller range of some variables leads to computational errors? Then I totally agree with you that we should do something about this. Or is it an analytical problem, in that w.r.t. the D-criterion the undesirable designs are actually optimal? We could still do it then, especially if you say that in practice you would scale your problem anyway. But what would we do in case somebody really wants a D-optimal design for the unscaled problem? We could just add an option for scaling the variable domains and set it to true by default, I guess. What do you think about this? Or is it so common to scale your problem that this option will most likely never be used?

A little bit off-topic and possibly a stupid idea (it is late at night already :D), but if single criteria (like the D-criterion) sometimes lead to designs with undesirable properties: do you think it makes sense to add the possibility of combining multiple criteria? Ideally, one could avoid undesired designs by adding cost function terms that counteract the undesired characteristics. At least this should be easy to implement on top of the existing objectives.
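Something like the following is what I have in mind (a toy sketch with made-up criteria, not the existing bofire Objective API):

```python
# Toy sketch (hypothetical, not the bofire Objective API): combine
# several design criteria into a single cost via a weighted sum, so
# extra terms can counteract undesired design characteristics.
import numpy as np

def combined_objective(criteria, weights):
    """Return a callable evaluating sum_i w_i * f_i(design)."""
    def evaluate(design):
        return sum(w * f(design) for f, w in zip(criteria, weights))
    return evaluate

# two toy criteria on a design matrix X (both to be minimized)
neg_log_d = lambda X: -np.linalg.slogdet(X.T @ X)[1]            # D-criterion part
collapse_penalty = lambda X: -np.log(np.var(X, axis=0) + 1e-12).sum()  # punish near-constant columns

obj = combined_objective([neg_log_d, collapse_penalty], weights=[1.0, 0.1])
X = np.random.default_rng(0).uniform(size=(6, 3))
print(obj(X))
```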

Cheers, Aaron

jduerholt commented 7 months ago

Hi @Osburg,

as a starting point for the discussion, here is an MWE to reproduce the behavior:

import matplotlib.pyplot as plt

from bofire.data_models.constraints.api import (
    LinearEqualityConstraint,
    LinearInequalityConstraint,
    NChooseKConstraint,
)
from bofire.data_models.domain.api import Constraints, Domain, Inputs
from bofire.data_models.features.api import ContinuousInput
from bofire.strategies.doe.design import find_local_max_ipopt
from bofire.strategies.enum import OptimalityCriterionEnum

# domain with a very narrow range for b
domain = Domain(
    inputs=Inputs(
        features=[
            ContinuousInput(key="i_1", bounds=(0, 55)),
            ContinuousInput(key="i_2", bounds=(0, 55)),
            ContinuousInput(key="c", bounds=(15, 55)),
            ContinuousInput(key="a", bounds=(5, 40)),
            ContinuousInput(key="p_1", bounds=(0, 5)),
            ContinuousInput(key="p_2", bounds=(0, 5)),
            ContinuousInput(key="b", bounds=(0.006, 0.01)),
        ]
    ),
    constraints=Constraints(
        constraints=[
            LinearEqualityConstraint(features=["i_1", "i_2", "c", "a", "p_1", "p_2"], coefficients=[1, 1, 1, 1, 1, 1], rhs=100),
            LinearInequalityConstraint(features=["i_1", "i_2"], coefficients=[-1, -1], rhs=-15),
            LinearInequalityConstraint(features=["p_1", "p_2"], coefficients=[-1, -1], rhs=-0.5),
            LinearInequalityConstraint(features=["c", "i_1", "i_2"], coefficients=[1, 1, 1], rhs=92),
            NChooseKConstraint(features=["i_1", "i_2"], min_count=0, max_count=1, none_also_valid=False),
            NChooseKConstraint(features=["p_1", "p_2"], min_count=0, max_count=1, none_also_valid=False),
        ]
    ),
)

candidates = find_local_max_ipopt(domain,"fully-quadratic", objective=OptimalityCriterionEnum.D_OPTIMALITY)

candidates["b"].hist()
plt.show()

# same domain, but with a much larger range for b
domain = Domain(
    inputs=Inputs(
        features=[
            ContinuousInput(key="i_1", bounds=(0, 55)),
            ContinuousInput(key="i_2", bounds=(0, 55)),
            ContinuousInput(key="c", bounds=(15, 55)),
            ContinuousInput(key="a", bounds=(5, 40)),
            ContinuousInput(key="p_1", bounds=(0, 5)),
            ContinuousInput(key="p_2", bounds=(0, 5)),
            ContinuousInput(key="b", bounds=(0.00, 1.00)),
        ]
    ),
    constraints=Constraints(
        constraints=[
            LinearEqualityConstraint(features=["i_1", "i_2", "c", "a", "p_1", "p_2"], coefficients=[1, 1, 1, 1, 1, 1], rhs=100),
            LinearInequalityConstraint(features=["i_1", "i_2"], coefficients=[-1, -1], rhs=-15),
            LinearInequalityConstraint(features=["p_1", "p_2"], coefficients=[-1, -1], rhs=-0.5),
            LinearInequalityConstraint(features=["c", "i_1", "i_2"], coefficients=[1, 1, 1], rhs=92),
            NChooseKConstraint(features=["i_1", "i_2"], min_count=0, max_count=1, none_also_valid=False),
            NChooseKConstraint(features=["p_1", "p_2"], min_count=0, max_count=1, none_also_valid=False),
        ]
    ),
)

candidates = find_local_max_ipopt(domain,"fully-quadratic", objective=OptimalityCriterionEnum.D_OPTIMALITY)

candidates["b"].hist()
plt.show()

With the small range for b, one gets this bimodal distribution:

[histogram of b: bimodal, points only at the two bounds]

With the larger range, one gets this one, which is much more in the direction of my expectation:

[histogram of b: values also near the center of the range]

From my impression, this is not due to numerical errors but rather the second case: it lies in the nature of the D-value. From my perspective, one should always treat all features on the same footing, especially when thinking from the perspective of a linear model.
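You can see the size of the effect directly from the model matrix: how much a feature's quadratic terms contribute to log det(XᵀX) depends strongly on the feature's range. A toy illustration in plain numpy (not bofire code):

```python
# Toy illustration (plain numpy, not bofire code): the contribution of
# a single feature's quadratic model terms to log det(X^T X) scales
# strongly with the feature's range.
import numpy as np

def log_det_quadratic(b_points):
    """log det of the information matrix for the model 1 + b + b^2."""
    X = np.stack([np.ones_like(b_points), b_points, b_points**2], axis=1)
    _, logdet = np.linalg.slogdet(X.T @ X)
    return logdet

small = np.array([0.006, 0.008, 0.010])  # the narrow range for b
large = np.array([0.0, 0.5, 1.0])        # a unit-style range

print(log_det_quadratic(small))  # around -36: almost no leverage on the D-value
print(log_det_quadratic(large))  # around -2.8
```

With the narrow range, the b-terms barely move the overall D-value next to the wide-range features, so the optimizer has little incentive to place a center point in b.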

I will try to implement the scaling ad hoc and create a WIP PR. Then we can discuss it further. I also like the idea of making the scaling optional.

Best,

Johannes

jduerholt commented 6 months ago

Hi @Osburg,

I did some reverse engineering on how Minitab is doing it; they are actually scaling between -1 and 1. I think we should also offer this option at some point, but my simple implementation is currently still creating problems. Perhaps we can have a look at it together at some point.

Best regards,

Johannes

Osburg commented 6 months ago

Hey @jduerholt,

Sorry for the inactivity, I am very busy with university right now. The PR I just opened is a draft that does the variable scaling. Linear constraints can be transformed easily, but in the case of nonlinear constraints the solution is a little messier: it simply substitutes the original variables, expressed as functions of the transformed variables, into the constraint expression. What do you think about this? Do you see a more elegant way to solve it?
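For the linear case the transformation is just an affine change of coefficients; a standalone sketch of the idea (a hypothetical helper, not the PR's actual code):

```python
# Sketch (hypothetical helper, not the PR's code): under the affine
# scaling x_i = l_i + (u_i - l_i) * (z_i + 1) / 2, mapping z in [-1, 1]
# back to x in [l, u], a linear constraint a.x <= rhs becomes
# a'.z <= rhs' with a'_i = a_i * (u_i - l_i) / 2 and
# rhs' = rhs - a.midpoint, where midpoint_i = (l_i + u_i) / 2.
import numpy as np

def transform_linear_constraint(a, rhs, lower, upper):
    """Rewrite a.x <= rhs in terms of the scaled variables z in [-1, 1]."""
    a = np.asarray(a, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    half_range = (upper - lower) / 2.0
    midpoint = lower + half_range
    return a * half_range, rhs - a @ midpoint

# e.g. i_1 + i_2 <= 100 with both variables in [0, 55]
a_z, rhs_z = transform_linear_constraint([1.0, 1.0], rhs=100.0,
                                         lower=[0.0, 0.0], upper=[55.0, 55.0])
print(a_z, rhs_z)  # [27.5 27.5] 45.0
```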

I only did a few quick tests with the problems from basic_examples.ipynb; in general, the approach seemed to work. Some things (user-defined initial guess, fixed experiments, etc.) are not implemented yet. I will do this if you are happy with the general structure.

Cheers :) Aaron

jduerholt commented 6 days ago

Closed with https://github.com/experimental-design/bofire/pull/358.