experimental-design / bofire

Experimental design and (multi-objective) bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License
222 stars 23 forks source link

DOE data model including objective #307

Open simonsung06 opened 1 year ago

simonsung06 commented 1 year ago

@Osburg @dlinzner-bcs

I'd like your opinion on being able to choose the objective when creating the DoEStrategy data model. I can imagine this would be useful for us internally since this would give more flexibility via the ask method. I have crudely got this working with an additional objective keyword argument:

class DoEStrategy(Strategy):
    type: Literal["DoEStrategy"] = "DoEStrategy"
    formula: Union[
        Literal[
            "linear",
            "linear-and-quadratic",
            "linear-and-interactions",
            "fully-quadratic",
        ],
        str,
    ]
    optimization_strategy: Literal[
        "default",
        "exhaustive",
        "branch-and-bound",
        "partially-random",
        "relaxed",
        "iterative",
    ] = "default"
    objective: OptimalityCriterionEnum = OptimalityCriterionEnum.D_OPTIMALITY

    verbose: bool = False

And also modifying the __init__ function of DoEStrategy when it is instantiated with the data model

class DoEStrategy(Strategy):
    def __init__(
        self,
        data_model: data_models.DoEStrategy,
        **kwargs,
    ):
        super().__init__(data_model=data_model, **kwargs)
        self.formula = data_model.formula
        self.data_model = data_model
        self.objective = data_model.objective
        self._partially_fixed_candidates = None
        self._fixed_candidates = None

And then passing self.objective into the "find_localmax" functions under _ask, e.g.:

design = find_local_max_ipopt(
                new_domain,
                self.formula,
                n_experiments=_candidate_count,
                fixed_experiments=None,
                partially_fixed_experiments=adapted_partially_fixed_candidates,
                objective=self.objective,
            )

I understand that this is a bit crude but is just a proof of principle. More would have to be considered with how to keep the code clean regarding formula in the data model since that becomes redundant with OptimalityCriterionEnum.SPACE_FILLING as the objective.

Just an idea for now and would love your opinion :)

Osburg commented 1 year ago

I am not the one in charge, but I think it makes sense to allow for choosing the objective. I think right now formula will always be used in 'find_local_max_ipopt()' to determine the number of experiments from the number of model terms. But since the number of experiments can also be given by the user this is not essential. You are totally right ofc that formula is actually not really necessary in case of OptimalityCriterionEnum.SPACE_FILLING as objective. Do you think we should make formula optional for find_local_max_ipopt() and Objective? Or what did you have in mind?

cheers, Aaron =)

simonsung06 commented 12 months ago

Hi Aaron,

I think that formula can be made optional for the find_local_max_ipopt() related functions but objective should become required instead of the default OptimalityCriterionEnum.D_OPTIMALITY. This is because it is always needed and has a profound effect on the outcome and therefore should be chosen by the user. What do you think?

Related to the data models associated with DOE's, I would like to propose the following:

We could consider creating data models for Objectives that are used in the DoE's. They would be named as DoEObjective to avoid confusion with the objectives that are used as part of the Bayesian optimization side of BoFire. Whether Domain should be provided now or in sub-classes is up for debate. This could be used as the base DoEObjective class:

from bofire.data_models.base import BaseModel
from bofire.data_models.domain.api import Domain

class DoEObjective(BaseModel)
    domain: Domain

The actual DoE objectives would then inherit from this. The formula could be provided for OptimalObjective since it is used for this type of objective. If an OptimalObjective is provided to find_local_max_ipopt(), then the number of experiments can be determined from the number of model terms or n_experiments like it is currently. If a MaximinObjective is provided in find_local_max_ipopt(), then n_experiments should also be provided by the user, otherwise an Exception is raised. These new additions would mean that OptimalityCriterionEnum would no longer be needed.

class OptimalObjective(DoEObjective):
    formula: str,
    n_experiments: int,
    delta: float

class DOptimality(OptimalObjective)
class AOptimality(OptimalObjective)
class GOptimality(OptimalObjective)
class EOptimality(OptimalObjective)
class KOptimality(OptimalObjective)

# Renamed from SpaceFilling because there could in theory be other methods for space filling
class MaximinObjective(DoEObjective)

And then finally the DoEStrategy data model could become the following:

from bofire.data_models.strategies.strategy import Strategy

DoEStrategy(Strategy,
    type: Literal["DoEStrategy"] = "DoEStrategy",
    objective: DoEObjective, 
    optimization_strategy: Literal["default", "exhaustive", "branch-and-bound", "partially-random", "relaxed", "iterative"] = "default"
    verbose: bool = False
    )

Once again, these are just ideas at this stage but i think elaborating on the data models regarding the DoE methods could be helpful in the long-term. Would love your opinion on this. Perhaps also @jduerholt 's since he usually has good ideas ;)

jduerholt commented 11 months ago

@simonsung06 @Osburg: Thanks for your discussion here. I will further think about it and try to come up with a solution, but not this week :D