experimental-design / bofire

Experimental design and (multi-objective) bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License
213 stars 22 forks source link

Add generic typing to Constraints #386

Closed TobyBoyne closed 6 months ago

TobyBoyne commented 6 months ago

Following on from #383, here's generic typing in Constraints!

Pyright (and whatever type checker your IDE uses) will now infer the type of constraints contained in a Constraints object.

It isn't perfect - you can't type hint set subtraction [1], so the excludes keyword isn't typed properly. But it clears up more # type: ignores than before!

constraints = Constraints(constraints=[LinearEqualityConstraint(...), NChooseKConstraint(...)]))
# type of constraints.features is inferred as Sequence[LinearEqualityConstraint | NChooseKConstraint]
sub_constraints = constraints.get(includes=[NChooseKConstraint])
# type of sub_constraints.features is inferred as Sequence[NChooseKConstraint]

[1] https://discuss.python.org/t/type-intersection-and-negation-in-type-annotations/23879

jduerholt commented 6 months ago

Nice, thanks a lot @TobyBoyne! Can you perhaps elaborate a bit on what you actually did and how it works, as I do not understand why it works ;) So, I think it is better when @bertiqwerty reviews it. If it is fine for him, can you then also file a similar PR for Features, Inputs, and Outputs.

We should then also remove the following functions from domain as they are obsolete and the corresponding versions from domain.inputs, domain.outputs and domain.constraints should be used:

Or what do you think @bertiqwerty ?

TobyBoyne commented 6 months ago

Thanks @jduerholt!

Of course! Just to note, this does not change how users interface with bofire. The only way that users will interact with this is by seeing that their type hints have automatically become more informative! Also, this has already been implemented for Features, Inputs, and Outputs in #383.

So this PR uses Generic typing. You will have used generic type hints whenever you do something like

my_list: List[int]
x = my_list[0]
# the type of x is automatically inferred to be `int`

In this example, List accepts int as a TypeVar, which determines the type of the elements in the list.

I've done something similar here. You can now create a Constraints object that contains only one type of constraint, like Constraint[NChooseKConstraint].

However, you don't want to manually type hint the type of constraints each time. Going back to the previous example, we want something more like

my_list = [2, 4, 8]
x = my_list[0]
# the type of x is automatically inferred to be `int`, no explicit type hint required!

This is where TypeVar is useful. This allows is to infer the type of the elements in Constraint.constraints using generics. Whenever you make a new instance of a Constraint object, the TypeVar C stores the type of constraint passed to the constructor. This means that when you iterate through the constraints, your typechecker knows that the items you iterate over will be limited to the type of constraints that were passed to the input.

We can do even better. We use a second TypeVar, C_, for use in the get method. This allows us to keep track of the types of constraint passed to the includes argument, so that we can further restrict the type of constraint in the returned object. So now, when we have code like below:

constraints = Constraints(constraints=[...])
for constraint in constraints.get(includes=[NChooseKConstraint]):
    print(constraint.max_count)
   # Normally, pyright would raise an error here, since AnyConstraint.max_count isn't defined.
   # But with generic types, pyright can infer that the returned constraints from the `get` call
   #   must be of type NChooseKConstraint, so this attribute is always defined!

I hope this clears things up :) Let me know if you have any other questions!

bertiqwerty commented 6 months ago

We should then also remove the following functions from domain as they are obsolete and the corresponding versions from domain.inputs, domain.outputs and domain.constraints should be used:

* `domain.get_features`

* `domain.get_feature_keys`

* `get_feature`

Or what do you think @bertiqwerty ?

Agreed. In a separate PR.