experimental-design / bofire

Experimental design and (multi-objective) bayesian optimization.
https://experimental-design.github.io/bofire/
BSD 3-Clause "New" or "Revised" License
232 stars 23 forks source link

Feature type hints #396

Closed bertiqwerty closed 6 months ago

bertiqwerty commented 6 months ago

I added some Optionals that where missing. I also am confused about includes and excludes and added some tests.

Now I understand how includes/excludes work but I am still confused why it is an option to pass both, includes and excludes. It complains about overlaps anyways. Further, if I pass just excludes it fails with overlap. I think the default of includes should be None and in case both are None we should just skip the filter_by_class part and return the whole thing. But probably I have missed something.

bertiqwerty commented 6 months ago

After a little more investigation I found the reason why you can pass both to be not None. Includes and excludes are expanded in case they are unions like AnyFeature. In this case it can happen that the same class (not sub-class) is included and excluded and the overlap complain appears.

But it still makes sense to exclude subclasses of inputs. I find it neither simple nor elegant but still rather confusing. Further, I think probably many use cases can represent their filter differently. E.g., if you just want CategoricalInputs but not CategoricalDescriptorInputs you can use exact. If you want CategoricalInputs and CategoricalDescriptorInputs but not MolecularDescriptorInput you can use include and exact again. On the other hand, if you want all NumericalInputs and CategoricalInputs except CategoricalDescriptorInputs it might be tedious with exact. But who wants that? Or if you want that you can still use the filter_by_class-function directly.

Not sure how to proceed.

  1. It would be a breaking API change to allow only one to be !=None. Not sure if any BoFire users rely on this.
  2. What is the purpose of the overlap-exception? Maybe it was intended to help find mis-specifications of the user. You could pass (includes=ContinuousInput, excludes=ContinuousInput) which looks like a user error. Maybe we should check for overlap before we expand the Unions?

What do you think @jduerholt ?

bertiqwerty commented 6 months ago

In commit c5aa866 I kept the both-can-be-something behavior but moved the overlap check. Now you can pass AnyFeature and still exclude stuff which should not break the API. As mentioned above I would prefer the only-one-can-be-something behavior if this is acceptable but we can also leave it like this.