compatibility with sklearn `BaseEstimator`

sktime / skbase

Base classes for creating scikit-learn-like parametric objects, and tools for working with them.

BSD 3-Clause "New" or "Revised" License

18 stars 11 forks source link

compatibility with sklearn `BaseEstimator` #12

Open fkiraly opened 2 years ago

fkiraly commented 2 years ago

Question: how do we ensure compatibility with sklearn BaseEstimator and similar classes?

This might not be out of the box, if we check inheritance from BaseObject as opposed to availability of certain methods.

We could register a list of similar classes, but it would be better if this is out of the box compatible, e.g., get_params nesting and so on.

RNKuhns commented 2 years ago

@fkiraly this is a good thought. But I want to make sure I know the context your thinking of.Are you thinking in terms of an eventual check interface (e.g. check_is_object) or something else?

I'm in favor of duck typing that check but including a param to require a "strict" check (is also a sub class of BaseObject).

We might consider grouping the checks used in something like check_is_object by interface and having a parameter that controls what to check for (e.g. just get_params/set_params, tags, testing, etc). That would provide the most downstream comparability (sklearn baseobject for instance could pass) and provide users the means to test for just the interface required in their use case.

In terms of the interface implementation: I believe the get_params/set_params interface uses duck typing, so it should work with like objects. One option is to ensure that remains and make sure the other interfaces (testing and tags) do the same.

RNKuhns commented 2 years ago

Based on our call we also want to have tests that test the ability of estimators inheriting from BaseObject to pass scikit-learn's check_is_estimator.

To do this we need to look into sklearn's check_is_estimator to see the scope and format of sklearn's checks. If their checks are duck typed, a nice outcome is that estimators that inherit from BaseObject are able to pass sklearn's check_is_estimator directly. If the checks in sklearn's check_is_estimator are instead checking for direct inheritance from the sklearn BaseEstimator (in addition to checking the interface) we may need have tests that validate that you can construct an estimator object that inherits from sklearn's BaseEstimator and baseobject's BaseObject to get the BaseObject interface and pass sklearn's check_is_estimator.

fkiraly commented 2 years ago

An alternative would be to check the main interoperation points: get_params and set_params.

That would require building composites either way round and see whether the nested access and write still works.

fkiraly commented 2 years ago

From https://github.com/sktime/BaseObject/issues/17, another interoperation point is pretty printing.

fkiraly commented 2 years ago

We should have a list of interoperation points we need to test, actually.