sktime / mlaut

Create proper models container #1

Closed ViktorKaz closed 6 years ago

ViktorKaz commented 6 years ago

The ModelsContainer class should inherit from sklearn.base.BaseEstimator. We need the ability both to specify custom models and to let the user choose from a standard list of predefined models.

fkiraly commented 6 years ago

No, that's not what I meant.

sklearn already has a class interface for a single modelling strategy.

It may be helpful to create one for mleap that contains multiple strategies, but such a class should use the sklearn BaseEstimator and descendants, since it would be a bad idea to re-create sklearn.

Alternatively, we may simply decide not to create a model container, though having functionality to create a standard bunch of models (which otherwise might be a standard method of such a container) might still be helpful.

frthjf commented 6 years ago

@ViktorKaz I agree, it does not make sense to introduce a model container like the BaseEstimator because scikit-learn and skpro already provide models in that form. You just want to orchestrate such existing models with your package, rather than providing another API to implement models. I think what you mean is something like the Model base class of skpro.workflow, which stores the instance of the model (e.g. a subclass of a base estimator) and meta information like its hyperparameters, as well as methods to hash, replicate, and store the model instances. In our last meeting we briefly discussed whether we need such a dedicated "takes-care-of-the-model-instance" class or whether it is better to keep it simple and just deal with the model instance (aka estimator) itself. I don't think we have reached a conclusion on this yet, but you will probably see over time whether something like this is necessary.
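To make the "takes-care-of-the-model-instance" idea concrete, here is a minimal sketch of such a wrapper. It is not the skpro.workflow Model class: the names ModelRecord, fingerprint, replicate, and to_json are hypothetical, and ConstantPredictor is a toy stand-in for a real estimator. The sketch stores only the metadata, not the fitted instance.

```python
import copy
import hashlib
import json


class ModelRecord:
    """Hypothetical wrapper holding an estimator instance plus metadata."""

    def __init__(self, name, estimator, hyperparameters=None):
        self.name = name
        self.estimator = estimator
        self.hyperparameters = dict(hyperparameters or {})

    def fingerprint(self):
        # Hash the name and the sorted hyperparameters so that equal
        # configurations map to the same identifier.
        payload = repr((self.name, sorted(self.hyperparameters.items())))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def replicate(self):
        # Deep-copy the estimator so that fitting the replica
        # leaves the original instance untouched.
        return ModelRecord(self.name, copy.deepcopy(self.estimator),
                           self.hyperparameters)

    def to_json(self):
        # Persist the metadata only; serialising the estimator itself
        # is deliberately left out of this sketch.
        return json.dumps({"name": self.name,
                           "hyperparameters": self.hyperparameters})


class ConstantPredictor:
    """Toy stand-in for an estimator with fit/predict (an assumption)."""

    def __init__(self, value=0.0):
        self.value = value

    def fit(self, X, y):
        self.value = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.value for _ in X]


record = ModelRecord("constant", ConstantPredictor(), {"value": 0.0})
replica = record.replicate()
replica.estimator.fit([[1], [2]], [2.0, 4.0])
```

Whether this extra layer pays for itself is exactly the open question: if the orchestrator only ever needs a name and an instance, a plain pair is simpler.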

ViktorKaz commented 6 years ago

OK, thanks. I'll check the Model base class of skpro.workflow.

ViktorKaz commented 6 years ago

I decided not to create a models container. Instead, I wrote several methods that facilitate the creation of estimators through the use of decorators. Please see mleap.data.estimators.

The user can either create the default list by calling:

from mleap.data.estimators import instantiate_default_estimators
estimators = instantiate_default_estimators()

which returns an array of pairs, each of the form ['Estimator Name', estimator_instance].

Each estimator_instance must be an instance of a subclass of sklearn.base.BaseEstimator and provide the standard .fit(), .predict(), etc. methods.

This array of estimators is one of the inputs of the Test Orchestrator class.
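As a rough illustration of how a decorator can build such an array, here is a minimal sketch. It is not the actual code in mleap.data.estimators: register_estimator, instantiate_registered_estimators, and MeanRegressor are hypothetical names, and the toy class only duck-types .fit() and .predict(), whereas a real entry would subclass sklearn.base.BaseEstimator.

```python
# Hypothetical module-level registry of (name, factory) pairs.
_REGISTRY = []


def register_estimator(name):
    """Decorator that records a factory function under a display name."""
    def decorator(factory):
        _REGISTRY.append((name, factory))
        return factory
    return decorator


class MeanRegressor:
    """Toy estimator; a real one would subclass sklearn.base.BaseEstimator."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


@register_estimator("Mean Regressor")
def make_mean_regressor():
    return MeanRegressor()


def instantiate_registered_estimators():
    """Call every registered factory and return [name, instance] pairs."""
    estimators = []
    for name, factory in _REGISTRY:
        instance = factory()
        # Check the minimal interface before handing the instance on.
        for method in ("fit", "predict"):
            if not callable(getattr(instance, method, None)):
                raise TypeError(f"{name!r} has no callable {method}()")
        estimators.append([name, instance])
    return estimators


estimators = instantiate_registered_estimators()
for name, estimator in estimators:
    estimator.fit([[0], [1], [2]], [1.0, 2.0, 3.0])
    print(name, estimator.predict([[3]]))
```

Adding a model then only requires writing one decorated factory function; the consumer of the array never needs to know where an entry came from.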

This approach gives the user the flexibility to either:

  1. Use the provided helper methods and decorator function to build an array of estimators.

  2. Use the default list of estimators by calling mleap.data.estimators.instantiate_default_estimators().

  3. Create their own array of estimators by hand.

Let me know if you object to this approach.

ViktorKaz commented 6 years ago

As discussed, this approach is acceptable.