Scikit-learn has some strict rules about how models are structured:
- all arguments are explicitly listed in the __init__ signature (no varargs)
- the model has the expected methods (fit, predict, etc.)
- the model implements the get_params and set_params methods defined by BaseEstimator
Even if we inherit from a scikit-learn model, if these rules aren't followed then we can't take advantage of scikit-learn utilities such as GridSearchCV optimisation, super-learner ensembles and some of the introspection/reflection functionality of models. You'll get an error along the lines of "models must explicitly declare their parameters in __init__ (no varargs)".
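To make the rules concrete, here is a minimal sketch of a subclass that breaks them next to one that follows them. The class names, the can_introspect helper and the target_transform parameter are made up for illustration and are not taken from uncoverml.

```python
from sklearn.base import clone
from sklearn.linear_model import Ridge


class VarArgsRidge(Ridge):
    """Breaks the rules: parameters are hidden behind *args/**kwargs."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


class ExplicitRidge(Ridge):
    """Follows the rules: every parameter is named in __init__ and stored as-is.

    `target_transform` is a hypothetical extra parameter, for illustration only.
    """

    def __init__(self, alpha=1.0, fit_intercept=True, target_transform="identity"):
        super().__init__(alpha=alpha, fit_intercept=fit_intercept)
        self.target_transform = target_transform


def can_introspect(estimator):
    """Return True if get_params() and clone() both work on the estimator."""
    try:
        estimator.get_params()
        clone(estimator)
    except RuntimeError:
        return False
    return True
```

Under these assumptions, can_introspect(VarArgsRidge()) should return False, because BaseEstimator refuses to inspect an __init__ that takes *args, while can_introspect(ExplicitRidge(alpha=0.5)) should return True. Note that ExplicitRidge only exposes the parameters it names, so anything else Ridge accepts stays at its default.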
Sudipta got started on restructuring models to be compatible with GridSearchCV. The restructured models can be found in uncoverml.optimise.models. The change requires tweaking the mixins and defining all parameters in the __init__. We should do the same for all models in uncoverml.models and then unify them all in uncoverml.models so we have a single models module. By following the work Sudipta has done, it should be pretty straightforward (albeit time-consuming) to complete this for all models.
The advantage is that we no longer have the confusion of some models being duplicated, and we can use all models with optimisation, super-learner ensembles etc.
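As a rough sketch of what a restructured model could look like, the example below combines a mixin with a scikit-learn model, declares every tunable parameter in __init__, and runs it through GridSearchCV. LogTargetMixin, LogTargetRidge, tune_alpha and make_toy_data are hypothetical names chosen for this example; the actual mixins in uncoverml.optimise.models may be structured differently.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV


class LogTargetMixin:
    """Hypothetical mixin: train on log1p(y), undo the transform on predict."""

    def fit(self, X, y):
        return super().fit(X, np.log1p(y))

    def predict(self, X):
        return np.expm1(super().predict(X))


class LogTargetRidge(LogTargetMixin, Ridge):
    """Every tunable parameter is named explicitly, so GridSearchCV can set it."""

    def __init__(self, alpha=1.0, fit_intercept=True):
        super().__init__(alpha=alpha, fit_intercept=fit_intercept)


def tune_alpha(X, y, alphas=(0.1, 1.0, 10.0)):
    """Grid-search alpha with 3-fold CV and return the fitted search object."""
    search = GridSearchCV(LogTargetRidge(), {"alpha": list(alphas)}, cv=3)
    return search.fit(X, y)


def make_toy_data(n_samples=60, seed=0):
    """Build a small synthetic regression problem with a positive target."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, 3))
    y = np.exp(X @ np.array([0.5, -0.2, 0.1]))
    return X, y
```

Calling tune_alpha(*make_toy_data()) returns a fitted GridSearchCV whose best_params_ holds the selected alpha. The mixin holds no parameters of its own here; if it did, they would also need to appear in the __init__ of the final class.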