MLBazaar / MLBlocks

A library for composing end-to-end tunable machine learning pipelines.
https://mlbazaar.github.io/MLBlocks
MIT License

Enabling passing of primitives/pipelines to hyperparameters #126

Open dhruvsharma1992 opened 4 years ago

dhruvsharma1992 commented 4 years ago

Description

I was trying to construct a primitive for sklearn.model_selection.RandomizedSearchCV when I found that its estimator hyperparameter can itself be a primitive.

What I Did

To work around the current implementation, I created a pipeline for the estimator (using LogisticRegression) and passed it as the value of the estimator hyperparameter in init_params, as follows:

    init_params = {
        "sklearn.model_selection.RandomizedSearchCV": {
            'estimator': logistic_regression_pipeline,
            'scoring': "accuracy",
            'n_iter': 5
        }
    }

In Python < 3.7 it fails during the deepcopy of the hyperparameters (a fix is suggested here: https://stackoverflow.com/questions/6279305/typeerror-cannot-deepcopy-this-pattern-object).

In Python 3.7+ it fails when sklearn raises an exception saying that the estimator must be an sklearn estimator object, not an MLPipeline.

There needs to be a way to pass such primitives in init_params.

Note: It works if I pass the LogisticRegression object directly in init_params, as in 'estimator': LogisticRegression(random_state=0), but that loses the ability to:

1) save the pipeline to disk
2) construct the complete pipeline using only MLBlocks
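
One possible shape for this, as a rough sketch and not something MLBlocks currently provides: describe the nested estimator as plain data and build the object only at load time, so the whole configuration stays JSON-serializable. The spec keys "primitive" and "init_params" and the helpers load_class and resolve are hypothetical names chosen for illustration. types.SimpleNamespace is used as a stand-in for both sklearn classes so the sketch runs without scikit-learn.

```python
import importlib
import json


def load_class(path):
    """Import a class from a dotted path such as 'types.SimpleNamespace'."""
    module_name, _, class_name = path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def resolve(value):
    """Recursively replace nested specs with instantiated objects.

    A spec is a dict with a 'primitive' key (hypothetical convention) and
    an optional 'init_params' dict, which may itself contain further specs.
    """
    if isinstance(value, dict):
        if "primitive" in value:
            kwargs = resolve(value.get("init_params", {}))
            return load_class(value["primitive"])(**kwargs)
        return {key: resolve(item) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve(item) for item in value]
    return value


# Hypothetical nested spec: plain data only, so it can be written to disk.
spec = {
    "primitive": "types.SimpleNamespace",  # stand-in for RandomizedSearchCV
    "init_params": {
        "estimator": {
            "primitive": "types.SimpleNamespace",  # stand-in for LogisticRegression
            "init_params": {"random_state": 0},
        },
        "scoring": "accuracy",
        "n_iter": 5,
    },
}

# Round-trip through JSON to show that the spec survives serialization.
restored = json.loads(json.dumps(spec))

# Build the outer object; the nested estimator is instantiated first.
search = resolve(restored)
```

With scikit-learn installed, replacing the stand-in paths with sklearn.model_selection.RandomizedSearchCV and sklearn.linear_model.LogisticRegression should build the real objects in the same way, provided required arguments such as param_distributions are also supplied in init_params. Because only the spec is stored, this sidesteps both the deepcopy problem and the loss of serializability.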