bmurauer / pipelinehelper

scikit-helper to hot-swap pipeline elements
GNU General Public License v3.0

Classifiers can be switched in grid search in a pipeline (the documentation says they can't) #16

Closed: singhkpratham closed this issue 1 year ago

singhkpratham commented 2 years ago

The documentation says:

it is limited to transformers, so the last part of a pipeline (e.g., a classifier) can't be switched in this manner (at least I wasn't able to, please correct me if I'm wrong)

Classifiers can certainly be switched, for example:

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

iris = pd.read_csv("https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv")

# Single-step pipeline; LinearSVC is only the initial occupant of the 'est' step
pl = Pipeline([
    ('est', LinearSVC())
])

# Each dict swaps a different classifier into the 'est' step
param_grid = [
    # {'est': [LinearSVC()]},
    {'est': [RandomForestClassifier()], 'est__n_estimators': [5, 10, 25]},
    {'est': [DecisionTreeClassifier()]},
]
a = GridSearchCV(pl, param_grid)
a.fit(iris.drop(columns='species'), iris.species)
a.best_params_
# Output: {'est': RandomForestClassifier(n_estimators=5), 'est__n_estimators': 5}
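
The reason this works seems to be that Pipeline.set_params accepts a step name as a key, so passing est=<estimator> replaces the whole step, and GridSearchCV does the same for every candidate in the grid. Here is a minimal offline sketch of that mechanism. It assumes scikit-learn's bundled iris data (load_iris) instead of the CSV, and the fixed random_state values, cv=3, and variable names are illustrative choices, not part of the snippet above:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: bundled iris data, so no network access is needed
X, y = load_iris(return_X_y=True)

pipe = Pipeline([("est", LinearSVC())])

# Using the step name as the parameter key replaces the entire step
pipe.set_params(est=DecisionTreeClassifier(random_state=0))
swapped_type = type(pipe.named_steps["est"]).__name__

# GridSearchCV applies the same kind of set_params call per candidate
param_grid = [
    {"est": [RandomForestClassifier(random_state=0)], "est__n_estimators": [5, 10]},
    {"est": [DecisionTreeClassifier(random_state=0)]},
]
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X, y)

# Two forest settings plus one tree give three candidates
n_candidates = len(search.cv_results_["params"])
best_type = type(search.best_estimator_.named_steps["est"]).__name__
print(swapped_type, n_candidates, best_type)
```

Which classifier wins depends on the data and the cross-validation splits, so the sketch only reports the winning class rather than promising a particular one.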

Do correct me if I am misunderstanding.

bmurauer commented 2 years ago

Thanks for sharing! While I still see some advantages to my approach in terms of clarity, this certainly seems to make the helper obsolete for most purposes :thinking: I will leave this ticket open until I have fixed the documentation.

bmurauer commented 1 year ago

I have updated the README and will soon archive this project.