rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

mlxtend sklearn preprocessing techniques #1068

Open israel-cj opened 9 months ago

israel-cj commented 9 months ago

Dear,

I hope you are doing well, and thank you for your work on mlxtend. I have the following problem: suppose I have a list of pipelines called ‘get_pipelines’, where each pipeline contains preprocessing steps such as ColumnTransformer, SimpleImputer, etc. Each pipeline works on its own when I fit and predict on my dataset. However, when I stack them, the preprocessing steps do not seem to be applied: I get errors saying that my data should be converted from categorical to numerical, etc., even though each pipeline already does that. Is there a way to make the stacking classifier use this preprocessing? My code looks like this (thank you!):

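from mlxtend.classifier import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Create a list of base models
# (get_pipelines is the list of preprocessing + model pipelines described above)
base_models = [make_pipeline(model) for model in get_pipelines]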

# Create the meta-model
meta_model = LogisticRegression()

# Create the stacked ensemble
stacked_ensemble = StackingClassifier(
    classifiers=base_models,
    meta_classifier=meta_model,
    use_probas=True,
    average_probas=False
)

# Train the stacked ensemble on the training data
stacked_ensemble.fit(X, y)

rasbt commented 9 months ago

Hi there,

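You could try fit_base_estimators=False, which makes the stacking classifier use the pipelines as they are instead of refitting them (so each pipeline needs to be fit beforehand), i.e.,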

stacked_ensemble = StackingClassifier(
    classifiers=base_models,
    meta_classifier=meta_model,
    use_probas=True,
    average_probas=False,
    fit_base_estimators=False
)