rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

How to do stacking with a Keras model #755

Open oras903 opened 3 years ago

oras903 commented 3 years ago

Describe the workflow you want to enable

I want to use StackingCVClassifier to stack an sklearn model and a Keras model. I have made some progress, but when I try sklearn's cross_val_score, I get a "can't pickle _thread.RLock objects" error.

Describe your proposed solution

I found that sklearn's cross_val_score and GridSearchCV use clone() or some other method that does not support Keras models.

Is there a way to adapt a Keras model to the sklearn interface so that it can benefit from StackingCVClassifier, and especially GridSearchCV? Thanks.
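
The error above can be reproduced without Keras. The following is a minimal sketch using only the standard library: `LockHoldingModel`, `ConfigBackedEstimator`, and `toy_clone` are hypothetical names invented for illustration, and `toy_clone` only imitates the general idea behind sklearn's clone() (deep-copy the constructor parameters, then rebuild the object); it is not sklearn's actual implementation. It shows why a live model that owns a thread lock cannot be copied, while a wrapper whose only parameter is a plain config dict can.

```python
import copy
import threading


class LockHoldingModel:
    """Stand-in for a live model object that owns a thread lock."""

    def __init__(self):
        self._lock = threading.RLock()


class ConfigBackedEstimator:
    """Hypothetical wrapper: its parameter is a plain dict, not a live model."""

    def __init__(self, config):
        self.config = config
        self.model_ = None  # only created in fit()

    def get_params(self):
        return {"config": self.config}

    def fit(self):
        # A real wrapper would build and train the network from self.config.
        self.model_ = LockHoldingModel()
        return self


def toy_clone(estimator):
    """Simplified imitation of cloning: deep-copy params, rebuild the object."""
    params = copy.deepcopy(estimator.get_params())
    return type(estimator)(**params)


# Deep-copying an object that holds an RLock raises a TypeError.
try:
    copy.deepcopy(LockHoldingModel())
    deepcopy_error = None
except TypeError as exc:
    deepcopy_error = str(exc)

# Cloning the config-backed wrapper works even after fit(), because only
# the plain-dict parameter is copied; the fitted model is not carried over.
fitted = ConfigBackedEstimator({"units": 64, "activation": "relu"}).fit()
cloned = toy_clone(fitted)

print("deepcopy failed:", deepcopy_error is not None)
print("clone has same config:", cloned.config == fitted.config)
print("clone is unfitted:", cloned.model_ is None)
```

The design point is that anything passed to the estimator's constructor has to be copyable, so the network is described by data and only instantiated during fitting.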

qiagu commented 3 years ago

@oras903 You may want to have a look at https://github.com/goeckslab/Galaxy-ML. The Keras wrapper was reimplemented there for better compatibility with sklearn APIs. No guarantee it works. Please let me know what you think.

rasbt commented 3 years ago

Haven't used it myself, but yeah, this looks like it could get the job done:

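from keras.models import Sequential
from keras.layers import Dense, Activation
from galaxy_ml.keras_galaxy_models import KerasGClassifier

# build a DNN classifier
model = Sequential()
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dense(1, activation='sigmoid'))
# pass the architecture as a plain config dict rather than the live model
config = model.get_config()

classifier = KerasGClassifier(config, random_state=42)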
...

From their README, it also looks like they explicitly support the stacking classes in MLxtend. Maybe adding this to the stacking models' documentation would be nice.

qiagu commented 3 years ago

Yes. Galaxy-ML supports the stacking classes in MLxtend well. A demo tool is available at https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/bgruening/sklearn_stacking_ensemble_models/sklearn_stacking_ensemble_models/1.0.8.2 , where the MLxtend classes are grouped under the "Choose the stacking ensemble type" option. Together with other tools, like Pipeline Builder and Hyperparameter Search, decent machine learning work can be done on the web.

A bit of a story here: we invested a lot of time in making complex modeling, like stacking, easier in Galaxy-ML tools. I believe we have done a good job so far. However, in practice in the biology field, the benefit of complex modeling doesn't always justify its cost. Biologists often prefer a simple model with better interpretability.