stacking and blending in REP - question

Hi all,

Just to add to the request: I was also wondering whether you could implement this. There is already a package that basically does it, but it does not support weights: https://github.com/dustinstansbury/stacked_generalization

What I thought of is to create a meta-classifier just like the BaggingClassifier, KFoldClassifier etc. I would propose the following behavior:

Instance creation takes several (unfitted) classifiers as the base classifiers, as well as one stacking classifier:

    stacking_clf = StackingClassifier(base_clf=[rdf_clf, xgb_clf1, xgb_clf2, nn_clf, ...], stacking_clf=logit_clf, ...)

Fitting fits the base classifiers, lets them predict on the training data (in the normal fashion, not k-folded; if one wants that, one can simply use KFoldClassifier(my_base_clf) as a base classifier), and trains the stacking classifier on the base classifiers' predictions.

Prediction lets the base classifiers predict on the data. The stacking classifier then uses these predictions to produce the final predictions.
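To make the proposed behavior concrete, here is a minimal sketch written against plain scikit-learn estimators rather than REP's own wrappers. The class name SimpleStackingClassifier and its attributes are hypothetical and only illustrate the fit/predict flow described above. It assumes every classifier provides predict_proba and accepts sample_weight in fit.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier


class SimpleStackingClassifier:
    """Hypothetical sketch of the proposed meta-classifier (not REP API)."""

    def __init__(self, base_clf, stacking_clf):
        self.base_clf = base_clf          # list of unfitted base classifiers
        self.stacking_clf = stacking_clf  # unfitted stacking classifier

    def _meta_features(self, X):
        # Concatenate the class probabilities of all fitted base classifiers.
        return np.hstack([clf.predict_proba(X) for clf in self.base_clf_])

    def fit(self, X, y, sample_weight=None):
        # Assumption: every classifier accepts sample_weight in fit().
        kwargs = {} if sample_weight is None else {"sample_weight": sample_weight}
        # Fit copies so the classifiers passed in stay untouched.
        self.base_clf_ = [clone(clf).fit(X, y, **kwargs) for clf in self.base_clf]
        # Base classifiers predict on the training data itself (not k-folded).
        meta = self._meta_features(X)
        self.stacking_clf_ = clone(self.stacking_clf).fit(meta, y, **kwargs)
        return self

    def predict_proba(self, X):
        return self.stacking_clf_.predict_proba(self._meta_features(X))

    def predict(self, X):
        return self.stacking_clf_.predict(self._meta_features(X))


# Usage on synthetic data with uniform weights.
X, y = make_classification(n_samples=200, random_state=0)
weights = np.ones(len(y))
model = SimpleStackingClassifier(
    base_clf=[
        RandomForestClassifier(n_estimators=10, random_state=0),
        DecisionTreeClassifier(max_depth=3, random_state=0),
    ],
    stacking_clf=LogisticRegression(),
)
model.fit(X, y, sample_weight=weights)
proba = model.predict_proba(X)
labels = model.predict(X)
```

Because the base classifiers predict on the same data they were trained on, the stacking classifier sees optimistic inputs; wrapping the base classifiers in a k-folding meta-classifier, as suggested above, is one way around that.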

Possible options to add (at instantiation?):
1) use one or several columns from the data as additional inputs when training the stacking classifier
2) (copy and train each base classifier n times)
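Option 1 could look roughly like the following sketch. The helper build_meta_features and its extra_columns parameter are hypothetical names chosen for illustration; the idea is just to append selected original columns to the base classifiers' predictions before they reach the stacking classifier.

```python
import numpy as np


def build_meta_features(base_predictions, X, extra_columns=()):
    """Stack base-classifier predictions, optionally with raw data columns.

    base_predictions: list of 2-D arrays, one per base classifier.
    X: original data as a 2-D array.
    extra_columns: indices of columns in X to pass to the stacking classifier.
    """
    blocks = [np.asarray(p) for p in base_predictions]
    if len(extra_columns) > 0:
        # Append the selected original columns next to the predictions.
        blocks.append(np.asarray(X)[:, list(extra_columns)])
    return np.hstack(blocks)


# Usage: two base classifiers, two samples, columns 0 and 2 passed through.
p1 = np.array([[0.9, 0.1], [0.2, 0.8]])
p2 = np.array([[0.7, 0.3], [0.4, 0.6]])
X = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
meta = build_meta_features([p1, p2], X, extra_columns=[0, 2])
meta_plain = build_meta_features([p1, p2], X)
```

The same helper would be called in both fitting and prediction so that the stacking classifier always sees the same column layout.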

I think this would complete your repository so that it contains basically every popular meta-learning technique available so far. Using the same style as for bagging, k-folding etc. would allow for seamless integration into your library. What do you think?
