Closed by bmreiniger 4 years ago
Yes, I'm aware of this, thank you. I am working on updating the examples and on a new API that supports out-of-fold predictions in the first level. Please refer to the discussion on Issue #13 for more details.
It looks like you use the same entire dataset for each step. But then the inputs to steps beyond the first layer are the "predictions" that the models make on their own training sets, which seems prone to data leakage and overfitting. See e.g. mlxtend's `StackingClassifier` vs. `StackingCVClassifier`.
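To make the concern concrete, here is a minimal sketch of the difference between in-sample and out-of-fold first-level predictions. It uses scikit-learn rather than this project's API, and the synthetic dataset, the choice of models, and every variable name are my own assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic data (an assumption; any tabular classification set would do).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Leaky variant: the base model predicts the same rows it was fit on,
# so its "predictions" are far more accurate than they would be on new data.
base = RandomForestClassifier(n_estimators=50, random_state=0)
in_sample = base.fit(X_train, y_train).predict_proba(X_train)[:, 1]

# Out-of-fold variant: each row is predicted by a model that never saw it.
oof = cross_val_predict(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X_train,
    y_train,
    cv=5,
    method="predict_proba",
)[:, 1]

in_sample_acc = ((in_sample > 0.5) == y_train).mean()
oof_acc = ((oof > 0.5) == y_train).mean()
print(f"in-sample accuracy of base model:   {in_sample_acc:.3f}")
print(f"out-of-fold accuracy of base model: {oof_acc:.3f}")

# Second level: train the meta-model on the out-of-fold predictions.
meta = LogisticRegression().fit(oof.reshape(-1, 1), y_train)

# At prediction time the base model fit on the full training set is used.
test_features = base.predict_proba(X_test)[:, 1].reshape(-1, 1)
test_acc = meta.score(test_features, y_test)
print(f"stacked test accuracy: {test_acc:.3f}")
```

The gap between the two printed base-model accuracies is the optimism a second-level model would inherit if it were trained on in-sample predictions.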