Closed mschlupp closed 8 years ago
Good idea, thank you! Hope it will be pulled in soon.
Btw it seems somewhat neglected... are you still actively maintaining this repo? I personally find it very useful and make a lot of use of it.
@mayou36
are you still actively maintaining this repo?
We rarely change anything. Since the things we use work fine, there is no reason to spoil the situation :)
@mschlupp
Max, thanks for the idea, but this patch isn't going to work (even if the code were correct). The reason is that FoldingClassifier should not only work with the training dataset, but also be able to predict on any arbitrary dataset (which may, e.g., have a different size, and that isn't going to work in your case).
So it's more complicated. I'll think about how the case of stratified folding can be added.
I don't quite understand why this should be an issue. You can predict arbitrary datasets and you can validate via CV. The only thing is that you substitute the KFold default with a folder of your choice. What am I missing here?
@mschlupp
There are several basic things expected from FoldingClassifier:
At the same time, we have labels when fitting, but not when predicting. Currently this is handled by fixing an internal random state, which is later used together with the dataset length to regenerate the correct fold indices.
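To illustrate the seed-plus-length idea, here is a minimal sketch. It is not rep's actual implementation; the helper name `fold_assignments` and its parameters are made up for this example. The point is only that the fold assignment depends on nothing but the dataset length and a fixed random state, so it can be regenerated at prediction time without labels.

```python
import random


def fold_assignments(length, n_folds, random_state):
    """Hypothetical helper: assign each of `length` samples to a fold.

    Only the length and the seed are used, so calling this again
    with the same arguments reproduces the same folding.
    """
    rng = random.Random(random_state)
    # Balanced fold ids 0, 1, ..., n_folds - 1, repeated as needed.
    folds = [i % n_folds for i in range(length)]
    # Shuffle deterministically with the fixed seed.
    rng.shuffle(folds)
    return folds


# Same length and seed: the folding used in fit can be rebuilt in predict.
train_folds = fold_assignments(length=10, n_folds=3, random_state=42)
same_folds = fold_assignments(length=10, n_folds=3, random_state=42)

# A dataset of a different size still gets a valid folding, no labels needed.
other_folds = fold_assignments(length=7, n_folds=3, random_state=42)
```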
When you pass StratifiedKFold, you can't fulfill points 3 and 4, since you need a different folding for a new test dataset.
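A rough sketch of why stratification is the problem: a stratified split has to see the labels. The function below is a simplified stand-in written for this example (it is neither scikit-learn's StratifiedKFold nor rep code, and the name `stratified_fold_assignments` is hypothetical). Its required input is the label list, which is exactly what is missing when predicting on a new dataset.

```python
import random
from collections import defaultdict


def stratified_fold_assignments(labels, n_folds, random_state):
    """Hypothetical stand-in for a stratified folder.

    Spreads the samples of every class evenly over the folds.
    Note that it cannot be called without `labels`.
    """
    rng = random.Random(random_state)
    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [None] * len(labels)
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the samples of this class round-robin over the folds.
        for position, idx in enumerate(indices):
            folds[idx] = position % n_folds
    return folds


labels = [0] * 6 + [1] * 3
folds = stratified_fold_assignments(labels, n_folds=3, random_state=42)
# Number of class-1 samples that landed in each fold.
positives_per_fold = [
    sum(1 for f, y in zip(folds, labels) if f == k and y == 1)
    for k in range(3)
]
```

In this sketch every fold receives the same share of each class, but only because the labels were available; with an unlabeled dataset of a different size there is nothing to stratify on.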
closed in favor of #92 (stratified folding)
If you would like to run rep with e.g. a StratifiedKFold instead of a normal KFold, this will be possible after this pull request. If no external folder object is passed, the default KFold algorithm is used.