scikit-learn / scikit-learn

scikit-learn: machine learning in Python
https://scikit-learn.org
BSD 3-Clause "New" or "Revised" License

Default behavior for KFold #3111

Closed: wgyn closed this issue 10 years ago

wgyn commented 10 years ago

Is there a rationale for KFold returning folds without random shuffling by default? This threw me off.

See: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/cross_validation.py#L307

I think there's a benefit to making shuffle=True the default and letting users opt out of shuffling, since that's the natural expectation.
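To make the behavior in question concrete, here is a minimal sketch. It assumes a current scikit-learn release, where KFold is imported from sklearn.model_selection and takes n_splits, rather than the sklearn/cross_validation.py module linked above. The toy data and variable names are illustrative only.

```python
import numpy as np
from sklearn.model_selection import KFold

# Assumption: six toy samples whose row index equals their value.
X = np.arange(6).reshape(-1, 1)

# Default (shuffle=False): test folds are contiguous blocks in input order.
ordered = [test.tolist() for _, test in KFold(n_splits=3).split(X)]
print(ordered)  # [[0, 1], [2, 3], [4, 5]]

# Opting in to shuffling; random_state makes the permutation reproducible.
shuffled_cv = KFold(n_splits=3, shuffle=True, random_state=0)
shuffled = [test.tolist() for _, test in shuffled_cv.split(X)]
print(shuffled)  # same fold sizes, indices drawn from a random permutation
```

If the rows are sorted (by label, by time, by source), the unshuffled folds inherit that ordering, which is why the default can surprise.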

Relatedly, is there a rationale for StratifiedKFold not having a shuffle option? If not, I can open a PR.
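For reference, a short sketch of what a shuffled stratified split looks like. This assumes a current scikit-learn release, in which StratifiedKFold lives in sklearn.model_selection and accepts shuffle and random_state; it does not reflect the version discussed in this thread. The data is a made-up two-class toy set.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Assumption: a balanced toy problem, three samples per class.
X = np.zeros((6, 1))
y = np.array([0, 0, 0, 1, 1, 1])

# Samples are shuffled within each class before being assigned to folds,
# so class proportions are preserved in every fold.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
fold_labels = [sorted(y[test].tolist()) for _, test in cv.split(X, y)]
print(fold_labels)  # each test fold holds one sample of each class
```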

jnothman commented 10 years ago

See this recent discussion, though it's a bit lengthy: http://comments.gmane.org/gmane.comp.python.scikit-learn/10349. In short, yes, there is a rationale, and #3110 introduces a ShuffleKFold to make this clearer.