Isn't that fixed by https://github.com/JuliaStats/MLBase.jl/pull/4?
According to the readme, the `CrossValGenerators` should return "indices of samples selected for training", which is not what `Kfold` was doing. (Unless you wanted to train on 1/k of the observations and test on 1 - 1/k of the observations, but that's not what people usually mean by K-fold CV in my experience.)
#4 changed the `cross_validate` function to test on the indexes returned by `gen` and train on the others, but that would not be correct for the other `CrossValGenerators`. For example, for `LOOCV`, that would train on one obs at a time and test on the remaining n-1.

Disclaimer: I've stared at this enough that there's a good chance I've confused myself and this was correct to begin with. If so, sorry!
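For concreteness, here is a rough sketch (hypothetical helper name, not the package's actual implementation) of a K-fold generator that yields training indices the way the readme describes:

```julia
using Random

# Rough sketch (hypothetical helper, not MLBase.jl's implementation): yield the
# *training* indices for each fold, i.e. the complement of the held-out fold.
function kfold_training_sets(n::Integer, k::Integer)
    perm = randperm(n)                  # shuffled sample indices 1:n
    sizes = fill(div(n, k), k)
    sizes[1:rem(n, k)] .+= 1            # spread the remainder over the first folds
    stops = cumsum(sizes)
    starts = [1; stops[1:end-1] .+ 1]
    # Training set for fold i is everything outside perm[starts[i]:stops[i]]
    return [sort(vcat(perm[1:starts[i]-1], perm[stops[i]+1:end])) for i in 1:k]
end

# Each training set holds roughly (1 - 1/k) * n samples; the held-out 1/k
# is recovered as the complement, e.g. setdiff(1:n, train_inds).
for train_inds in kfold_training_sets(10, 3)
    test_inds = setdiff(1:10, train_inds)
    @assert length(train_inds) + length(test_inds) == 10
end
```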
Right. It works both ways -- I don't care. However, it should be consistent, and thus the README has to be corrected anyway, because the example below that description suggests the opposite :smile:
Oops forgot about the README. Is it more clear now?
Thanks!
The `Kfold` iterator was returning the validation set indexes instead of the training set indexes, which was inconsistent with the other `CrossValGenerator`s.

Commit 60de1f1 fixed the `cross_validate` function for the broken `Kfold` iterator but broke it for the other `CrossValGenerators`.

This pull request fixes the `Kfold` iterator and reverts 60de1f1.
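A rough sketch of the consuming side, under the convention this PR restores (hypothetical names, not the package's actual code): the cross-validation loop trains on the indices the generator yields and scores on their complement, so it is only correct if every generator yields training indices.

```julia
# Rough sketch of a cross_validate-style loop (hypothetical names, not the
# package's actual code), assuming every generator yields *training* indices.
function cross_validate_sketch(train_fn, score_fn, n::Integer, gen)
    scores = Float64[]
    for train_inds in gen
        test_inds = setdiff(1:n, train_inds)   # held-out samples for this fold
        model = train_fn(train_inds)           # fit on the training indices
        push!(scores, score_fn(model, test_inds))
    end
    return scores
end

# Under this convention a LOOCV-style generator yields n-1 training indices per
# fold, so each model trains on n-1 samples and is scored on the single
# held-out one; 60de1f1 inverted that for generators other than Kfold.
```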