mdabros / SharpLearning

Machine learning for C# .Net
MIT License

0.31.4.0: Add CrossValidationUtilities.GetKFoldCrossValidationIndexSets, Refactor CrossValidation. #117

Closed mdabros closed 5 years ago

mdabros commented 5 years ago

Extract the internal GetKFoldCrossValidationIndexSets method from the CrossValidation<T> class. This makes it possible to compute k-fold cross-validation index sets outside of CrossValidation<T> itself.

Usage:

// Targets to create KFold Index Sets from.
var targets = new double[] { 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3 };
// Sampler to control the sampling of the sets. In this case Stratified.
var sampler = new StratifiedIndexSampler<double>(seed: 242);

var indexSets = CrossValidationUtilities.GetKFoldCrossValidationIndexSets(sampler,
    foldCount: 4, targets: targets);

foreach (var (trainingIndices, validationIndices) in indexSets)
{
    // Do model training and accumulate predictions,
    // to form a fully k-fold cross validated prediction array.
}

Note that when targets.Length is not evenly divisible by foldCount, so that samplesPerFold = targets.Length / foldCount leaves a remainder, the last validationIndices contains the remaining values (making it larger than the others), and the last trainingIndices excludes them (making it smaller than the others).
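
To make the remainder behaviour concrete, here is a minimal, self-contained sketch of the split arithmetic. It is not the SharpLearning implementation: the class name KFoldRemainderSketch and its Split method are hypothetical, and the sketch skips the sampler entirely (no shuffling or stratification), taking contiguous index ranges instead. It only illustrates how integer division of the sample count by the fold count leaves the remainder in the last fold.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not part of SharpLearning.
public static class KFoldRemainderSketch
{
    // Splits the indices 0..sampleCount-1 into foldCount contiguous folds.
    // Assumption: no sampling or stratification, unlike the real utility.
    public static List<(int[] TrainingIndices, int[] ValidationIndices)> Split(
        int sampleCount, int foldCount)
    {
        if (foldCount < 1 || foldCount > sampleCount)
        {
            throw new ArgumentException(
                "foldCount must be between 1 and sampleCount.");
        }

        // Integer division: any remainder ends up in the last fold.
        var samplesPerFold = sampleCount / foldCount;
        var allIndices = Enumerable.Range(0, sampleCount).ToArray();
        var folds = new List<(int[] TrainingIndices, int[] ValidationIndices)>();

        for (var fold = 0; fold < foldCount; fold++)
        {
            var start = fold * samplesPerFold;
            var isLastFold = fold == foldCount - 1;

            // The last fold takes everything that is left over.
            var count = isLastFold ? sampleCount - start : samplesPerFold;

            var validation = allIndices.Skip(start).Take(count).ToArray();
            // Training indices are all indices not used for validation.
            var training = allIndices.Except(validation).ToArray();

            folds.Add((training, validation));
        }

        return folds;
    }
}
```

Under these assumptions, KFoldRemainderSketch.Split(10, 3) gives samplesPerFold = 3, so the validation sets have 3, 3 and 4 indices and the corresponding training sets have 7, 7 and 6 indices.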