erelsgl / limdu

Machine-learning for Node.js
GNU Lesser General Public License v3.0

Confusion in README #65

Open Berkmann18 opened 5 years ago

Berkmann18 commented 5 years ago

I've noticed that the cross-validation example uses macroAverage:

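// Both accumulators are passed to limdu.utils.test below, so both must exist.
var microAverage = new limdu.utils.PrecisionRecall();
var macroAverage = new limdu.utils.PrecisionRecall();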

limdu.utils.partitions.partitions(dataset, numOfFolds, function(trainSet, testSet) {
    console.log("Training on "+trainSet.length+" samples, testing on "+testSet.length+" samples");
    var classifier = new IntentClassifier();
    classifier.trainBatch(trainSet);
    limdu.utils.test(classifier, testSet, /* verbosity = */0,
        microAverage, macroAverage);
});

macroAverage.calculateMacroAverageStats(numOfFolds);
console.log("\n\nMACRO AVERAGE:"); console.dir(macroAverage.fullStats());

But the test function in utils.testAndTrain uses the name macroSum for the same thing, which is confusing. Should the README say macroSum, or is the function using the wrong term?
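One possible reading of the two names, offered only as a guess: the object may hold running sums while the folds are being tested, and only become an average once it is divided by the number of folds, which would fit the calculateMacroAverageStats(numOfFolds) call above. The sketch below illustrates that idea in isolation. It does not reproduce limdu's internals, and `makeMacroAccumulator`, `addFold`, and `average` are hypothetical names.

```javascript
// Hypothetical illustration, NOT limdu's code: per-fold scores are summed,
// and the "macro average" only exists after dividing by the fold count.
function makeMacroAccumulator() {
    const sums = { precision: 0, recall: 0 };
    return {
        // Add one fold's scores to the running sums (the "macroSum" stage).
        addFold(stats) {
            sums.precision += stats.precision;
            sums.recall += stats.recall;
        },
        // Divide the sums by the number of folds (the "macroAverage" stage).
        average(numOfFolds) {
            return {
                precision: sums.precision / numOfFolds,
                recall: sums.recall / numOfFolds,
            };
        },
    };
}
```

If that reading is right, both names describe the same object at different stages, and the README and the function could still agree on one of them.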

Also, unrelated to this: would it be a good idea to add optional randomization to the partitions (e.g., the way train-test-split does it)?
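To make the suggestion concrete, here is a minimal sketch of what optional shuffling could look like. It does not use or modify limdu's API: `shuffledPartitions`, its `shuffle` option, and the `rng` parameter are hypothetical names, and the fold assignment (index modulo the fold count) is my own assumption, not necessarily how limdu.utils.partitions splits the data.

```javascript
// Hypothetical helper, NOT part of limdu: k-fold partitioning with an optional shuffle.
function shuffledPartitions(dataset, numOfFolds, callback, options = {}) {
    const { shuffle = false, rng = Math.random } = options;
    const data = dataset.slice(); // copy, so the caller's array is not mutated

    if (shuffle) {
        // Fisher-Yates shuffle over the copy.
        for (let i = data.length - 1; i > 0; i--) {
            const j = Math.floor(rng() * (i + 1));
            [data[i], data[j]] = [data[j], data[i]];
        }
    }

    for (let fold = 0; fold < numOfFolds; fold++) {
        // The sample at index i is tested in fold (i % numOfFolds) and trained on otherwise.
        const trainSet = data.filter((_, i) => i % numOfFolds !== fold);
        const testSet = data.filter((_, i) => i % numOfFolds === fold);
        callback(trainSet, testSet, fold);
    }
}
```

With `shuffle` left at its default of false the order of the dataset is preserved, so existing behaviour would not change unless the caller opts in. Passing a seeded `rng` would make shuffled runs reproducible.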