Adding the calculation for precision, recall, fbeta-score, ...

This change adds precision, recall, specificity, and fbeta-score to BasicClassificationEvaluator.scala. All of these metrics are relevant when evaluating imbalanced classification problems.
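For reference, the definitions behind these metrics can be sketched as follows. This is a minimal illustration, not the code in this change: the object name, the method names, and the convention of returning 0.0 on a zero denominator are all assumptions made for the example.

```scala
// Hypothetical helper (not part of streamDM): binary metrics from confusion-matrix counts.
object BinaryMetricsSketch {
  // Guarded division; returning 0.0 for an empty denominator is an assumed convention.
  private def ratio(num: Double, den: Double): Double =
    if (den == 0.0) 0.0 else num / den

  // Fraction of predicted positives that are truly positive.
  def precision(tp: Double, fp: Double): Double = ratio(tp, tp + fp)

  // Fraction of true positives that were found (also called sensitivity).
  def recall(tp: Double, fn: Double): Double = ratio(tp, tp + fn)

  // Fraction of true negatives that were correctly rejected.
  def specificity(tn: Double, fp: Double): Double = ratio(tn, tn + fp)

  // F-beta = (1 + beta^2) * P * R / (beta^2 * P + R); beta = 1.0 gives the usual F1.
  def fbeta(tp: Double, fp: Double, fn: Double, beta: Double = 1.0): Double = {
    val p = precision(tp, fp)
    val r = recall(tp, fn)
    val b2 = beta * beta
    ratio((1.0 + b2) * p * r, b2 * p + r)
  }
}
```

With tp = 8, fp = 2, fn = 8, precision is 0.8 and recall is 0.5, so beta = 0.5 gives roughly 0.714, beta = 1.0 roughly 0.615, and beta = 2.0 roughly 0.541: a beta below 1 weights precision more heavily, a beta above 1 weights recall more heavily.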

~It is possible to configure the beta value for fbeta-score in the EvaluatePrequential.scala (parameter -b), the way it is passed to BasicClassificationEvaluator is by using a generic dictionary of parameters. This approach can be used to pass specific parameters to Evaluators.scala descendants without disrupting the general interface.~ The beta hyperparameter has been moved to BasicClassificationEvaluator, so there is no longer any need for a dictionary of parameters or a similar mechanism. The tests were updated accordingly.

The current version extends the existing metric calculation in BasicEvaluationPrequential, so it is based on a single confusion matrix and cannot yet properly evaluate multi-class problems.

A future revision of BasicClassificationEvaluator should include a way to properly calculate the multi-class versions of the aforementioned metrics (e.g. using macro and micro averaging).
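As a rough sketch of what macro and micro averaging could look like for precision, assuming a square multi-class confusion matrix indexed as (true class)(predicted class). None of these names exist in streamDM; they are hypothetical and only illustrate the two averaging schemes.

```scala
// Hypothetical sketch (not part of streamDM): macro vs. micro averaged precision.
object MultiClassAveragingSketch {
  // m(i)(j) = number of instances of true class i predicted as class j (assumed layout).
  type Confusion = Array[Array[Double]]

  // Guarded division; returning 0.0 for an empty denominator is an assumed convention.
  private def ratio(num: Double, den: Double): Double =
    if (den == 0.0) 0.0 else num / den

  // One-vs-rest (tp, fp, fn) counts for every class k.
  def perClassCounts(m: Confusion): Seq[(Double, Double, Double)] =
    m.indices.map { k =>
      val tp = m(k)(k)
      val fp = m.indices.map(i => m(i)(k)).sum - tp // predicted k, true class differs
      val fn = m(k).sum - tp                        // true class k, predicted otherwise
      (tp, fp, fn)
    }

  // Macro average: compute precision per class, then take the unweighted mean.
  def macroPrecision(m: Confusion): Double = {
    val perClass = perClassCounts(m).map { case (tp, fp, _) => ratio(tp, tp + fp) }
    ratio(perClass.sum, perClass.size.toDouble)
  }

  // Micro average: pool the counts over all classes, then compute precision once.
  def microPrecision(m: Confusion): Double = {
    val counts = perClassCounts(m)
    val tp = counts.map(_._1).sum
    val fp = counts.map(_._2).sum
    ratio(tp, tp + fp)
  }
}
```

Macro averaging gives every class the same weight, which makes minority classes visible on imbalanced data, while micro averaging is dominated by the frequent classes. Recall and fbeta-score could be averaged the same way from the same per-class counts.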

Another small change within BasicClassificationEvaluator is that the internal representation of the confusion matrix was changed from a (Double, Double, Double, Double) to a Map[String, Double]. This makes it less likely to use, for example, fn where fp was intended, since one has to explicitly write something like x("fn") instead of x._1.
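The difference between the two representations can be illustrated with a small sketch. The counts, the field order of the tuple, and the key names other than "fn" are assumptions made for this example and may not match the actual code.

```scala
// Hypothetical illustration (not the streamDM code): positional vs. keyed confusion matrix.
object ConfusionRepresentationSketch {
  // Positional: the reader must remember which slot holds which count.
  // The order (tp, fn, fp, tn) is assumed here purely for illustration.
  val asTuple: (Double, Double, Double, Double) = (40.0, 10.0, 5.0, 45.0)

  // Keyed: every access names the count it reads.
  val asMap: Map[String, Double] =
    Map("tp" -> 40.0, "fn" -> 10.0, "fp" -> 5.0, "tn" -> 45.0)

  // Recall written both ways; the keyed version documents itself.
  val recallFromTuple: Double = asTuple._1 / (asTuple._1 + asTuple._2)
  val recallFromMap: Double = asMap("tp") / (asMap("tp") + asMap("fn"))
}
```

One trade-off of the keyed form is that a misspelled key is only detected at runtime, when the Map lookup throws a NoSuchElementException, rather than at compile time.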

Tests

Using f0.5-score:
./spark.sh "EvaluatePrequential -l (org.apache.spark.streamdm.classifiers.bayes.MultinomialNaiveBayes) -e (BasicClassificationEvaluator -b 0.5) -h" 1> log_fbeta_05.txt 2> error.txt

Using f1.0-score (default configuration, no need to set -b):
./spark.sh "EvaluatePrequential -l (org.apache.spark.streamdm.classifiers.bayes.MultinomialNaiveBayes) -h" 1> log_f1.txt 2> error.txt

Using f2-score:
./spark.sh "EvaluatePrequential -l (org.apache.spark.streamdm.classifiers.bayes.MultinomialNaiveBayes) -e (BasicClassificationEvaluator -b 2.0) -h" 1> log_fbeta_2.txt 2> error.txt
