The core change is the addition of evaluation metrics for multiclass classification problems. A few ad hoc changes were also made, such as removing some debug println(…) messages that were being written to the results file. Further details about the changes, as well as the tests used to verify the new evaluation metrics, can be found below.
DenseInstance.scala and SparseInstance.scala
Previously, def apply(index: Int): Double returned 0.0 when the index was invalid or did not exist. It is more appropriate to return NaN, since 0.0 can be interpreted as a valid feature value.
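A minimal sketch of the new behavior, using a simplified stand-in for DenseInstance (the field name and constructor here are assumptions, not the actual streamDM implementation):

```scala
// Simplified stand-in for DenseInstance; the real class is richer.
class DenseInstance(val features: Array[Double]) {
  // Return NaN for an out-of-range index so that a missing value is
  // distinguishable from a legitimate feature value of 0.0.
  def apply(index: Int): Double =
    if (index >= 0 && index < features.length) features(index)
    else Double.NaN
}

val inst = new DenseInstance(Array(1.5, 0.0, 2.0))
inst(1)   // 0.0 is a real feature value
inst(10)  // NaN signals that the index does not exist
```

With the old behavior, inst(10) would have returned 0.0 and been indistinguishable from inst(1).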
ClusteringEvaluator.scala, Evaluator.scala and BasicClassificationEvaluator.scala
Removed the method def getResults(), which only had a placeholder implementation.
Added an ExampleSpecification attribute.
Further details about the changes to BasicClassificationEvaluator.scala are given in its own section below.
streamDMJob.scala
Removed debug println(…)
FileReader.scala
Removed debug println(…) and unused imports
EvaluatePrequential.scala
EvaluatePrequential now sets the ExampleSpecification. Access to the ExampleSpecification is useful for Evaluator-derived classes, which can, for example, infer the number of classes in the problem.
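The idea can be sketched as follows; the types below are minimal stand-ins, so treat the names and fields as assumptions rather than the actual streamDM API:

```scala
// Minimal stand-in for the specification; the real type is richer.
case class ExampleSpecification(classLabels: Seq[String]) {
  def numberOfClasses: Int = classLabels.length
}

trait Evaluator {
  // EvaluatePrequential injects the specification before evaluation starts.
  var specification: ExampleSpecification = null
}

class BasicClassificationEvaluator extends Evaluator {
  // With the specification available, the evaluator can size its
  // confusion matrix instead of assuming a binary problem.
  def emptyConfusionMatrix(): Array[Array[Long]] =
    Array.ofDim[Long](specification.numberOfClasses,
                      specification.numberOfClasses)
}

val evaluator = new BasicClassificationEvaluator
evaluator.specification = ExampleSpecification(Seq("c1", "c2", "c3"))
// emptyConfusionMatrix() is now 3 x 3
```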
BasicClassificationEvaluator.scala
Added several multiclass evaluation metric implementations (e.g. precision, recall, …).
There is now a specific header for multiclass problems.
Two new options were added to control whether the per-class statistics and/or the confusion matrix are included in the output (these options do not apply to binary problems).
The ExampleSpecification attribute is used to discover the number of classes in the problem.
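The kind of per-class metrics added can be sketched from a confusion matrix (rows = actual class, columns = predicted class); this mirrors the computation, not the exact streamDM code:

```scala
// Per-class precision and recall from a k x k confusion matrix,
// where cm(actual)(predicted) counts examples.
def precisionRecall(cm: Array[Array[Long]]): Seq[(Double, Double)] = {
  val k = cm.length
  (0 until k).map { c =>
    val tp        = cm(c)(c).toDouble
    val predicted = (0 until k).map(r => cm(r)(c)).sum.toDouble // column sum
    val actual    = cm(c).sum.toDouble                          // row sum
    val precision = if (predicted > 0) tp / predicted else Double.NaN
    val recall    = if (actual > 0) tp / actual else Double.NaN
    (precision, recall)
  }
}

val cm = Array(
  Array(5L, 1L, 0L),
  Array(2L, 3L, 1L),
  Array(0L, 0L, 4L)
)
val pr = precisionRecall(cm)
// class 0: precision = 5/7, recall = 5/6
```

Macro-averaged statistics then follow by averaging the per-class values.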
Tests
These tests use the normalized cover type dataset. Instructions to obtain the dataset and prepare for the tests:
Place the dataset in ../data under the streamDM project directory.
OUTPUT: Avg statistics + per class statistics + confusion matrix (full output)
./spark.sh "200 EvaluatePrequential -l (trees.HoeffdingTree -l 0 -t 0.05 -g 200 -o) -s (FileReader -f ../data/covtypeNorm.arff -k 5810 -d 10 -i 581012) -e (BasicClassificationEvaluator) -h" 1> result_COVT.txt 2> log_COVT.log
OUTPUT: Avg statistics + confusion matrix (no per class statistics)
./spark.sh "200 EvaluatePrequential -l (trees.HoeffdingTree -l 0 -t 0.05 -g 200 -o) -s (FileReader -f ../data/covtypeNorm.arff -k 5810 -d 10 -i 581012) -e (BasicClassificationEvaluator -c) -h" 1> result_COVT_noPerclass.txt 2> log_COVT_noPerclass.log
OUTPUT: Avg statistics + per class statistics (no confusion matrix)
./spark.sh "200 EvaluatePrequential -l (trees.HoeffdingTree -l 0 -t 0.05 -g 200 -o) -s (FileReader -f ../data/covtypeNorm.arff -k 5810 -d 10 -i 581012) -e (BasicClassificationEvaluator -m) -h" 1> result_COVT_noConfMat.txt 2> log_COVT_noConfMat.log
OUTPUT: Avg statistics only
./spark.sh "200 EvaluatePrequential -l (trees.HoeffdingTree -l 0 -t 0.05 -g 200 -o) -s (FileReader -f ../data/covtypeNorm.arff -k 5810 -d 10 -i 581012) -e (BasicClassificationEvaluator -c -m) -h" 1> result_COVT_onlyAvg.txt 2> log_COVT_onlyAvg.log