MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
The measureCollectionType option in the EvaluateClustering task (line 46) purports to allow the user to select which measures they want to use in evaluating their learner's performance. This, however, is not properly implemented.
The BatchCmd.getMeasureSelection method (line 88) takes an integer argument specifying which measure collections to select, but it adds the same measure collections regardless of the argument's value. These are EntropyCollection, F1, General, SSQ, SilhouetteCoefficient and StatisticalCollection.
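The behaviour described above can be illustrated with a minimal, self-contained sketch. This is not MOA's source code: the class MeasureSelectionSketch, the methods selectIgnoringArgument and selectByType, and the meaning assigned to each integer value are all assumptions made for illustration. Only the six collection names are taken from the description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MeasureSelectionSketch {

    // The six collection names listed in the description above.
    static final List<String> ALL = Arrays.asList(
            "EntropyCollection", "F1", "General", "SSQ",
            "SilhouetteCoefficient", "StatisticalCollection");

    // Mirrors the reported behaviour: the argument is accepted but never read.
    static List<String> selectIgnoringArgument(int measureCollectionType) {
        return new ArrayList<>(ALL);
    }

    // One possible fix. The meaning of each integer is an assumption made
    // for this sketch: 0 selects everything, 1 selects a smaller subset.
    static List<String> selectByType(int measureCollectionType) {
        switch (measureCollectionType) {
            case 0:
                return new ArrayList<>(ALL);
            case 1:
                return new ArrayList<>(Arrays.asList("F1", "General"));
            default:
                throw new IllegalArgumentException(
                        "Unknown measure collection type: " + measureCollectionType);
        }
    }

    public static void main(String[] args) {
        // Buggy variant: both arguments yield the same selection.
        System.out.println(selectIgnoringArgument(0).equals(selectIgnoringArgument(1)));
        // Fixed variant: the argument changes the selection.
        System.out.println(selectByType(0).equals(selectByType(1)));
    }
}
```

Rejecting unknown values with an exception, rather than silently falling back to the full set, is a design choice of this sketch; it makes a mistyped option visible instead of hiding it.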
Additionally, when the MeasureCollection returned by BatchCmd.getMeasureSelection is passed to the BatchCmd.getMeasures method (line 202), there is no check of whether the measures are enabled; the measure collection's default enabled values are used instead. This makes it impossible to add measure collections whose measures are disabled by default.
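The effect of relying only on default enabled values can be sketched as follows. Again, this is an illustration under stated assumptions rather than MOA's implementation: SketchCollection, measuresUsingDefaults, measuresUsingSelection and the measure names m1 and m2 are hypothetical, and the per-measure boolean flags are a simplified stand-in for whatever the real measure collection stores.

```java
import java.util.ArrayList;
import java.util.List;

public class EnabledMeasuresSketch {

    // Hypothetical stand-in for a measure collection: measure names plus
    // one default "enabled" flag per measure.
    static final class SketchCollection {
        final String[] names;
        final boolean[] defaultEnabled;

        SketchCollection(String[] names, boolean[] defaultEnabled) {
            if (names.length != defaultEnabled.length) {
                throw new IllegalArgumentException("names and flags differ in length");
            }
            this.names = names;
            this.defaultEnabled = defaultEnabled;
        }
    }

    // Mirrors the reported behaviour: only the defaults decide what is used.
    static List<String> measuresUsingDefaults(SketchCollection c) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < c.names.length; i++) {
            if (c.defaultEnabled[i]) {
                out.add(c.names[i]);
            }
        }
        return out;
    }

    // One possible fix: flags supplied by the caller replace the defaults.
    static List<String> measuresUsingSelection(SketchCollection c, boolean[] userEnabled) {
        if (userEnabled.length != c.names.length) {
            throw new IllegalArgumentException("expected one flag per measure");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < c.names.length; i++) {
            if (userEnabled[i]) {
                out.add(c.names[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SketchCollection offByDefault = new SketchCollection(
                new String[] {"m1", "m2"}, new boolean[] {false, false});
        // With defaults only, nothing from this collection is ever used.
        System.out.println(measuresUsingDefaults(offByDefault));
        // With an explicit selection, the caller can switch measures on.
        System.out.println(measuresUsingSelection(offByDefault, new boolean[] {false, true}));
    }
}
```

In this sketch a collection whose flags all default to false contributes no measures unless the caller's selection is consulted, which is the situation the paragraph above describes.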