Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

ArrayIndexOutOfBoundsException with an ARFF class index different from -1 in Perceptron #147

Open festradasolano opened 6 years ago

festradasolano commented 6 years ago

An ArrayIndexOutOfBoundsException is thrown when training a Perceptron regressor (moa.classifiers.rules.functions.Perceptron) through the MOA API on an ARFF stream whose class index is not the last attribute (i.e., not -1). The AdaptiveNodePredictor regressor (moa.classifiers.rules.functions.AdaptiveNodePredictor) has the same issue because it uses the Perceptron code.

Specifically, the bug is in the trainOnInstanceImpl(Instance inst) method of the Perceptron class. In lines 184-185, the field numericAttributesIndex is filled with the indexes of the numeric input attributes, that is, every numeric attribute except the class. For example, for a dataset with 5 numeric attributes where the class is the 2nd column, numericAttributesIndex is [0, 2, 3, 4] (1 is the class index).

However, in line 207, the method modelAttIndexToInstanceAttIndex adds 1 to each value in numericAttributesIndex that is greater than the class index. In the above example, it returns 0, 3, 4, and 5, respectively. But the instance's array of attribute values has only 5 entries (indexes 0 to 4), so 4 is the maximum valid index. Accessing index 5 therefore throws an ArrayIndexOutOfBoundsException.
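The index arithmetic above can be reproduced without MOA. The following is only a minimal sketch, not the Perceptron source: the class name ClassIndexShiftSketch and the helpers inputIndexes and shiftPastClass are hypothetical stand-ins for the behavior described in this report, and a plain double array stands in for the instance's attribute values.

```java
import java.util.Arrays;

public class ClassIndexShiftSketch {

    // Hypothetical stand-in for how numericAttributesIndex is filled:
    // every attribute index except the class index.
    static int[] inputIndexes(int numAttributes, int classIndex) {
        int[] result = new int[numAttributes - 1];
        int k = 0;
        for (int i = 0; i < numAttributes; i++) {
            if (i != classIndex) {
                result[k++] = i;
            }
        }
        return result;
    }

    // Hypothetical stand-in for the shift described in this report:
    // add 1 when the index is greater than the class index.
    static int shiftPastClass(int index, int classIndex) {
        return index > classIndex ? index + 1 : index;
    }

    public static void main(String[] args) {
        double[] values = new double[5]; // 5 attributes, valid indexes 0..4
        int classIndex = 1;              // class is the 2nd column

        int[] stored = inputIndexes(values.length, classIndex);
        System.out.println("stored indexes: " + Arrays.toString(stored));

        try {
            for (int index : stored) {
                int shifted = shiftPastClass(index, classIndex);
                System.out.println(index + " -> " + shifted);
                double unused = values[shifted]; // fails once shifted reaches 5
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

In this sketch the stored indexes are already instance indexes, so the extra shift is applied twice in effect. With classIndex set to 4 (the last column), no stored index is greater than the class index, no shift happens, and the loop completes, which matches the report that only a class index other than the last one triggers the exception.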

Hope this helps!