Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

BatchCmd does not recognize EvaluateClustering's encoding of "no instance limit" as -1 #114

Closed. richard-moulton closed this issue 6 years ago.

richard-moulton commented 6 years ago

The instanceLimitOption in the EvaluateClustering task (line 43) describes its function as:

Maximum number of instances to test/train on (-1 = no limit).

If "-1" is passed as the argument, however (for example, when the streamOption is set to a FileStream and the user wants the whole ARFF file processed), no instances are passed to the learner, and the results written to the dumpFile are NULL apart from the header.

This is because the BatchCmd run method (line 147) uses the following while loop to determine if another instance should be passed:

while(m_timestamp < totalInstances && stream.hasMoreInstances())

The totalInstances variable is BatchCmd's local copy of the value the user selected for EvaluateClustering's instanceLimitOption, but it is created with no understanding of what "-1" means. Instead, totalInstances is simply set to -1 in the BatchCmd constructor (line 68), so the condition in the while loop above evaluates to false on the very first check and the loop body never runs.
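To make the failure mode concrete, here is a minimal, self-contained sketch. It is not the actual MOA code and not necessarily what the eventual fix looks like. Only totalInstances, m_timestamp, and the quoted loop condition come from the source. The class name, the NO_LIMIT constant, the helper methods, and the assumption that the timestamp counter starts at 0 are all hypothetical and exist only for illustration.

```java
// Hypothetical sketch: only totalInstances, m_timestamp and the quoted loop
// condition come from the issue; every other name here is an assumption.
final class InstanceLimitSketch {

    // Assumed sentinel, taken from the option text "(-1 = no limit)".
    static final int NO_LIMIT = -1;

    // The condition as quoted in the issue. With totalInstances == -1 and a
    // non-negative timestamp it can never be true.
    static boolean originalCondition(int timestamp, int totalInstances, boolean hasMore) {
        return timestamp < totalInstances && hasMore;
    }

    // One possible sentinel-aware variant: treat -1 as "no upper bound".
    static boolean sentinelAwareCondition(int timestamp, int totalInstances, boolean hasMore) {
        boolean underLimit = totalInstances == NO_LIMIT || timestamp < totalInstances;
        return underLimit && hasMore;
    }

    // Simulates the loop over a stream holding streamSize instances and
    // returns how many instances would be processed. The counter is assumed
    // to start at 0, standing in for m_timestamp.
    static int countProcessed(int totalInstances, int streamSize, boolean sentinelAware) {
        int timestamp = 0;
        while (true) {
            boolean hasMore = timestamp < streamSize; // stands in for stream.hasMoreInstances()
            boolean keepGoing = sentinelAware
                    ? sentinelAwareCondition(timestamp, totalInstances, hasMore)
                    : originalCondition(timestamp, totalInstances, hasMore);
            if (!keepGoing) {
                break;
            }
            timestamp++; // one instance handed to the learner
        }
        return timestamp;
    }

    public static void main(String[] args) {
        System.out.println(countProcessed(NO_LIMIT, 5, false)); // original check: 0 instances
        System.out.println(countProcessed(NO_LIMIT, 5, true));  // sentinel-aware: all 5
        System.out.println(countProcessed(3, 5, true));         // explicit limit still respected: 3
    }
}
```

With the original check, a five-instance stream and a limit of -1 yield zero processed instances, which matches the empty dumpFile described above. The sentinel-aware variant processes the whole stream while still honouring an explicit positive limit.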

richard-moulton commented 6 years ago

Pull request #115 was accepted and merged; the issue has been addressed.

rwandodo commented 6 years ago

Hello richard-moulton,

Could you send me your email, please? I have some questions for you. rwando@hotmail.com