Waikato / moa

MOA is an open source framework for Big Data stream mining. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection and recommender systems) and tools for evaluation.
http://moa.cms.waikato.ac.nz/
GNU General Public License v3.0

CluStream.WithKmeans: How about setting the macro and micro cluster ids when getClusteringResult() is called, to show the labeling? #195

Open onofricamila opened 4 years ago

onofricamila commented 4 years ago

Hey! I was reading the source of moa/moa/src/main/java/moa/clusterers/clustream/WithKmeans.java and think it would be really useful if the macro clusters returned by getClusteringResult() each had a unique id (currently the id is -1 for all of them). Then, when the user calls getMicroClusteringResult(), the micro clusters would have coherent ids too, each reflecting the macro cluster it was assigned to (currently the id is also -1 for all micro clusters).

It seems that the setId() method in moa/moa/src/main/java/moa/cluster/Cluster.java is never used, and I do not know whether that is on purpose.

Having the micro clusters labeled would be useful for computing metrics that measure clustering quality, for plotting the different groups in different colors, and so on.
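To make the idea concrete, here is a rough sketch of the labeling I have in mind. It does not use the MOA classes: `SimpleCluster`, `labelMacroClusters` and `labelMicroClusters` are hypothetical stand-ins I made up for illustration, and the sketch assumes a micro cluster belongs to the macro cluster whose center is nearest, which may not match exactly how WithKmeans groups them internally.

```java
import java.util.ArrayList;
import java.util.List;

public class ClusterLabelingSketch {

    /** Hypothetical stand-in for a cluster: a center plus an id that starts at -1. */
    static final class SimpleCluster {
        final double[] center;
        private double id = -1;

        SimpleCluster(double... center) { this.center = center; }
        void setId(double id) { this.id = id; }
        double getId() { return id; }
    }

    /** Gives each macro cluster a unique id: its index in the list. */
    static void labelMacroClusters(List<SimpleCluster> macro) {
        for (int i = 0; i < macro.size(); i++) {
            macro.get(i).setId(i);
        }
    }

    /** Gives each micro cluster the id of the macro cluster with the nearest center. */
    static void labelMicroClusters(List<SimpleCluster> micro, List<SimpleCluster> macro) {
        for (SimpleCluster m : micro) {
            SimpleCluster nearest = null;
            double best = Double.POSITIVE_INFINITY;
            for (SimpleCluster c : macro) {
                double d = squaredDistance(m.center, c.center);
                if (d < best) {
                    best = d;
                    nearest = c;
                }
            }
            // With no macro clusters at all, the id stays at -1.
            if (nearest != null) {
                m.setId(nearest.getId());
            }
        }
    }

    /** Squared Euclidean distance between two points of equal dimension. */
    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<SimpleCluster> macro = new ArrayList<>();
        macro.add(new SimpleCluster(0.0, 0.0));
        macro.add(new SimpleCluster(10.0, 10.0));

        List<SimpleCluster> micro = new ArrayList<>();
        micro.add(new SimpleCluster(1.0, 1.0));
        micro.add(new SimpleCluster(9.0, 8.0));

        labelMacroClusters(macro);
        labelMicroClusters(micro, macro);

        for (SimpleCluster m : micro) {
            System.out.println("micro cluster id = " + m.getId());
        }
    }
}
```

In the real code the ids would presumably be set through Cluster.setId() inside getClusteringResult(), so that a later call to getMicroClusteringResult() returns micro clusters carrying the id of their macro cluster.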

If you agree with the above, I will submit a PR.

Any feedback is welcome. Thanks!!