Open indigorose1 opened 9 years ago
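ARI (the adjusted Rand index) is a metric for similarity between clusterings, or the accuracy of assigning vertices to the correct groups (without depending on how the groups are labeled). This makes it a prime candidate for a loss function: since a higher ARI means better agreement, we would minimize something like 1 - ARI.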
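For anyone who wants to see the arithmetic, here is a minimal pure-Python sketch of ARI computed from pair counts, plus a `1 - ARI` wrapper. The function names `adjusted_rand_index` and `ari_loss` are my own, not from any code discussed in this thread. Returning 1.0 in the degenerate cases is an assumption on my part.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings of the same items (pair-counting form)."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # assumption: fewer than two items counts as perfect agreement

    # Contingency counts: how many items fall in each (true, pred) cell.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    # Number of item pairs that share a cell, a true group, a predicted group.
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs  # chance-level agreement
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # assumption: degenerate case, e.g. a single group in both
    return (sum_cells - expected) / (max_index - expected)


def ari_loss(labels_true, labels_pred):
    """Hypothetical loss: 0 for identical partitions, larger for worse ones."""
    return 1.0 - adjusted_rand_index(labels_true, labels_pred)
```

Note that swapping the label names (0 for 1 and vice versa) leaves the score unchanged, which is the "without assigning labels" property. If scikit-learn is available, `sklearn.metrics.adjusted_rand_score` computes the same quantity.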
So how does ARI compare in performance to other metrics, especially simpler difference-based ones? Is there a way to determine the accuracy of these tests? I'm thinking of a way to cross-validate, possibly with random permutations. What does everyone think?
In terms of ARI, couldn't you rerun the clustering techniques ten times and then compute a t-test to determine whether the difference is statistically significant? More simply, it seems beneficial to use cross-validation when assessing whether a difference in ARI is significant.
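To make the rerun-and-test idea concrete, here is a rough sketch of a Welch (unequal-variance) t statistic over two lists of ARI scores, one list per clustering technique. The choice of Welch's variant and the name `welch_t` are my assumptions, not something settled in this thread. It also assumes the runs are independent, which may not hold if every run reuses the same data and differs only in random seed.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(scores_a, scores_b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    scores_a and scores_b are ARI values from repeated runs of two
    clustering methods; each list needs at least two values.
    """
    na, nb = len(scores_a), len(scores_b)
    if na < 2 or nb < 2:
        raise ValueError("need at least two runs per method")

    # Sample variances (n - 1 in the denominator), scaled by sample size.
    va = variance(scores_a) / na
    vb = variance(scores_b) / nb
    se2 = va + vb
    if se2 == 0:
        raise ValueError("no variation across runs; t is undefined")

    t = (mean(scores_a) - mean(scores_b)) / sqrt(se2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se2 ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

This only gives the statistic and degrees of freedom. To get a p-value, one option is `scipy.stats.ttest_ind(scores_a, scores_b, equal_var=False)`, which runs the same test.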
I'm still not super clear on how the adjusted Rand index fits into the loss function.