Notes space for comparing different clustering methods

aarmey commented 2 years ago

Clustering methods should be compared using adjusted_mutual_info_score. It's adjusted for chance, so random labelings score close to zero on average. It's also symmetric and independent of label permutation.
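A quick way to check these three properties is a small sketch on synthetic labels. The label arrays and the 30% noise level are made up for illustration and are not from any DDMC dataset.

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score

rng = np.random.default_rng(0)
n = 2000

# Hypothetical reference labeling with 4 clusters.
labels_a = rng.integers(0, 4, size=n)

# Permutation invariance: renaming the cluster IDs leaves the score at 1.
relabel = np.array([2, 3, 0, 1])
ami_renamed = adjusted_mutual_info_score(labels_a, relabel[labels_a])

# Chance adjustment: an independent random labeling scores close to 0.
labels_random = rng.integers(0, 4, size=n)
ami_random = adjusted_mutual_info_score(labels_a, labels_random)

# Symmetry: swapping the arguments gives the same score.
# labels_b is labels_a with roughly 30% of entries replaced at random.
noise = rng.random(n) < 0.3
labels_b = np.where(noise, rng.integers(0, 4, size=n), labels_a)
ami_ab = adjusted_mutual_info_score(labels_a, labels_b)
ami_ba = adjusted_mutual_info_score(labels_b, labels_a)

print(ami_renamed, ami_random, ami_ab, ami_ba)
```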

Methods:

KMeans (hard)
Affinity Propagation (hard)
Birch (hard)
GMM
OPTICS (hard)
DBSCAN (hard)
Spectral (hard)
MeanShift (hard)
Hierarchical/Agglomerative (hard)

All hard methods have the X.labels_ attribute after fitting. GMM is soft and has no labels_, so its hard labels come from predict() instead.
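As a rough sketch of collecting one label vector per method with scikit-learn: the toy data from make_blobs, the cluster count of 3, and every hyperparameter below are placeholder assumptions, not values from this project. DDMC is left out because its API is not described in this thread.

```python
from sklearn.cluster import (
    AffinityPropagation,
    AgglomerativeClustering,
    Birch,
    DBSCAN,
    KMeans,
    MeanShift,
    OPTICS,
    SpectralClustering,
)
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Toy data standing in for the real dataset (assumption).
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Hard clustering methods; all hyperparameters are illustrative guesses.
hard_methods = {
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "AffinityPropagation": AffinityPropagation(random_state=0),
    "Birch": Birch(n_clusters=3),
    "OPTICS": OPTICS(min_samples=5),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
    "Spectral": SpectralClustering(n_clusters=3, random_state=0),
    "MeanShift": MeanShift(),
    "Agglomerative": AgglomerativeClustering(n_clusters=3),
}

labels = {}
for name, model in hard_methods.items():
    # Every hard method exposes labels_ once it has been fit.
    labels[name] = model.fit(X).labels_

# GMM is soft: take the most likely component as the hard label.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
labels["GMM"] = gmm.predict(X)

for name, lab in labels.items():
    print(name, len(set(lab)))
```

Note that OPTICS and DBSCAN mark noise points with the label -1, which adjusted_mutual_info_score will treat as one more cluster unless those points are handled separately.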

My thinking is we can run all of these, then calculate the pairwise similarities with adjusted_mutual_info_score and convert them to distances (e.g. 1 - AMI). Those distances can then be fed into MDS (precomputed) to visualize how similar the methods are to each other. I bet they all end up pretty similar and GMM matches DDMC at weight 0. DDMC then becomes quite different at higher weights.
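The comparison could be sketched roughly as follows. Since AMI is a similarity, this sketch uses 1 - AMI as the distance, which is one possible choice and an assumption here. The toy data and the three methods shown are stand-ins, and DDMC is omitted because its API is not described in this thread. The MDS keyword for precomputed input is `dissimilarity` in the scikit-learn releases I'm aware of, but it may be named differently in other versions.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import MDS
from sklearn.metrics import adjusted_mutual_info_score
from sklearn.mixture import GaussianMixture

# Overlapping toy blobs so the methods do not all agree (assumption).
X, _ = make_blobs(n_samples=200, centers=3, cluster_std=2.5, random_state=0)

# A few labelings to compare; extend with the other methods as needed.
labels = {
    "KMeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit(X).labels_,
    "Agglomerative": AgglomerativeClustering(n_clusters=3).fit(X).labels_,
    "GMM": GaussianMixture(n_components=3, random_state=0).fit(X).predict(X),
}
names = list(labels)

# Pairwise distance matrix: 1 - AMI, symmetric with a zero diagonal.
dist = np.zeros((len(names), len(names)))
for i, j in combinations(range(len(names)), 2):
    ami = adjusted_mutual_info_score(labels[names[i]], labels[names[j]])
    # Clip guards against tiny negative values from floating-point error.
    dist[i, j] = dist[j, i] = max(1.0 - ami, 0.0)

# Embed the methods in 2D from the precomputed distances.
mds = MDS(n_components=2, dissimilarity="precomputed", n_init=4, random_state=0)
coords = mds.fit_transform(dist)

for name, (x, y) in zip(names, coords):
    print(f"{name}: ({x:.3f}, {y:.3f})")
```

The resulting coordinates can be scatter-plotted with one point per method; methods that produce similar clusterings should land close together.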

mcreixell commented 2 years ago

Sounds great, I'll work on that. Thank you.

mcreixell commented 2 years ago

Just read you're available to implement this. Thanks!

meyer-lab / DDMC

Notes space for comparing different clustering methods #508