Hoosier-Clusters / clusim

An extended package for clustering similarity
MIT License

AssertionError #31

Closed by tgj505 5 years ago

tgj505 commented 5 years ago

Hello! From what I can tell, CluSim only compares clusterings with the same number of elements. Is this correct?

I'm trying to find a relatively efficient, generalized similarity measure between two rather large graphs with different numbers of nodes. Your wonderful CluSim package seemed to fit the bill, until I kept hitting an AssertionError with most of the similarity measures:

    File "/anaconda3/lib/python3.6/site-packages/clusim/sim.py", line 67, in contingency_table
      assert clustering1.n_elements == clustering2.n_elements

Thanks for putting this together, and if you have any suggestions I'd be interested.

yy commented 5 years ago

Hi! Yes, it assumes different clusterings of the same set of elements. If one clustering has more nodes than the other, it will not work. One non-ideal (and possibly misleading) workaround is to consider only the elements that appear in both clusterings.
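
A minimal sketch of that workaround, in plain Python so it does not depend on any particular CluSim call. It assumes each clustering is available as a dict mapping an element to a list of cluster labels; the helper name `restrict_to_common` and the toy data are hypothetical. The coverage figure is included because a low value means much of either graph was discarded, which is one way the comparison can go wrong.

```python
def restrict_to_common(elm2clu_a, elm2clu_b):
    """Return copies of both membership dicts limited to shared elements."""
    # Dict key views support set intersection directly.
    common = elm2clu_a.keys() & elm2clu_b.keys()
    restricted_a = {elm: list(elm2clu_a[elm]) for elm in common}
    restricted_b = {elm: list(elm2clu_b[elm]) for elm in common}
    return restricted_a, restricted_b


# Hypothetical toy memberships for two graphs with different node sets.
graph1 = {"a": [0], "b": [0], "c": [1], "d": [1]}
graph2 = {"b": [0], "c": [0], "d": [1], "e": [1], "f": [1]}

shared1, shared2 = restrict_to_common(graph1, graph2)

# Fraction of all nodes (union of both graphs) that survives the restriction.
coverage = len(shared1) / len(graph1.keys() | graph2.keys())

print(sorted(shared1), coverage)
```

Both restricted dicts now cover the same elements, so they satisfy the equal-`n_elements` condition from the assertion above. Whether they can be handed to CluSim directly depends on how its clustering objects are constructed, which is not shown here.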

tgj505 commented 5 years ago

Thanks for the quick response! That's what I had thought but wanted to double check.