Hoosier-Clusters / clusim

An extended package for clustering similarity
MIT License

Return NaN value on node with no membership. #35

Closed jisungyoon closed 4 years ago

jisungyoon commented 4 years ago

Hi, I am trying to use Clusim as a performance measure for overlapping community detection.

In overlapping community detection, we compare the ground-truth communities with the detected communities. Sometimes a node has no membership in the ground truth, in the detected communities, or in both. In this case, Clusim returns a NaN value.

I think I can pre-process the data set before passing it to Clusim, but it would be nicer if Clusim could handle this case itself.

Thank you!

ajgates42 commented 4 years ago

Hi @jisungyoon, clusterings should always be defined over all data elements. None of the methods here are applicable to clusterings over different element sets. Traditionally, if your method doesn't cluster all elements, you can add the extra elements in singleton clusters or grouped into one giant cluster. This has motivated me to more properly document the Clustering errors. Now you cannot create a Clustering if it breaks the definition.
To help, I've also added a function to clugen that fills in the missing element assignments: clugen.cluster_missing_elements
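
For anyone who wants to do the pre-processing by hand, here is a minimal sketch of the two conventions mentioned above (singleton clusters, or one giant cluster). It uses plain Python dicts and does not call clusim. The helper name fill_missing_elements, its parameters, and the element-to-list-of-clusters layout are assumptions made for illustration; this is not the implementation of clugen.cluster_missing_elements.

```python
def fill_missing_elements(elm2clu, all_elements, mode="singleton"):
    """Return a copy of elm2clu in which every element has a cluster.

    Hypothetical helper for illustration only (not part of clusim).
    elm2clu      : dict mapping element -> list of cluster labels
    all_elements : iterable of every element that should be covered
    mode         : "singleton" puts each missing element in its own new
                   cluster; "giant" puts all missing elements in one
                   shared new cluster
    """
    if mode not in ("singleton", "giant"):
        raise ValueError("mode must be 'singleton' or 'giant'")

    # Keep only elements that already have at least one membership.
    filled = {elm: list(clus) for elm, clus in elm2clu.items() if clus}

    # Choose a fresh integer label larger than any integer label in use.
    used = {c for clus in filled.values() for c in clus}
    next_label = max((c for c in used if isinstance(c, int)), default=-1) + 1

    for elm in all_elements:
        if elm in filled:
            continue
        filled[elm] = [next_label]
        if mode == "singleton":
            next_label += 1  # each missing element gets its own label
    return filled


# Node 3 has an empty membership list and node 4 is absent entirely.
ground_truth = {0: [0], 1: [0, 1], 2: [1], 3: []}
nodes = list(range(5))

as_singletons = fill_missing_elements(ground_truth, nodes, mode="singleton")
as_giant = fill_missing_elements(ground_truth, nodes, mode="giant")
```

Whichever convention you pick, apply it to both the ground truth and the detected communities over the same node list, so that the two clusterings are defined over the same element set before they are compared.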