andompesta / ComE

Implementation of ComE algorithm

How to detect communities and measure similarity between communities? #5

Open rann1018 opened 4 years ago

rann1018 commented 4 years ago

Hello, and thanks a lot for your work. I ran your code successfully and found that the final .txt file contains only the node embeddings. How can I get the detected communities, and how can I measure the similarity between communities? I look forward to your reply.

abegehr commented 4 years ago

The detected communities, as represented by a Gaussian Mixture Model (GMM), are accessible at com_learner.g_mixture. See https://github.com/andompesta/ComE/blob/master/ADSCModel/community_embeddings.py#L24

You can add some code to the end of main.py to extract info from the communities after running ComE. To compute predicted labels for nodes (community assignment):

labels_pred = np.array(com_learner.g_mixture.predict(model.node_embedding)).astype(int)

You can get the GMM's means and covariances from com_learner.g_mixture.means_ and com_learner.g_mixture.covariances_.
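The question about similarity between communities can be approached from these means and covariances. The following is a minimal sketch, not part of ComE: it assumes each community is a full-covariance Gaussian (means of shape (k, d), covariances of shape (k, d, d)) and uses the Bhattacharyya distance as one possible choice of measure. The function names and the stand-in arrays are hypothetical; in practice you would pass com_learner.g_mixture.means_ and com_learner.g_mixture.covariances_ instead.

```python
import numpy as np


def bhattacharyya_distance(mean_a, cov_a, mean_b, cov_b):
    """Bhattacharyya distance between two multivariate Gaussians."""
    cov_avg = (cov_a + cov_b) / 2.0
    diff = mean_a - mean_b
    # Mean term: squared Mahalanobis distance under the averaged covariance.
    term_mean = diff @ np.linalg.solve(cov_avg, diff) / 8.0
    # Covariance term, computed with log-determinants for numerical stability.
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    term_cov = 0.5 * (logdet_avg - 0.5 * (logdet_a + logdet_b))
    return term_mean + term_cov


def community_distance_matrix(means, covariances):
    """Symmetric (k, k) matrix of pairwise distances between communities."""
    k = len(means)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d = bhattacharyya_distance(
                means[i], covariances[i], means[j], covariances[j]
            )
            dist[i, j] = dist[j, i] = d
    return dist


# Hypothetical stand-ins for g_mixture.means_ and g_mixture.covariances_
# (three communities in a 2-dimensional embedding space).
means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 0.0]])
covariances = np.array([np.eye(2), np.eye(2), 4.0 * np.eye(2)])

distances = community_distance_matrix(means, covariances)
# The Bhattacharyya coefficient turns distances into similarities in (0, 1].
similarities = np.exp(-distances)
```

A smaller distance (or a coefficient closer to 1) means the two community distributions overlap more in embedding space. Other measures, such as a symmetrized KL divergence, could be substituted in the same loop.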

You can also plot communities, nodes, and community assignments using plot_utils. See an example here: https://github.com/abegehr/ComE_BGMM/blob/master/main.py#L166

rann1018 commented 4 years ago

Thanks a lot for your help!

kostaspm commented 1 year ago

I assume the entries of labels_pred are in the same order as the nodes in model.vocab? I need to collect the communities so that I can evaluate various fairness definitions on node attributes.

abegehr commented 1 year ago

@kostaspm, I assume you're referencing this line:

labels_pred = np.array(com_learner.g_mixture.predict(model.node_embedding)).astype(int)

The labels in labels_pred are index-aligned with the rows of model.node_embedding: labels_pred[i] is the community of the node whose embedding is model.node_embedding[i]. I would assume that the rows of model.node_embedding are in turn index-aligned with model.vocab, since the Node2Vec sentences are generated from model.vocab: https://github.com/andompesta/ComE/blob/e1101c45e9fd29025389d6cc243b603d9c7d33dc/utils/embedding.py#L126
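If that alignment holds, collecting the members of each community is a simple grouping step. The sketch below is only an illustration under that assumption: node_ids is a hypothetical list of node identifiers ordered by embedding row (how to derive it from model.vocab depends on the vocab structure, which this thread does not show), and group_nodes_by_community is not a ComE function. It is worth checking the alignment on a few known nodes before relying on the result.

```python
from collections import defaultdict


def group_nodes_by_community(node_ids, labels_pred):
    """Map each community label to the list of node ids assigned to it.

    Assumes node_ids[i] and labels_pred[i] refer to the same node.
    """
    if len(node_ids) != len(labels_pred):
        raise ValueError("node_ids and labels_pred must have the same length")
    communities = defaultdict(list)
    for node_id, label in zip(node_ids, labels_pred):
        communities[int(label)].append(node_id)
    return dict(communities)


# Hypothetical stand-in data: five nodes, three communities.
node_ids = ["a", "b", "c", "d", "e"]
labels_pred = [0, 1, 0, 2, 1]

communities = group_nodes_by_community(node_ids, labels_pred)
```

The resulting dictionary can then be joined with node attributes to compute per-community statistics.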