graphology / graphology

A robust & multipurpose Graph object for JavaScript & TypeScript.
https://graphology.github.io
MIT License
1.31k stars 86 forks source link

Calculating communities and retrieving the modularity measure without calculating communities twice #502

Open deemeetree opened 10 months ago

deemeetree commented 10 months ago

Suppose I want to detect and assign the community structure of a graph to the nodes and to retrieve the modularity measure.

Currently, the documentation proposes these two functions:

In the browser (btw hard to find documentation on this):

graphologyLibrary.communitiesLouvain.assign(graph, {
    randomWalk: false,
    getEdgeWeight: 'weight',
});

const modularityMeasure = graphologyLibrary.communitiesLouvain.detailed(graph).modularity;

In node.js:

// To directly assign communities as a node attribute
louvain.assign(graph);

// If you want to return some details about the algorithm's execution
const modularityMeasure = louvain.detailed(graph).modularity;

Doesn't it mean I run the community partition twice?

Is there a more efficient way to calculate the partition, assign it to the nodes, and retrieve the modularity measure?

Thanks!

Yomguithereal commented 10 months ago

Hello @deemeetree,

Doesn't it mean I run the community partition twice?

Yes it does mean you are running the partition twice indeed.

Unfortunately the library does not expose a way to both assign the communities to the graph in the same time as returning the details. But there exist 2 workarounds for now:

  1. compute the detailed version of the results and use the community mapping to assign nodes the relevant community.
  2. assign to the graph then use the modularity computation to get the metric. This is probably not much a performance hit because when using the detailed method, the index still has to compute the actual modularity at the end (the index does not need to store it at all while running, we are only working on computed deltas).
deemeetree commented 10 months ago

Hello @Yomguithereal

Thank you for your response.

So what do you think would be the most optimal solution?

As I understand, the first option i get the detailed results (with modularity), then need to reiterate through the graph and assign the community values manually.

In the second option, I didn't quite understand: you suggest I calculate community using the louvain method above and then calculate it again using the community metric? Will the result be the same as in louvain then?

Wouldn't it make sense to expose a method which would both assign and get the results out?

Yomguithereal commented 10 months ago

So what do you think would be the most optimal solution?

The second one. Especially if you only need to retrieve the modularity score.

In the second option, I didn't quite understand: you suggest I calculate community using the louvain method above and then calculate it again using the community metric? Will the result be the same as in louvain then?

No, the modularity metric is just a continuous score, not a community partition.

Wouldn't it make sense to expose a method which would both assign and get the results out?

It could indeed, but I don't know what would be the good solution here. Because in your case you want only the modularity score, but some people might want to assign the community and retrieve the dendrogram for instance, so I am not sure, currently, what would be the useful thing to implement yet.