[QST]: If the Louvain algorithm in cugraph considers a directed graph as a symmetric matrix?

rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library

https://docs.rapids.ai/api/cugraph/stable/

Apache License 2.0

1.69k stars 301 forks source link

[QST]: If the Louvain algorithm in cugraph considers a directed graph as a symmetric matrix? #4626

Closed darius513 closed 1 month ago

darius513 commented 2 months ago

What is your question?

I am impressed with how cugraph's Louvain algorithm approaches modularity similarly to the serial algorithm. Could you please clarify if the Louvain algorithm in cugraph considers a directed graph as a symmetric matrix (only entries in the lower triangle are considered)? Would this operation affect the modularity value?

Code of Conduct

[X] I agree to follow cuGraph's Code of Conduct
[X] I have searched the open issues and have found no duplicates for this question

ChuckHastings commented 2 months ago

The current implementation of Louvain in cugraph only operates on an undirected graph (although I notice that we don't fail in the C++ code if you specify a directed graph). We represent an undirected graph in cugraph as a symmetric matrix.

I have downloaded a paper on adapting Louvain to operate on a directed graph, but we have not explored adapting our Louvain implementation to handle this.

I am relatively comfortable in asserting that our current implementation will work poorly on an asymmetric graph, although I haven't tested this for Louvain.

darius513 commented 2 months ago

Thank you for your response. According to the introduction of the Louvain method in cugraph, when handling directed graphs, cugraph converts a directed graph into an undirected graph by creating a cugraph.Graph object. How does cugraph achieve this conversion? I currently know of two approaches:

Taking the lower triangular matrix of the directed graph (ignoring the edges in the upper triangular part) — as in Grappolo.
If either the directed edge (u, v) or the directed edge (v, u) exists, then there is a bidirectional edge between vertex u and vertex v — as in DiGraph.

ChuckHastings commented 1 month ago

Short answer... option 2.

The CSR-like structure we use in the C++ implementation of libcugraph is directed. We have no mechanism to discover and traverse incoming edges. When we create an undirected graph, there is a flag in the C++ graph creation logic that requests that the adjacency matrix be made symmetric. When this occurs, we execute the following low level logic (more efficiently than I'm describing):

For every edge (u,v) in the edge list, insert both (u,v) and (v,u) into the final edge list for the graph
Then create the graph. There is an optional flag in graph creation that will eliminate multi-edges, which will address the case where both (u,v) and (v,u) are both present in the original edge list - so that the edge will only appear once in the graph.

When you create the cugraph.Graph class in python, there is an equivalent feature (it will eventually become the same feature).