Girvan-Newman (Shortest-path betweenness), 2004 [11] |
Divisive |
Ο(nm2): Each iteration uses a tree structure to calculate edge betweeness of a graph in O(mn). Do this m times, once for each edge |
D,H,G |
Mathematica |
w,-d |
Iteratively removes links with the highest betweenness score (a score that measures the number of shortest paths that pass through an edge) |
Girvan-Newman (random-walk betweenness), 2004 [11][16] |
Divisive |
Ο((n+m)mn2) |
|
Uses random traversal through a graph |
Girvan-Newman (current-flow betweenness), 2004 [11][16] |
Divisive |
Ο((n+m)mn2) |
|
Uses a resistor network concept |
Infomaps[13] |
Model Based/Information Theory/ Random Walk |
Ο((n2)Log(n)) |
snap, igraph |
|
w,d |
Attempts to find partition that yields the minimum description length of an infinite random walk on the network. It determines community and network structure by analyzing the flow of information, proxied through random walk calculations, among various groups of nodes. |
Fast Newman [12] |
Quality Optimization |
O((m + n)n) |
A,H |
|
Snap |
w,-d |
Optimize Q over all divisions using agglomerative hierarchical clustering |
Clauset-Newman-Moore [9] |
Model Based, Quality Optimization |
O(md log n) where d is depth of dendrogram |
|
Snap, igraph |
Improves upon Fast Newman, remains based oon the greedy optimization of modularity. A modularity of >0.3 is a good indicator of a significant community structure |
BIGCLAM (Cluster Affilitation Model for Big Networks)[14] |
Model Based |
Unclear |
O,H,G,Q,R+ |
Snap |
-w,-d |
Partitioning that most likely produces the edges of a given graph, given the notion that nodes in the same community are more likely to share an edge. It captures the probability that a pair of nodes are connected as a funciton of that shared membership. http://www.youtube.com/watch?v=htWQWN1xAZQVideo Lecture |
Label Propogation, 2007 [17] |
Vertex Clustering |
|
A |
|
Label Propagation uses the structure of the graph to cluster vertices. Nodes are iteratively labeled with the label that most of its neighbors holds. Densely connected groups form a consensus with a unique label. |
Fast Unfolding, 2008 [19] |
Quality Optimization |
H |
igraph |
|
w,-d |
An iterative algorithm that calculates modularity based on nearest neighbros, joining nodes if the values are positive, and then builds a new network whose nodes are now communities based on step one. This it repeated until maximum modularity is achieved. |
WalkTrap, 2005 [18] |
Vertex Clustering, Quality Optimization |
O(mn2) |
A |
igraph |
w,-d |
Defines a similarity metric between nodes based on distance to all other nodes from a random walk. Merges nodes in an agglomarative fashion that minimizes distance from other nodes in the community. Then picks partition that has highest modularity. |
CESNA (Communities from Edge Structure and Node Attributes) [26] |
Subgraph Discovery |
1 million nodes in 10 hours |
Q,R+ |
Snap |
-w,-d |
Detects overlapping communities in networks with node attributes. Assumes that nodes that belong ot the same communities are likely to be connected to each other, communities can overlap, the more common communities a set of nodes share the more likely they are to be connected, and nodes in the same community are likely to share common attributes |
Radicchi et al [15] |
Model Based, Divisive |
O(m2) on large systems |
None |
|
-d |
A local algorithm that returns similar results to Girvan Newman betweenness but reduces computational cost by using a local quantitiy--the number of loops of a given length that contains an edge--to choose which edge to remove. |
Leading Eigenvector, 2006 [20] |
|
|
igraph |
Latent Space Models [21] |
|
CoDA (Communities through directed affiliations)[23] |
Model Based |
Unclear |
O,H,G,Q,R+ |
snap |
-w,d |
Extension of BIGCLAM that incorporated directedness to discover community type. It detects overlapping communities as wel as cohesively and 2-mode communities, whether connected or hierarically nested. |
Clique Clique Percolation [27] |
Cohesive Subgraph Discovery |
G,D,S |
CFinder |
|
-w,-d |
Begins by identifying all cliques of size k in a network. All cliques are collapsed into single vertex. Vertices are connected if cliques that they are representing share k-1 members—connected components identify which cliques compose a community. Input value of k must be chosen and typically is between 3 and 6. |
CONGA (Cluster-Overlapping Newman Girvan Algorithm), 2007 [28] |
Divisive |
O(m3) |
D,G,O |
None |
-w,-d |
CONGA, like Girvan Newman, relies on calculation of betweenness to determine splits, but splits on vertices, rather than edges. The algorithm specifies when and how to split vertices allowing for overlapping membership |
CONGO (CONGA Optimized), 2008 [32] |
Divisive |
O(n log n) Sparse Networks |
D,G,O |
None |
-w,-d |
Expands upon the CONGA algorithm by achieving a local form of betweenness |
Fuzzy Overlapping Group (FOG) [29] |
Divisive |
O(L3) |
O |
ORA |
|
A combination of a stochastic model and maximum likelihood method group detection algorithm for inferring fuzzy overlapping groups. Groups are built from links which allows for multiple memberships varying levels of association from entities to groups. |
Bipartite Stochastic Block Model (biSBM), 2014 [30] |
Model Based |
O(NaKa(Ka + k)) + O(NbKb(Kb + k)) |
Matlab |
w |
The degree corrected Bipartite Stochastic Block Model uses probability to search for maximum likelihood partition of a network into communities. It models a network's observed degree sequence before finding community structure. It divdes verticies into groups of like types. These calculations are done iteratively to continually propose movements of vertices that increases the likelihood function. In ref to run time, k is avg degree of each node, N_a and N_b or number of vertices of type a and b, K_a and K_b are number of communities of type a and b |
Spectral Clustering [31] |
Model Based |
Determined by Eigenvalue Computation |
L,Q,R |
w,d |
Spectral clustering algorithms rely on the calculation of eigenvalues of a similarity matrix to find optimal partitions, give a predetermined number of partitions. It relaxes the complex problem of minimizing cut ratio over every possible k-partition to find the k-smallest eigenvalues and related eigenvectors of the Laplacian of the graph. |
RolX |
Role identification/discovery |
Q |
SNAP |
-w,-d |
Generates list of roles as "profiles" of local/global node metrics, and assigns nodes a linear combination of these roles. |