twitter / cassovary

Cassovary is a simple big graph processing library for the JVM
http://twitter.com/cassovary
Apache License 2.0
1.05k stars 150 forks source link

Improve performance of Closeness Centrality #174

Open pankajgupta opened 9 years ago

pankajgupta commented 9 years ago

Currently the code performs a new Breadth-first-search for every node. Improve its performance.

pankajgupta commented 9 years ago

Benchmark results reported in #173 show actual absolute numbers on speed.

pankajgupta commented 9 years ago

One potential algorithm is to first find strongly connected components of the given graph G (http://en.wikipedia.org/wiki/Strongly_connected_component). All the nodes of a single SCC must have the same value of number of nodes reachable. So consider the condensation of G which is a DAG. and obtain the number of reachable nodes from each node in this DAG (perhaps by doing modified breadth-first-search traversals starting from nodes with no incoming edges).

There might be other / better algorithms around.

bmckown commented 9 years ago

+1