KeRNeLith opened 4 years ago
Here is a thing. I used to compute a centrality approximation as follows (a sketch follows this list):

1) Find all strongly connected components.
2) For each strongly connected component, do the following:
   1) Take a random node and place the cursor (the current execution unit) on it.
   2) Among all shortest paths from the cursor, find the longest one.
   3) Step along that path to the node at K percent of the path length. Here is what that means: imagine you have a path A -> ... -> B of 10 nodes in total. If your K is 0.9, you move the cursor to the 9th node of that path. By the way, doing this also gives you the eccentricity of the cursor (the length of its longest shortest path).
   4) Repeat steps 2 and 3 from the cursor, moving it accordingly while reducing K, until a step along the longest path moves by only one node.
   5) When the cursor hits the same node twice (i.e. it steps onto a node that has been the cursor before), you are done.
3) From all the results, keep the nodes with the lowest eccentricity.
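Here is a minimal sketch of the per-component walk described above, written in Python with networkx purely for illustration; it is not the library's actual API. The function name `approximate_center_of_component`, the starting `k = 0.9`, and the `decay = 0.75` cooling factor are my own assumptions.

```python
import random
import networkx as nx

def approximate_center_of_component(graph, k=0.9, decay=0.75):
    """Walk a cursor toward the approximate center of one strongly
    connected component, following the numbered steps above."""
    cursor = random.choice(list(graph.nodes))
    visited = {cursor}
    best_node, best_ecc = cursor, float("inf")

    while True:
        # All shortest paths from the cursor; the longest of them gives
        # both the cursor's eccentricity and the direction to step in.
        paths = nx.single_source_shortest_path(graph, cursor)
        longest = max(paths.values(), key=len)
        eccentricity = len(longest) - 1  # measured in edges

        if eccentricity < best_ecc:
            best_node, best_ecc = cursor, eccentricity
        if eccentricity == 0:  # singleton component, nowhere to step
            return best_node, best_ecc

        # Step K percent of the way along the longest shortest path
        # (e.g. k = 0.9 on a 10-node path lands on its 9th node),
        # then shrink k so later steps get smaller.
        step = max(1, round(k * eccentricity))
        cursor = longest[step]
        k *= decay

        # Stop once the cursor revisits a node it has already occupied.
        if cursor in visited:
            return best_node, best_ecc
        visited.add(cursor)
```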
What is the logic of this approach? If you take a random node and find its eccentricity, the direction in which its longest shortest path points almost always leads toward a center, so by stepping along the longest shortest path you will, on average, end up at a node with lower eccentricity.
Why not always step by just one node instead of decreasing the step size? Because this is a simulated-annealing-style approach: starting with large steps covers more of the solution space, and an approximation like this works best that way.
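To make the cooling schedule concrete, here is a hypothetical illustration (my own numbers: `k = 0.9`, decay factor `0.75`, and the eccentricity held fixed at 9, whereas in the real walk it changes on every move): the step size shrinks 8 -> 6 -> 5 -> ... until the cursor moves by only one node.

```python
# Illustrative cooling schedule only; not part of the algorithm itself.
k, ecc = 0.9, 9
while max(1, round(k * ecc)) > 1:
    print(f"k = {k:.3f} -> step {round(k * ecc)} nodes")
    k *= 0.75
```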
Why do we care about strongly connected components? Because if we simply restarted the approximation from randomly chosen nodes, most of the paths they produce would be the same, whereas starting nodes taken from different strongly connected components cannot produce the same paths. It is just the way of sampling starting nodes that works best.
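Putting the pieces together as described: run the walk once per strongly connected component and keep the node(s) with the lowest eccentricity found. The name `approximate_graph_center` is my own; `approximate_center_of_component` is the sketch from above. Note that each eccentricity is measured inside its component's subgraph, matching the per-component procedure in the list.

```python
def approximate_graph_center(digraph):
    candidates = [
        approximate_center_of_component(digraph.subgraph(component))
        for component in nx.strongly_connected_components(digraph)
    ]
    best_ecc = min(ecc for _, ecc in candidates)
    return [node for node, ecc in candidates if ecc == best_ecc]
```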
I have an implementation that is poorly commented but works fine; together with what I wrote above, it should make a bit more sense.
Implement the Centrality Approximation algorithm and unit test it.
The library had a draft implementation of this algorithm; here is the basis: