Open bkamins opened 3 years ago
This is not a bug but a performance issue; however, a bug label was automatically added.
I think this might be because diameter works for the general case where an additional distance matrix is used. Internally it then uses dijkstra_shortest_paths instead of gdistances. Maybe we should have a special case for when no distance matrix is used.
In general, using gdistances is also not optimal, I think. igraph is much faster on this graph in the calculation of diameter.
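One possible shape for the special case discussed here, as a sketch: DefaultDistance is LightGraphs' marker type for unit edge weights, but the method below (eccentricity_fast) is illustrative, not the library's actual internals.

```julia
using LightGraphs

# Hypothetical specialization (not LightGraphs' real code): fall back to BFS
# (gdistances) when the caller passed no real distance matrix, and keep
# Dijkstra only for the weighted case.
function eccentricity_fast(g, v, distmx = weights(g))
    if distmx isa LightGraphs.DefaultDistance
        return maximum(gdistances(g, v))    # BFS: all edges have weight 1
    else
        return maximum(dijkstra_shortest_paths(g, v, distmx).dists)
    end
end
```

The diameter would then be the maximum of this eccentricity over all vertices, with the BFS branch taken whenever the graph is unweighted.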
I think that in general either Floyd-Warshall or Johnson's algorithm should be used (depending on the sparsity of the graph), but unfortunately I currently have too much work getting DataFrames.jl to its 1.0 release to dig into the LightGraphs.jl sources - sorry for this. (Or maybe there are special algorithms custom-made for diameter - I have not checked.)
I am not sure Floyd-Warshall is much more performant unless one finds a way to avoid allocating the temporary |V| x |V| distance matrix. Best would probably be to just run some benchmarks with different algorithms and graphs, if someone has time for that.
There might also be the possibility of a parallel diameter algorithm.
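To make the allocation concern concrete, here is a minimal self-contained Floyd-Warshall sketch for an unweighted graph given as an adjacency list: even though only the single maximum (the diameter) is wanted, the full |V| x |V| matrix must be materialized.

```julia
# Floyd-Warshall diameter sketch (assumes a connected graph). The point of
# interest is the fill(...) line: a dense |V| x |V| matrix is allocated even
# though only its maximum entry is ultimately used.
function fw_diameter(adj::Vector{Vector{Int}})
    n = length(adj)
    INF = typemax(Int) ÷ 2          # half of typemax avoids overflow on addition
    D = fill(INF, n, n)             # the temporary |V| x |V| distance matrix
    for v in 1:n
        D[v, v] = 0
        for w in adj[v]
            D[v, w] = 1             # unweighted: every edge has length 1
        end
    end
    for k in 1:n, i in 1:n, j in 1:n
        if D[i, k] + D[k, j] < D[i, j]
            D[i, j] = D[i, k] + D[k, j]
        end
    end
    return maximum(D)
end

# Example: the path 1-2-3-4 has diameter 3.
path4 = [[2], [1, 3], [2, 4], [3]]
fw_diameter(path4)
```

For a graph the size of the one in this issue (37700 vertices), that matrix alone would be over a gigabyte of Ints, which supports the point above about dense graphs being the only case where this is tolerable.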
All-source BFS should be faster than Floyd-Warshall for unweighted graphs. For weighted graphs I think Dijkstra will still beat Floyd-Warshall, but tests should confirm.
And we should definitely specialize distance metrics for unweighted graphs.
igraph does BFS. Floyd-Warshall allocates, but if the graph is dense this should not be a problem, as then |V| cannot be large anyway.
Would something like https://who.rocq.inria.fr/Laurent.Viennot/road/papers/ifub.pdf be overkill for our purposes?
Also, when you talk about distance metrics to specialize, what do you have in mind besides the diameter?
Sorry for the delayed response. In general, for unweighted graphs, where we're currently using Dijkstra, we can simply substitute BFS. There's no need to track weights in an unweighted graph, and BFS is more performant precisely because we skip those weight comparisons at each traversal step.
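The BFS substitution described above can be sketched in a few lines of plain Julia, with the graph as an adjacency list (no priority queue, no weight comparisons — just a FIFO frontier):

```julia
# BFS distances from source s in an unweighted graph given as an adjacency
# list (Vector{Vector{Int}}). Illustrative sketch, not LightGraphs code.
function bfs_distances(adj::Vector{Vector{Int}}, s::Int)
    n = length(adj)
    dist = fill(-1, n)              # -1 marks unvisited
    dist[s] = 0
    queue = [s]
    head = 1
    while head <= length(queue)
        u = queue[head]; head += 1
        for v in adj[u]
            if dist[v] == -1        # first visit is the shortest path in BFS
                dist[v] = dist[u] + 1
                push!(queue, v)
            end
        end
    end
    return dist
end

# Diameter = maximum eccentricity over all sources; O(|V| * (|V| + |E|)) for
# unweighted graphs. Assumes a connected graph.
bfs_diameter(adj) = maximum(maximum(bfs_distances(adj, s)) for s in 1:length(adj))
```

For example, `bfs_diameter([[2], [1, 3], [2, 4], [3]])` (a path on 4 vertices) gives 3.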
I have a graph gh on 37700 vertices and 289003 edges. Computing its diameter with a manual workaround takes around 250 seconds (this is probably not an optimal algorithm for the task, but at least it works).
However, when I run
diameter(gh)
the process takes so long that I did not wait for it to finish.
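The workaround code itself is not shown in this transcript; it was presumably along these lines (a hypothetical reconstruction using LightGraphs' per-source BFS, not the author's exact code):

```julia
using LightGraphs  # assumes the package is installed

# Hypothetical reconstruction of the manual workaround: one BFS per vertex
# via gdistances, taking the maximum distance seen. Assumes a connected graph.
manual_diameter(g) = maximum(maximum(gdistances(g, v)) for v in vertices(g))
```

The contrast the issue reports — this loop finishing in minutes while diameter(gh) does not finish — is consistent with the BFS-vs-Dijkstra discussion above.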