Open nobrakal opened 6 years ago
Of course graph-creation will be slightly longer, maybe allow the user to choose (between a more "alga" structure and a quick one) can be a good idea.
For comparison, the creation of a clique with 1000
vertices from a list of edges with Fgl.PatriciaTree
takes 4
seconds, meanwhile for the actual Algebraic.Graph
it takes 500 ms
One little but significant advantage of the current implementation of edges
is that it has no constraint on the type of vertices a
, i.e. it's fully polymorphic edges :: [(a, a)] -> Graph a
. What you suggest will require Ord a
, ruling out some graphs, e.g. whose vertices are functions.
I think it's reasonable to add a new function, perhaps called edgesOrd
, that will rely on Ord a
to make the resulting expression smaller.
In general, this is an interesting research question, how to efficiently turn a list of edges into a compact algebraic representation. One of the most promising approaches is to use modular decomposition.
So concerning edgesOrd
, I made some experiments:
edgesOrd :: Ord a => [(a, a)] -> Graph a
edgesOrd = Map.foldrWithKey (\k v acc -> Overlay acc $ Connect (Vertex k) $ vertices1 v) Empty . Map.fromListWith (<>) . map (fmap return)
where
vertices1 (x :| xs) = foldr (Overlay . vertex) (vertex x) xs
It build an adjacency map and convert it to a Graph
:
edgesOrd
ensure a size x == edgeCount x + vertexCount x
while edges
give size x == 2 * edgeCount x + 1
Then benchmarks are rather good, except for some functions ^^:
vertexList: 0.67 (good)
isEmpty: 359.36 (bad)
vertexCount: 0.66 (good)
hasVertex: 84714.52 (bad)
edgeCount: 0.70 (good)
edgeList: 0.73 (good)
hasEdge: 0.58 (good)
hasSelfLoop: 0.58 (good)
addEdge: 1.15 (bad)
addVertex: 1.21 (bad)
removeVertex: 0.68 (good)
equality: 0.81 (good)
removeEdge: 0.48 (good)
transpose: 1.33 (bad)
dff: 0.95 (OK)
topSort: 0.96 (OK)
edges: 10.23 (bad)
So the main comment is that it is a good idea, but can lead to some problems (see hasVertex
). While still being faster than Fgl
(Fgl takes 3.2s
seconds to build a clique with 1000 vertices while using edgesOrd
alga takes only 500ms
).
hasVertex
can be optimized in hasVertexOrd
and I think others operations too, but this require to store somewhere the fact that the graph was created with edgesOrd
, because it give a predictable structure.
@nobrakal Interesting :) I think your implementation is not lazy enough (you build the whole map before producing the graph) and also tries to do too much work by eliminating duplicated edges, which leads to significant slowdown on some graph algorithms. Does switching to Data.Map.Lazy
help?
What about trying an intermediate solution, where you build something like Map a [a]
, where the lists are allowed to have duplicates?
Perhaps, you could even use something simpler like group
possibly after sort
:
https://hackage.haskell.org/package/base-4.11.1.0/docs/Data-List.html#v:group
Keep in mind that Map
is an expensive data structure, which supports fast searching, but you do not need this in edgesOrd
, so you are currently doing more work than necessary.
also tries to do too much work by eliminating duplicated edges ?
Am I trying to remove duplicated edges ? The type of the map is Map a (NonEmpty a)
, why this implicate removing duplicated edges ?
But you are right, I will explore others solutions :)
Am I trying to remove duplicated edges ?
Oops, I didn't spot that you use Map.fromListWith (<>)
-- indeed, this does not eliminate duplicates. Still I have a feeling you are doing more work than necessary by storing intermediate results in a balanced tree.
Yep, you were right, using groupBy
:
edgesEq :: Eq a => [(a, a)] -> Graph a
edgesEq = foldr (\(v,l) g -> Overlay g $ Connect (Vertex v) l) Empty . groupByWithVertices
groupByWithVertices :: Eq a => [(a,a)] -> [(a,Graph a)]
groupByWithVertices [] = []
groupByWithVertices (x:xs) = (fst x, vertices1 x ys) : groupByWithVertices zs
where
eq = on (==) fst
(ys,zs) = span (eq x) xs
vertices1 x = foldr (Overlay . vertex . snd) (vertex $ snd x)
This behave almost like the previous try, but require only an Eq
instance and is 2 times faster.
Because it is the same idea than my previous comment, resulting graphs are the same than in my previous benchmarks, and thus the benchmarks are equivalents for the version of this comment.
@nobrakal Good! So, you are saying with the new version, hasVertex
will still be around 84714x slower? I've no idea where such a huge slowdown comes from! Can you explain?
Here is a detailed benchmark of hasVertex
### Clique
#### 1000
##### 0
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 34.91 ns | 1.000 |
| Alga || 25.00 ms | 0.977 |
+---------++----------+-------+
##### 333
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 7.451 μs | 1.000 |
| Alga || 11.08 ms | 0.913 |
+---------++----------+-------+
##### 666
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 14.59 μs | 0.996 |
| Alga || 1.267 ms | 1.000 |
+---------++----------+-------+
##### 1000
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 16.09 ms | 0.974 |
| Alga || 22.70 ms | 1.000 |
+---------++----------+-------+
### Mesh
#### 1000
##### 0
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 34.83 ns | 0.999 |
| Alga || 40.70 μs | 1.000 |
+---------++----------+-------+
##### 333
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 13.81 μs | 0.999 |
| Alga || 39.08 μs | 0.999 |
+---------++----------+-------+
##### 666
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 20.62 μs | 1.000 |
| AlgaOld || 29.31 μs | 1.000 |
+---------++----------+-------+
##### 1000
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 39.67 μs | 0.998 |
| AlgaOld || 46.94 μs | 1.000 |
+---------++----------+-------+
I am a bit lost since the slowdown really come from "simple" foldg
. hasEdge
is faster with graphs built with edgesEq
for example.
The only explanation I see is that we are building huge Boolean thunks since the graph it a bit more "balanced" than one constructed with edges
. Such a huge drop is still very problematic, it seems very high for only a strictness problem...
Even stranger, for a Clique
the new representation is always worse than something created with edges
...
The only explanation I see is that we are building huge Boolean thunks since the graph it a bit more "balanced" than one constructed with edges .
@nobrakal You can test this hypothesis very easily: just generate a balanced Graph
expression, for example (1 + 1) + (1 + 1)
scaled to a few layers and see how it compares to a linear chain 1 + (1 + (1 + 1))
.
P.S.: Comparing to ((1 + 1) + 1) + 1
is also interesting. I've always wondered how this works, but never tried! So, it's a useful experiment.
Here are some benchs, using:
vertices $ replicate 32768 1
foldg Empty Vertex (flip Overlay) (flip Connect) $ vertices $ replicate 32768 1
benchmarking hasVertex 0/balanced
time 542.5 μs (537.6 μs .. 547.7 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 542.6 μs (537.2 μs .. 547.3 μs)
std dev 16.68 μs (13.36 μs .. 22.13 μs)
variance introduced by outliers: 23% (moderately inflated)
benchmarking hasVertex 0/vertices
time 436.1 μs (427.2 μs .. 448.1 μs)
0.997 R² (0.995 R² .. 0.999 R²)
mean 440.5 μs (435.4 μs .. 447.4 μs)
std dev 20.03 μs (15.94 μs .. 24.05 μs)
variance introduced by outliers: 40% (moderately inflated)
benchmarking hasVertex 0/transposedVertices
time 811.5 μs (801.9 μs .. 824.4 μs)
0.997 R² (0.994 R² .. 1.000 R²)
mean 814.9 μs (807.5 μs .. 827.8 μs)
std dev 31.26 μs (19.64 μs .. 53.27 μs)
variance introduced by outliers: 29% (moderately inflated)
benchmarking hasVertex 1/balanced
time 136.6 ns (135.5 ns .. 138.1 ns)
1.000 R² (0.999 R² .. 1.000 R²)
mean 136.1 ns (135.8 ns .. 136.9 ns)
std dev 1.471 ns (919.7 ps .. 2.818 ns)
benchmarking hasVertex 1/vertices
time 27.42 ns (26.67 ns .. 28.39 ns)
0.995 R² (0.992 R² .. 0.997 R²)
mean 27.50 ns (27.01 ns .. 28.12 ns)
std dev 1.859 ns (1.468 ns .. 2.554 ns)
variance introduced by outliers: 83% (severely inflated)
benchmarking hasVertex 1/transposedVertices
time 604.8 μs (598.1 μs .. 612.0 μs)
0.999 R² (0.998 R² .. 1.000 R²)
mean 597.2 μs (593.9 μs .. 602.6 μs)
std dev 13.72 μs (9.668 μs .. 19.87 μs)
variance introduced by outliers: 14% (moderately inflated)
So the balanced option is less efficient than the one obtained with vertices
(and a reversed structure is not very great ^^).
Also, I spotted something, with the benchs: I was not testing the last vertex of the domain, and this test is faster (the same problem that we had when improving NotEmptyGraph
: the last vertex comes first in the representation):
##### 999
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 13.18 μs | 1.000 |
| AlgaOld || 21.89 μs | 0.999 |
+---------++----------+-------+
But even for the "better positioned" vertex, the time is not so great (comparing to the ~30ns
for the better positioned vertex for actual edges
Your comment about the difference between 1 + ( +1 ( 1 + ... ))
and ( ( ... + 1 ) + 1 ) + 1
leads me to re-considerate the function, and in fact I was producing a somewhere bad representation (like ( ( ... + 1 ) + 1 ) + 1
).
Inverting the two arguments of Overlay
leads to more reasonable results ^^:
### Clique
#### 1000
##### 0
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 33.36 ns | 1.000 |
| Alga || 36.01 ns | 0.999 |
+---------++----------+-------+
##### 333
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 3.837 μs | 1.000 |
| AlgaOld || 7.245 μs | 1.000 |
+---------++----------+-------+
##### 666
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 9.044 μs | 1.000 |
| AlgaOld || 17.36 μs | 1.000 |
+---------++----------+-------+
##### 1000
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 14.12 ms | 0.970 |
| Alga || 22.65 ms | 0.981 |
+---------++----------+-------+
### Mesh
#### 1000
##### 0
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| AlgaOld || 33.34 ns | 1.000 |
| Alga || 33.48 ns | 1.000 |
+---------++----------+-------+
##### 333
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 11.69 μs | 1.000 |
| AlgaOld || 15.84 μs | 1.000 |
+---------++----------+-------+
##### 666
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 22.40 μs | 1.000 |
| AlgaOld || 30.80 μs | 1.000 |
+---------++----------+-------+
##### 1000
+---------++----------+-------+
| || Time | R² |
+=========++==========+=======+
| Alga || 35.34 μs | 1.000 |
| AlgaOld || 47.06 μs | 0.999 |
+---------++----------+-------+
SUMMARY:
* Alga was the fastest 5 times
* AlgaOld was the fastest 1 times
There was 2 ex-aequo
The function used was:
edges :: [(a, a)] -> Graph a
edges = overlays . map (uncurry edge)
edgesEq :: Eq a => [(a, a)] -> Graph a
edgesEq = foldr (\(v,l) -> Overlay (Connect (Vertex v) l)) Empty . groupByWithVertices
groupByWithVertices :: Eq a => [(a,a)] -> [(a,Graph a)]
groupByWithVertices [] = []
groupByWithVertices (x:xs) = (fst x, vertices1 x ys) : groupByWithVertices zs
where
eq = on (==) fst
(ys,zs) = span (eq x) xs
vertices1 x = foldr (Overlay . vertex . snd) (vertex $ snd x)
I am currently running the whole suite to see if it is introducing other problems !
vertexList: 0.71 (good)
isEmpty: 1.00 (OK)
vertexCount: 0.65 (good)
hasVertex: 0.83 (good)
edgeCount: 0.60 (good)
edgeList: 0.58 (good)
hasEdge: 1.64 (bad)
hasSelfLoop: 0.57 (good)
addEdge: 1.07 (OK)
addVertex: 1.15 (bad)
removeVertex: 0.69 (good)
equality: 0.56 (good)
removeEdge: 0.49 (good)
transpose: 0.83 (good)
dff: 1.08 (OK)
topSort: 0.99 (OK)
edges: 2.15 (bad)
Now hasEdge
is not so happy... I will try with #97
Comparing the #97 modification plus edgesEq
and only the #97 modification, I get a:
hasEdge: 0.75 (good)
So I think all seems good here
@nobrakal So, speaking very roughtly, the algorithms get a small (say 30%) speed up at the cost of 2x slower graph construction. I guess this is a good tradeoff, but I hope we can find further improvements for the edges
function, so that it achieves better compression.
Yep, this is certainly possible, requiring an Ord
instance.
I played a bit, and came to an horrible function I won't show here (one can find it at: https://github.com/nobrakal/alga/blob/edgesOrd/src/Algebra/Graph.hs#L313 ).
Roughly the idea I had was to build an adjacency map, and this time, to use it: Each time I am trying to connect two vertices x and y, I search if the two are connected to some vertices, c d e, so I can create:
x * ( y * (c + d + e))
So this is an improvement from the first version because it reduce the duplication of vertices c d and e.
So it compresses greatly:
*Algebra.Graph> let x = edgeList $ clique [1..1000] :: [(Int,Int)]
*Algebra.Graph> length x
499500
*Algebra.Graph> size $ edges x
999001
*Algebra.Graph> size $ edgesEq x
500500
*Algebra.Graph> size $ edgesOrd x
251498
But, sadly, I cannot come to very descent performances:
edges: 35.71 (bad)
Using specialized IntMap
and IntSet
:
edges: 11.17 (bad)
The benchmarks:
vertexList: 0.50 (good)
isEmpty: 0.96 (OK)
vertexCount: 0.47 (good)
hasVertex: 48439.75 (bad)
edgeCount: 0.56 (good)
edgeList: 0.55 (good)
hasEdge: 0.48 (good)
hasSelfLoop: 0.40 (good)
addEdge: 0.75 (good)
addVertex: 0.78 (good)
removeVertex: 0.51 (good)
equality: 0.44 (good)
removeEdge: 0.43 (good)
edges: 12.28 (bad)
Again an issue with hasVertex
otherwise numbers are rather but not extraordinary better.
I don't know if it worth investigating and optimizing this solution
Ok, I have managed to find a better version of edgesOrd
:
vertexList: 0.53 (good)
isEmpty: 1.00 (OK)
vertexCount: 0.53 (good)
hasVertex: 0.76 (good)
edgeCount: 0.53 (good)
edgeList: 0.52 (good)
hasEdge: 0.42 (good)
hasSelfLoop: 0.47 (good)
addEdge: 0.64 (good)
addVertex: 0.75 (good)
removeVertex: 0.48 (good)
equality: 0.43 (good)
removeEdge: 0.42 (good)
transpose: 0.60 (good)
reachable: 0.67 (good)
edges: 12.66 (bad)
It is 10 times slower than a regular edges
for every type that is an instance of Ord
, and corrected the previous problem with hasVertex
.
For Int
vertices, it is only 7 times slower.
Code is here: https://github.com/nobrakal/alga/blob/edgesOrd/src/Algebra/Graph.hs#L314
I will investigate further improvements :)
I made some further improvements (https://github.com/nobrakal/alga/blob/edgesOrd/src/Algebra/Graph.hs#L312)
With a general Ord
constraint
edges: 6.94 (bad)
Specialized to Int
edges: 4.05 (bad)
size
is also smaller:
> let x = edgeList $ clique [1..1000] :: [(Int,Int)]
> size $ edgesOrd x
250501
@nobrakal Many thanks for ongoing experiments!
The latest solution gives around 50% speed up across API, but it's quite complex and doesn't seem to provide any useful guarantees/properties for resulting expressions. We could, of course, include it into the API with a comment "This function is around 5x slower than edges
, but if you use it to construct graphs, all other functions are probably going to be 2x times faster", but this seems a bit vague and ad hoc.
If we are prepared to pay a lot in terms of performance and complexity, perhaps, it is worth looking into a more principled approach, for example, based on modular graph decomposition? Then we will at least be able to provide certain guarantees about the result, for example, that all graph modules have been collapsed into sub-expressions.
Of course, it's tempting to add edgesOrd
to improve Alga's standing in the benchmarks suite, but will Alga users actually want to construct graphs using edgesOrd
? Maybe they will prefer to construct graphs algebraically, which gives direct control over the resulting graph structure.
P.S.: By the way, I guess you could use the star
combinator in a few places in your implementation to get some simplification.
Perhaps, edgesOrd
should actually be called stars
and take a list [(a, [a])]
as input. Then there may be a separate function somewhere in Algebra.Graph.Utilities
that turns a list of edges into a list of adjacency lists. This will make the API simpler to understand and explain.
(But your latest implementation may be doing more than just stars
underneath!).
provide any useful guarantees/properties for resulting expressions
Yep, we only know that size (edgesOrd x) <= length x + nbVertices x
. So when possible, it is better than edgesEq
.
Note that the previous edgesEq
worked only with sorted list of edges (Since I did all my tests with edgesList
to produce a list of edges...). So we need an Ord
instance anyway, to sort things :)
but will Alga users actually want to construct graphs using edgesOrd?
When building a graph with Int
vertices, or something with an Ord
instance not too expensive, it can certainly be interesting. Since actual edges
is extremely fast, one can chose to spend some time to build a better graph for later use (and edgesOrd
is faster than FGL edges
equivalent anyway).
Maybe they will prefer to construct graphs algebraically, which gives direct control over the resulting graph structure.
Obviously this is the better thing to do :)
based on modular graph decomposition?
It certainly worth a try, but as you said this a complex thing, and the implementation has to be a bit optimized if don't want to take many time to build a graph !
edgesOrd should actually be called stars
I build a list of (a,Set a)
(NonEmpty Set to be more precise), so I can't use such a function, and having starsOrd
does not seems very useful...
I build a list of
(a,Set a)
(NonEmpty Set
to be more precise), so I can't use such a function, and havingstarsOrd
does not seems very useful...
I don't understand. Let's assume we add the following function to the API:
stars :: [(a, [a])] -> Graph a
stars = overlays . map (uncurry star)
Then, given xs :: [(a, Set a)]
you can call call stars
as stars . map (\(x, s) -> (x, toList s))
.
Ah this way, it is ok !
I just need to replace overlays
by something like overlays1
, without the Empty
leaf at the end :)
Sadly, using stars
drop a bit performances:
edges: 10.25 (bad)
(As a reminder, without stars
(so with graphs constructed in place:
edges: 8.65 (bad)
)
@nobrakal Strange. Presumably, you should have exactly the same code in your implementation, since you are getting a bunch of stars in the end, isn't it? Could you explain the slowdown?
You are right. I have reworked it to use non-empty lists when possible and thus avoiding a pattern match on the empty list, and I dropped the result to a:
edges: 9.48 (bad)
What left of the slowdown comes I think to an optimization that does not happen in these intricate mix of foldr/map
What left of the slowdown comes I think to an optimization that does not happen in these intricate mix of foldr/map
I would expect everything to be the same if stars
is inlined. It would be nice to eliminate any overhead left due to having a separate function.
I investigated about poor results of
alga
in https://github.com/haskell-perf/graphsI found that the
edges
function fromAlgebraic.Graph
is a bit naive:This can lead to very poor graph representation: For example with a
clique [0..9]
Doing a
edges $ edgeList $ clique [0..9]
will produce:This representation can lead to very poor performance on big graphs.
I just tried a little amelioration, converting the list
[(a,a)]
into an adjacency mapMap a [a]
, and then producing a graph usingfoldrWithKey (\k v acc -> Overlay acc $ Connect (Vertex k) $ vertices v)
.Only this little trick divided the running-time of function like
hasEdge
by two on big graphs. I am sure finding a better implementation foredges
can improvealga
times a lot.