Closed: snowleopard closed this issue 6 years ago
@nobrakal 6.625 ms for a graph with 10000 vertices and edges feels slow-ish. Can you add a comparison with the containers library?
I started an implementation for containers, but it doesn't provide a hasEdge or similar. I could write one myself, but I am not sure it would be very efficient...
I will try soon with fgl :)
Created a gist with results here: https://gist.github.com/nobrakal/9acc8f86554fea3d34187cfafec0978c
Note that hasEdge/cycle10000/(-1,2) is now 12ms, but these benchmarks ran on my laptop (basically a tablet), which is less powerful than my personal computer, where the benchmarks ran last week (I don't know if this is important, but it could simply explain the difference...).
As a side note, fgl's PatriciaTree took 125ns on the same machine for the same operation. So I think that my library is not working...
I will investigate!
@nobrakal I think this looks fine: fgl probably supports hasEdge of O(log(n+m)) complexity, whereas in Alga the complexity is O(s): we don't annotate the graph construction tree with any additional information, so you need to explore the whole tree to answer the hasEdge query :)
With containers you can probably respond in O(m/n) time on average, i.e. you will need to traverse the whole adjacency list of a particular source vertex. However, the worst case complexity could be O(n) for a highly-connected vertex.
We need to benchmark hasEdge on dense graphs too.
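The complexity contrast described above can be sketched as follows. This is a toy illustration under stated assumptions, not Alga's or containers' actual code: `Expr`, `Summary`, `hasEdgeExpr`, `hasEdgeAdj` and the `cycle*` helpers are hypothetical names, and s is taken here to mean the size of the graph expression. The expression-based query has no index to consult, so in the worst case it walks the whole tree; the adjacency-list query only scans the neighbours of one source vertex.

```haskell
import Data.Array (bounds, (!))
import Data.Graph (Graph, buildG)
import Data.Ix (inRange)

-- Toy graph expression (hypothetical; it only mirrors the idea of a
-- "graph construction tree" without any extra annotations).
data Expr a
  = Empty
  | Vertex a
  | Overlay (Expr a) (Expr a)
  | Connect (Expr a) (Expr a)

-- Per-subexpression summary: contains u? contains v? contains edge (u, v)?
data Summary = Summary { hasU :: Bool, hasV :: Bool, hasUV :: Bool }

-- One bottom-up pass over the expression: O(s) in the worst case.
hasEdgeExpr :: Eq a => a -> a -> Expr a -> Bool
hasEdgeExpr u v = hasUV . go
  where
    go Empty = Summary False False False
    go (Vertex x) = Summary (x == u) (x == v) False
    go (Overlay a b) = merge False (go a) (go b)
    -- Connect adds every edge from a vertex of `a` to a vertex of `b`.
    go (Connect a b) = let p = go a; q = go b in merge (hasU p && hasV q) p q
    merge extra p q =
      Summary (hasU p || hasU q) (hasV p || hasV q) (extra || hasUV p || hasUV q)

-- Adjacency-list query on a containers graph: scans only the neighbours of u.
-- The bounds check makes out-of-range vertices (such as -1) return False.
hasEdgeAdj :: Int -> Int -> Graph -> Bool
hasEdgeAdj u v g = inRange (bounds g) u && v `elem` (g ! u)

-- A directed cycle 0 -> 1 -> ... -> n-1 -> 0, assuming n >= 1.
cycleEdges :: Int -> [(Int, Int)]
cycleEdges n = [(i, (i + 1) `mod` n) | i <- [0 .. n - 1]]

cycleExpr :: Int -> Expr Int
cycleExpr n =
  foldr Overlay Empty [Connect (Vertex a) (Vertex b) | (a, b) <- cycleEdges n]

cycleAdj :: Int -> Graph
cycleAdj n = buildG (0, n - 1) (cycleEdges n)
```

For instance, `hasEdgeExpr (-1) 2 (cycleExpr 10000)` has to visit every leaf before answering, while `hasEdgeAdj (-1) 2 (cycleAdj 10000)` answers after a bounds check. On a dense graph the adjacency lists grow, which is why the dense case is worth benchmarking separately.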
I am continuing this work in https://github.com/nobrakal/bench-graph
Proofs of concept are working for alga, fgl and containers. I am now trying to allow comparison across the benchmarks (and to produce readable output from all these benchmarks) (soon available through cabal bench compare).
Concerning generic graphs, I have implemented Path, Circuit and Complete. I am thinking about a way to get good data on them. For now, for hasEdge, I am only benchmarking the presence of (-1,1) and (0,2) in each graph. I want to have a list of "representative edges" for each graph (existing, at the start of the graph, deeply buried in it, or non-existing).
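One possible way to pick such "representative edges" is sketched below, purely as an illustration. The edge-list definitions of the three generic graphs and the names `pathEdges`, `circuitEdges`, `completeEdges`, `representativeEdges` and `absentEdge` are assumptions made for this example, not the definitions used in bench-graph; in particular, the complete graph here has no self-loops.

```haskell
import Data.List (nub)

type Edge = (Int, Int)

-- Assumed edge-list definitions over vertices 0 .. n-1.
pathEdges, circuitEdges, completeEdges :: Int -> [Edge]
pathEdges n = [(i, i + 1) | i <- [0 .. n - 2]]
circuitEdges n
  | n <= 0 = []
  | otherwise = pathEdges n ++ [(n - 1, 0)]
completeEdges n = [(i, j) | i <- [0 .. n - 1], j <- [0 .. n - 1], i /= j]

-- Pick the first, the middle and the last edge of the list, without repeats.
-- This assumes list order is a rough proxy for "how deep" an edge sits.
representativeEdges :: [Edge] -> [Edge]
representativeEdges [] = []
representativeEdges es@(e : _) = nub [e, es !! (length es `div` 2), last es]

-- Vertices are assumed non-negative, so this edge is never present.
absentEdge :: Edge
absentEdge = (-1, 1)
```

Picking a fixed, small number of edges per graph keeps the benchmark count independent of the graph size, at the cost of a fairly coarse picture of each access pattern.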
> I am now trying to allow comparison over the benchmarks (and to produce a readable output from all this benchmarks) (soon available through cabal bench compare).
@nobrakal Please run this on CI and give a link to the CI result from README -- then it would be always easy to see the latest comparison results.
> I want to have a list of "representative edges" on each graphs (existing, at the start of the graphs, deep-buried in it, or non-existing).
Are you talking about what I called 'edge access pattern' and 'non-existing edge access pattern' in one of the early comments?
> Please run this on CI and give a link to the CI result from README -- then it would be always easy to see the latest comparison results.
Done. For now the output is not really nice, but it allows quick comparison, and the data are there, hidden somewhere in /tmp.
> Are you talking about what I called 'edge access pattern' and 'non-existing edge access pattern' in one of the early comments?
Yes, exactly :) The problem is choosing which edges to take for a pattern, because for big graphs these patterns can represent a lot of edges!
@nobrakal Awesome! That's already pretty interesting/informative -- lots of optimisations to do for Alga ;)
> The problem is to choose which edge to take for a pattern
Sure, at some point we'll need to think about this more carefully, but I think at first the priority is to make sure we have a robust and convenient benchmarking infrastructure. You're on the right track 👍
The problem is in fact the infrastructure ^^.
I have reworked it: it now exposes a function:
benchOver :: (GraphImpl g, NFData g) => [(GenericGraph,[Int])] -> ToFuncToBench g -> Benchmark
This produces a Benchmark of a function tested over the GenericGraphs passed in.
The idea is to define standard graphs (like Complete, Path and Circuit) to be passed, and to map benchOver over a [ToFuncToBench a].
For alga, this results in something like:
isEmpty' :: ToFuncToBench (Graph Int)
isEmpty' = createConsumer "isEmpty" isEmpty

hasEdge' :: ToFuncToBench (Graph Int)
hasEdge' = ToFuncToBench "hasEdge (not in graph)" $ FuncWithArg (uncurry hasEdge) show . take 2 . edgesNotInGraph

allBenchs :: [Benchmark]
allBenchs = map (benchOver graphs) [hasEdge', isEmpty']
  where
    graphs = [
      (path, take 5 tenPowers),
      (circuit, take 5 tenPowers),
      (complete, take 3 tenPowers)
      ]
Concerning the 'non-edge access pattern', for now I use the list difference between the complete graph and the actual one.
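A minimal sketch of that list-difference idea, assuming graphs are given as edge lists over vertices 0 .. n-1. The names `allPairs`, `missingEdges` and `missingEdgesSet` are hypothetical and are not the `edgesNotInGraph` used above; whether self-loops count as part of the complete graph is also an assumption (they are included here).

```haskell
import Data.List ((\\))
import qualified Data.Set as Set

type Edge = (Int, Int)

-- Every ordered pair over vertices 0 .. n-1, self-loops included (assumption).
allPairs :: Int -> [Edge]
allPairs n = [(i, j) | i <- [0 .. n - 1], j <- [0 .. n - 1]]

-- Edges of the complete graph that are absent from the given edge list.
-- (\\) rescans the remaining list for every element it removes, so this is
-- only reasonable for small graphs.
missingEdges :: Int -> [Edge] -> [Edge]
missingEdges n present = allPairs n \\ present

-- Same result (when `present` has no duplicates), with a set lookup per
-- candidate pair instead of a list scan.
missingEdgesSet :: Int -> [Edge] -> [Edge]
missingEdgesSet n present = filter (`Set.notMember` s) (allPairs n)
  where
    s = Set.fromList present
```

Since only a couple of non-edges are actually benchmarked (`take 2` in the example above), a lazily filtered version such as `missingEdgesSet` avoids materialising the full difference for large graphs.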
The next steps are to support disconnected graphs, make a better use of criterion, and start an approach to benchmark graph creation.
@nobrakal I've sent a PR (https://github.com/nobrakal/bench-graph/pull/1) trying to simplify the infrastructure. Perhaps, we'll be able to simplify it even further.
@snowleopard @aloiscochard The https://github.com/haskell-perf/graphs repo is now live :)
I suggest moving this discussion over there, since it doesn't only concern alga.
As a note, I added support for hash-graph!
@nobrakal Awesome! Yes, let's continue discussions there.
I've renamed this issue and will remove the current benchmarking code from Alga -- it will become much faster to build.
May I suggest updating the original description to point to https://github.com/haskell-perf/graphs ? It was interesting to read through the whole issue but maybe pointing to the results up top makes sense?
For bonus points the issue could be renamed "Move benchmarks to haskell-perf/graphs" 😁
@jmatsushita Agreed! Done, and with a bonus point :)
UPDATE: We moved benchmarks to a separate repository to keep this package lightweight and fast to build, see some comments below.
Current benchmarks are a mess: https://github.com/snowleopard/alga/blob/master/doc/benchmarks.md