snowleopard / alga

Algebraic graphs
MIT License

Move benchmarks to haskell-perf/graphs #14

Closed: snowleopard closed this issue 6 years ago

snowleopard commented 7 years ago

UPDATE: We moved benchmarks to a separate repository to keep this package lightweight and fast to build, see some comments below.


Current benchmarks are a mess: https://github.com/snowleopard/alga/blob/master/doc/benchmarks.md

snowleopard commented 6 years ago

@nobrakal 6.625 ms for a graph with 10000 vertices and edges feels slow-ish. Can you add a comparison with the containers library?

nobrakal commented 6 years ago

I started an implementation for containers but it doesn't provide a hasEdge or similar. I could write one myself but I am not sure it will be very efficient...

I will try soon with fgl :)

nobrakal commented 6 years ago

Created a gist with results here: https://gist.github.com/nobrakal/9acc8f86554fea3d34187cfafec0978c

Note that hasEdge/cycle10000/(-1,2) is now 12ms, but these benchmarks ran on my laptop (basically a tablet), which is much less powerful than the personal computer where I ran the benchmarks last week (I don't know if it is important, but it could simply explain this difference...).

As a side note, fgl's PatriciaTree took 125ns on the same machine for the same operation. So I think that my library is not working...

I will investigate!

snowleopard commented 6 years ago

@nobrakal I think this looks fine: fgl probably supports hasEdge with O(log(n+m)) complexity, whereas in Alga the complexity is O(s), where s is the size of the graph expression: we don't annotate the graph construction tree with any additional information, so you need to explore the whole tree to answer the hasEdge query :)
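
To make the O(s) point concrete, here is a minimal sketch of a single-pass edge query over an unannotated expression tree. The `Graph` type below is a stand-in that mirrors the four constructors of Alga's graph expressions, and `hasEdgeSketch` is a hypothetical name; this is not Alga's actual implementation, only an illustration of why every node has to be visited.

```haskell
-- Stand-in for an algebraic graph expression (assumption: same four
-- constructors as Alga's Graph, redefined here to stay self-contained).
data Graph a
  = Empty
  | Vertex a
  | Overlay (Graph a) (Graph a)
  | Connect (Graph a) (Graph a)

-- One pass over the expression. For each subtree we compute three facts:
-- does it contain u, does it contain v, does it contain the edge (u, v).
-- Nothing is cached in the tree, so the cost is linear in its size s.
hasEdgeSketch :: Eq a => a -> a -> Graph a -> Bool
hasEdgeSketch u v g = found
  where
    (_, _, found) = go g

    go Empty         = (False, False, False)
    go (Vertex x)    = (x == u, x == v, False)
    go (Overlay a b) = combine False (go a) (go b)
    go (Connect a b) = combine True (go a) (go b)

    -- Connect adds an edge from every vertex on the left to every vertex
    -- on the right, so (u, v) appears if u is on the left and v on the right.
    combine connected (ua, va, ea) (ub, vb, eb) =
      (ua || ub, va || vb, ea || eb || (connected && ua && vb))
```

For example, `hasEdgeSketch 1 2 (Connect (Vertex 1) (Vertex 2))` holds, while swapping the two query vertices does not, since `Connect` is directed.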

With containers you can probably answer in O(m/n) time on average, i.e. you will need to traverse the whole adjacency list of a particular source vertex. However, the worst case complexity could be O(n) for a highly-connected vertex.
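
A minimal sketch of what such a query could look like on top of containers' `Data.Graph`, which stores a graph as an array of adjacency lists. `hasEdgeContainers` and `cycleGraph` are hypothetical helpers written for this illustration, not part of containers or of any benchmark suite.

```haskell
import Data.Array (bounds, (!))
import Data.Graph (Graph, Vertex, buildG)
import Data.Ix (inRange)

-- Edge query as a linear scan of the source vertex's adjacency list.
-- The bounds check makes queries with an unknown source, such as (-1, 2),
-- return False instead of raising an array index error.
hasEdgeContainers :: Graph -> (Vertex, Vertex) -> Bool
hasEdgeContainers g (u, v) = inRange (bounds g) u && v `elem` (g ! u)

-- Hypothetical helper: a directed cycle on the vertices 0 .. n-1.
cycleGraph :: Int -> Graph
cycleGraph n = buildG (0, n - 1) [(i, (i + 1) `mod` n) | i <- [0 .. n - 1]]
```

On a cycle every adjacency list has length one, so the scan is cheap; on a dense graph the same code walks up to n entries, which is the worst case described above.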

We need to benchmark hasEdge on dense graphs too.

nobrakal commented 6 years ago

I am continuing this work in https://github.com/nobrakal/bench-graph

Proofs of concept are working for alga, fgl and containers. I am now trying to allow comparison across the benchmarks (and to produce readable output from all these benchmarks) (soon available through cabal bench compare).

Concerning generic graphs, I have implemented Path, Circuit and Complete. I am thinking about a way to get good data on them. For now, for hasEdge, I am only benchmarking the presence of (-1,1) and (0,2) in each graph. I want to have a list of "representative edges" for each graph (existing, at the start of the graph, deep-buried in it, or non-existing).
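
One possible way to pick such edges, as a rough sketch only: the generators and `representativeEdges` below are hypothetical, and they assume graphs on the vertices 0 .. n-1 given as plain edge lists, with Complete taken to be directed and without self-loops.

```haskell
-- Hypothetical edge-list generators for graphs on the vertices 0 .. n-1.
pathEdges, circuitEdges, completeEdges :: Int -> [(Int, Int)]
pathEdges n     = [(i, i + 1) | i <- [0 .. n - 2]]
circuitEdges n  = pathEdges n ++ [(n - 1, 0) | n > 0]
completeEdges n = [(i, j) | i <- [0 .. n - 1], j <- [0 .. n - 1], i /= j]

-- Picks the first, middle and last edge of the list, plus (-1, 1), which
-- cannot exist because vertices are assumed to be non-negative.
representativeEdges :: [(Int, Int)] -> [(Int, Int)]
representativeEdges []               = [(-1, 1)]
representativeEdges es@(first : _) =
  [first, es !! (length es `div` 2), last es, (-1, 1)]
```

For instance, `representativeEdges (pathEdges 5)` gives `[(0,1),(2,3),(3,4),(-1,1)]`. Position in the edge list is only a proxy for "deep-buried"; how deep an edge really sits depends on how each library builds the graph.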

snowleopard commented 6 years ago

> I am now trying to allow comparison across the benchmarks (and to produce readable output from all these benchmarks) (soon available through cabal bench compare).

@nobrakal Please run this on CI and give a link to the CI result from README -- then it would be always easy to see the latest comparison results.

> I want to have a list of "representative edges" for each graph (existing, at the start of the graph, deep-buried in it, or non-existing).

Are you talking about what I called 'edge access pattern' and 'non-existing edge access pattern' in one of the early comments?

nobrakal commented 6 years ago

> Please run this on CI and give a link to the CI result from README -- then it would be always easy to see the latest comparison results.

Done. For now the output is not really nice, but it allows quick comparison, and the data are there, hidden somewhere in /tmp

> Are you talking about what I called 'edge access pattern' and 'non-existing edge access pattern' in one of the early comments?

Yes exactly :) The problem is choosing which edges to take for a pattern, because for big graphs, these patterns can represent a lot of edges!

snowleopard commented 6 years ago

@nobrakal Awesome! That's already pretty interesting/informative -- lots of optimisations to do for Alga ;)

> The problem is to choose which edge to take for a pattern

Sure, at some point we'll need to think about this more carefully, but I think at first the priority is to make sure we have a robust and convenient benchmarking infrastructure. You're on the right track 👍

nobrakal commented 6 years ago

The problem is in fact the infrastructure ^^.

I have reworked it: it now exposes a function:

benchOver :: (GraphImpl g, NFData g) => [(GenericGraph,[Int])] -> ToFuncToBench g -> Benchmark

It produces a Benchmark of a function tested over the GenericGraphs passed in.

The idea is to define standard graphs (like Complete, Path and Circuit) to be passed, and to map benchOver over a [ToFuncToBench a]

For alga, this results in something like:

isEmpty' :: ToFuncToBench (Graph Int)
isEmpty' = createConsumer "isEmpty" isEmpty

hasEdge' :: ToFuncToBench (Graph Int)
hasEdge' = ToFuncToBench "hasEdge (not in graph)" $ FuncWithArg (uncurry hasEdge) show . take 2 . edgesNotInGraph

allBenchs :: [Benchmark]
allBenchs = map (benchOver graphs)  [hasEdge', isEmpty']
  where
  graphs = [
    (path, take 5 tenPowers),
    (circuit, take 5 tenPowers),
    (complete, take 3 tenPowers)
    ]

Concerning the 'non-existing edge access pattern', for now, I use the list difference between the edges of the complete graph and those of the actual one.
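
The list-difference idea can be sketched as follows. `edgesNotInGraphSketch` is a hypothetical name and may differ from the real `edgesNotInGraph` in bench-graph; the sketch assumes vertices 0 .. n-1 and a complete graph that is directed and has no self-loops.

```haskell
import Data.List ((\\))

-- All ordered pairs of distinct vertices in 0 .. n-1, i.e. the edges of
-- the complete graph under the assumptions stated above.
completeEdges :: Int -> [(Int, Int)]
completeEdges n = [(i, j) | i <- [0 .. n - 1], j <- [0 .. n - 1], i /= j]

-- Candidate non-edges: the complete graph's edges minus the graph's own.
-- (\\) is a plain list difference, so this is quadratic in the list sizes
-- and only suitable for preparing benchmark inputs, not for measuring.
edgesNotInGraphSketch :: Int -> [(Int, Int)] -> [(Int, Int)]
edgesNotInGraphSketch n es = completeEdges n \\ es
```

With the path `[(0,1),(1,2)]` on three vertices, `take 2 (edgesNotInGraphSketch 3 [(0,1),(1,2)])` yields `[(0,2),(1,0)]`.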

The next steps are to support disconnected graphs, make a better use of criterion, and start an approach to benchmark graph creation.

snowleopard commented 6 years ago

@nobrakal I've sent a PR (https://github.com/nobrakal/bench-graph/pull/1) trying to simplify the infrastructure. Perhaps, we'll be able to simplify it even further.

nobrakal commented 6 years ago

@snowleopard @aloiscochard The https://github.com/haskell-perf/graphs repo is now live :)

I suggest moving this discussion over there, since it doesn't concern only alga. Note that I also added support for hash-graph!

snowleopard commented 6 years ago

@nobrakal Awesome! Yes, let's continue discussions there.

I've renamed this issue and will remove the current benchmarking code from Alga -- it will become much faster to build.

jmatsushita commented 3 years ago

May I suggest updating the original description to point to https://github.com/haskell-perf/graphs ? It was interesting to read through the whole issue but maybe pointing to the results up top makes sense?

For bonus points the issue could be renamed "Move benchmarks to haskell-perf/graphs" 😁

snowleopard commented 3 years ago

@jmatsushita Agreed! Done, and with a bonus point :)