haskell / containers

Assorted concrete container types
https://hackage.haskell.org/package/containers
316 stars 178 forks source link

possible performance benchmark for IntMap #209

Open jwaldmann opened 8 years ago

jwaldmann commented 8 years ago

Dear maintainers,

I have an application that could be made into a performance benchmark for IntMap. It is similar to what fgl (functional graph library) does, but for a specific use case.

Source: https://gitlab.imn.htwk-leipzig.de/waldmann/pure-matchbox , Explanation (of IntMap usage): https://gitlab.imn.htwk-leipzig.de/waldmann/pure-matchbox/blob/master/paper/rel.tex

To make it "more stand-alone", I should cut out all the parsing/(pretty)printing stuff which pulls in a lot of dependencies, and takes ages to compile. Perhaps this can be done via conditionals in the cabal file.

treeowl commented 8 years ago

More benchmarks could definitely be useful. The current benchmarks are all microbenchmarks. What makes your application a particularly good real-world test? I think I would prefer to keep large benchmarks in a separate git repo to keep the size from ballooning entirely out of control. Do you think that manageable? On Apr 26, 2016 7:07 AM, "jwaldmann" notifications@github.com wrote:

Dear maintainers,

I have an application that could be made into a performance benchmark for IntMap.

Source: https://gitlab.imn.htwk-leipzig.de/waldmann/pure-matchbox , Explanation (of IntMap usage): https://gitlab.imn.htwk-leipzig.de/waldmann/pure-matchbox/blob/master/paper/rel.tex

To make it "more stand-alone", I should cut out all the parsing/(pretty)printing stuff which pulls in a lot of dependencies, and takes ages to compile. Perhaps this can be done via conditionals in the cabal file.

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/haskell/containers/issues/209

jwaldmann commented 8 years ago

The benchmark code size is small. The benchmark compile time can be made small (if dependecies are stripped down). The benchmark execution time can be parameterized (from a few seconds to a few minutes). The benchmark is real-world because I use the code in a termination analyzer http://www.termination-portal.org/wiki/Tools:Matchbox (not exactly this version). Execution consists of IntMap updates and intersects.

wrengr commented 8 years ago

@treeowl if we do accept this benchmark (whether in this repo or in a separate one), it may help for comparing the current big-endian patricia trie based IntMap with an array mapped trie variant I'm considering adding in the fall. (AMTs are essentially the same as the HAMTs of unordered-containers except without the hashing, and thus preserving the ordering.) I'd definitely like to have some larger benchmarks to compare the two implementations, since the microbenchmarks are likely to artificially favor one over the other.

treeowl commented 8 years ago

I don't know enough about IntMap to have an opinion about major implementation changes. I'm all for more benchmarks--the more the better, up to a point. I just want to make sure any large ones get a separate repo, since git thingumjiggers never shrink.

On Wed, Apr 27, 2016 at 5:32 PM, wren romano notifications@github.com wrote:

@treeowl https://github.com/treeowl if we do accept this benchmark (whether in this repo or in a separate one), it may help for comparing the current big-endian patricia trie based IntMap with an array mapped trie variant I'm considering adding in the fall. (AMTs are essentially the same as the HAMTs of unordered-containers except without the hashing, and thus preserving the ordering.) I'd definitely like to have some larger benchmarks to compare the two implementations, since the microbenchmarks are likely to artificially favor one over the other.

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/haskell/containers/issues/209#issuecomment-215236195

jwaldmann commented 8 years ago

I removed extra dependencies, now this is a nice small benchmark: https://gitlab.imn.htwk-leipzig.de/waldmann/containers-benchmark