haskell-unordered-containers / unordered-containers

Efficient hashing-based container types
BSD 3-Clause "New" or "Revised" License
221 stars 99 forks source link

Better benchmarks for `intersection` (and better `intersectionArraysBy`) #416

Open sjakobi opened 2 years ago

sjakobi commented 2 years ago

Here are the intersection benchmarks as of #406:

https://github.com/haskell-unordered-containers/unordered-containers/blob/d24cc1f48cef1c7f6bc041e7e73ee40a1fc7b731/benchmarks/Benchmarks.hs#L322-L325

https://github.com/haskell-unordered-containers/unordered-containers/blob/d24cc1f48cef1c7f6bc041e7e73ee40a1fc7b731/benchmarks/Benchmarks.hs#L84-L88

https://github.com/haskell-unordered-containers/unordered-containers/blob/d24cc1f48cef1c7f6bc041e7e73ee40a1fc7b731/benchmarks/Benchmarks.hs#L103-L107

https://github.com/haskell-unordered-containers/unordered-containers/blob/d24cc1f48cef1c7f6bc041e7e73ee40a1fc7b731/benchmarks/Benchmarks.hs#L120-L123

I think the maps in these benchmarks have an unusually large overlap – there are relatively few elements that do not lie in the intersection. It would be interesting to have benchmarks where the intersection is relatively smaller.

With such benchmarks it may also turn out that it is faster to iterate only over the intersections of the array nodes in intersectionArrayBy, as implemented in https://github.com/oberblastmeister/unordered-containers/commit/1ee67eacdeae21b16b95e9ee198ba3a9fef45250.