pgujjula / apply-merge

Lift a binary, non-decreasing function onto ordered lists and order the output
8 stars 1 forks source link

Investigate performance of `applyMerge min [1..] [1..]` #16

Open pgujjula opened 1 month ago

pgujjula commented 1 month ago

applyMerge min [1..] [1..] has the following shape:

. . . . . . . . . .
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *
. * * * * * * * * *

which I thought should result in poor performance with MergeAll.applyMerge, but benchmarks show that the performance is pretty good:

    double linear shape: applyMerge min [1..] [1..]                                                                                                                                                                                                                                  16:49:18 [66/3316]
      10
        DoublyLinkedList:   OK (0.17s)
          1.16 μs ±  89 ns
        IntMap:             OK (0.14s)
          962  ns ±  82 ns
        IntSet:             OK (0.13s)
          939  ns ±  90 ns
        MergeAll:           OK (0.13s)
          480  ns ±  48 ns
        MergeAll (flipped): OK (0.26s)
          467  ns ±  25 ns
      100
        DoublyLinkedList:   OK (0.19s)
          11.1 μs ± 685 ns
        IntMap:             OK (0.17s)
          9.74 μs ± 687 ns
        IntSet:             OK (0.17s)
          9.68 μs ± 776 ns
        MergeAll:           OK (0.26s)
          3.76 μs ± 165 ns
        MergeAll (flipped): OK (0.13s)
          3.70 μs ± 330 ns
      1000
        DoublyLinkedList:   OK (0.24s)
          113  μs ± 5.6 μs
        IntMap:             OK (0.21s)
          99.3 μs ± 5.9 μs
        IntSet:             OK (0.21s)
          100  μs ± 6.9 μs
        MergeAll:           OK (0.31s)
          37.1 μs ± 3.5 μs
        MergeAll (flipped): OK (0.32s)
          37.3 μs ± 2.9 μs
      10000
        DoublyLinkedList:   OK (0.29s)
          2.09 ms ± 103 μs
        IntMap:             OK (0.21s)
          1.52 ms ± 139 μs
        IntSet:             OK (1.48s)
          1.41 ms ±  35 μs
        MergeAll:           OK (0.24s)
          422  μs ±  30 μs
        MergeAll (flipped): OK (0.23s)
          413  μs ±  21 μs
      100000
        DoublyLinkedList:   OK (0.40s)
          23.2 ms ± 1.8 ms
        IntMap:             OK (0.25s)
          15.0 ms ± 852 μs
        IntSet:             OK (0.25s)
          14.8 ms ± 1.5 ms
        MergeAll:           OK (0.27s)
          7.97 ms ± 681 μs
        MergeAll (flipped): OK (0.26s)
          7.67 ms ± 576 μs
      1000000
        DoublyLinkedList:   OK (0.85s)
          258  ms ± 9.1 ms
        IntMap:             OK (1.22s)
          156  ms ± 9.4 ms
        IntSet:             OK (1.17s)
          155  ms ± 6.4 ms
        MergeAll:           OK (0.68s)
          87.7 ms ± 5.1 ms
        MergeAll (flipped): OK (0.69s)
          87.8 ms ± 5.2 ms

On the other hand, applyMerge max [1..] [1..] has the following shape:

. . . . . * * * * *
. . . . . * * * * *
. . . . . * * * * *
. . . . . * * * * *
. . . . . * * * * *
* * * * * * * * * *
* * * * * * * * * *
* * * * * * * * * *
* * * * * * * * * *
* * * * * * * * * *

Here the situation is reversed: I expected poor performance with MergeAll.applyMerge, but benchmarks show poor performance.

    box shape: applyMerge max [1..] [1..]
      10
        DoublyLinkedList:   OK (0.21s)
          1.40 μs ±  84 ns
        IntMap:             OK (0.17s)
          1.15 μs ±  91 ns
        IntSet:             OK (0.31s)
          1.08 μs ±  48 ns
        MergeAll:           OK (0.25s)
          883  ns ±  61 ns
        MergeAll (flipped): OK (0.14s)
          943  ns ±  83 ns
      100
        DoublyLinkedList:   OK (0.27s)
          15.2 μs ± 921 ns
        IntMap:             OK (0.24s)
          13.2 μs ± 1.0 μs
        IntSet:             OK (0.21s)
          11.3 μs ± 678 ns
        MergeAll:           OK (0.21s)
          11.2 μs ± 659 ns
        MergeAll (flipped): OK (0.21s)
          11.3 μs ± 769 ns
      1000
        DoublyLinkedList:   OK (0.16s)
          136  μs ±  11 μs
        IntMap:             OK (0.15s)
          131  μs ±  11 μs
        IntSet:             OK (0.25s)
          107  μs ± 5.4 μs
        MergeAll:           OK (0.18s)
          156  μs ±  12 μs
        MergeAll (flipped): OK (0.18s)
          159  μs ±  12 μs
      10000
        DoublyLinkedList:   OK (0.19s)
          1.36 ms ± 135 μs
        IntMap:             OK (0.18s)
          1.31 ms ±  97 μs
        IntSet:             OK (0.16s)
          1.20 ms ±  88 μs
        MergeAll:           OK (0.25s)
          1.93 ms ± 185 μs
        MergeAll (flipped): OK (0.26s)
          1.94 ms ± 185 μs
      100000
        DoublyLinkedList:   OK (0.21s)
          13.4 ms ± 1.0 ms
        IntMap:             OK (0.21s)
          12.9 ms ± 707 μs
        IntSet:             OK (0.20s)
          12.7 ms ± 959 μs
        MergeAll:           OK (0.21s)
          28.6 ms ± 2.0 ms
        MergeAll (flipped): OK (0.46s)
          28.7 ms ± 730 μs
      1000000
        DoublyLinkedList:   OK (0.42s)
          134  ms ± 3.8 ms
        IntMap:             OK (0.41s)
          128  ms ± 5.8 ms
        IntSet:             OK (0.42s)
          128  ms ± 2.7 ms
        MergeAll:           OK (1.05s)
          327  ms ± 8.3 ms
        MergeAll (flipped): OK (1.06s)
          327  ms ± 6.6 ms