haskell / containers

Assorted concrete container types
https://hackage.haskell.org/package/containers
314 stars 177 forks source link

Optimize IntSet.Bin #998

Closed meooow25 closed 1 month ago

meooow25 commented 3 months ago

Fixes #991.

Memory

Concretely, this reduces the memory required by an IntSet by ~12.5%.

Calculations: For a tree with n Tips, each Tips costs 3 words and there are n-1 Bins each costing 5 words before this change and 4 words after. So we save about 1 out of 8 words.

Compatibility

This PR makes breaking changes to IntSet internals which is reflected in the exports of the internal module Data.IntSet.Internal. Due to the introduction of IntTreeCommons, some exports of Data.IntMap.Internal are moved to IntTreeCommons. There is no change in the exports of any other module.

Benchmarks

Benchmarks done with GHC 9.6.3. Last updated: 080dde1

Benchmark command: cabal run <target> -- --csv <csv> +RTS -T

intset-benchmarks

Name                       Time - - - - - - - -    Allocated - - - - -     Copied - - - - - - -
                                A       B     %         A       B     %         A       B     %
delete                      66 μs   65 μs   -1%    729 KB  601 KB  -17%    127 B   104 B   -18%
deleteMax                   17 ns   16 ns   -5%    175 B   159 B    -9%      0 B     0 B
deleteMin                   57 ns   51 ns  -11%    764 B   647 B   -15%      0 B     0 B
difference                 721 ns  683 ns   -5%    4.0 KB  3.5 KB  -12%      1 B     2 B   +100%
disjoint:false              33 ns   34 ns   +3%     23 B    23 B    +0%      0 B     0 B
disjoint:true              443 ns  409 ns   -7%      0 B    12 B             0 B     0 B
filter                      21 μs   21 μs   +0%     68 KB   67 KB   -1%     43 B    39 B    -9%
findMax                    6.8 ns  7.1 ns   +4%     39 B    39 B    +0%      0 B     0 B
findMin                    9.1 ns  9.0 ns   +0%     39 B    39 B    +0%      0 B     0 B
fold                       154 ns  156 ns   +0%    2.7 KB  2.7 KB   +0%      1 B     1 B    +0%
fromAscList                 17 μs   17 μs   +0%    4.0 KB  2.8 KB  -29%      2 B     2 B    +0%
fromDistinctAscList         17 μs   17 μs   +0%    4.0 KB  2.8 KB  -29%      3 B     2 B   -33%
fromList                    52 μs   46 μs  -11%    576 KB  479 KB  -16%    392 B   297 B   -24%
fromRange                  347 ns  332 ns   -4%    4.0 KB  3.6 KB  -12%      2 B     1 B   -50%
fromRange:small            9.2 ns   11 ns  +19%    119 B   111 B    -6%      0 B     0 B
insert                      52 μs   45 μs  -12%    576 KB  479 KB  -16%    409 B   308 B   -24%
instanceOrd:dense          454 ms  454 ms   +0%    2.2 GB  2.2 GB   +0%    142 MB  141 MB   +0%
instanceOrd:sparse         826 ms  818 ms   +0%    2.4 GB  2.4 GB   +0%    241 MB  229 MB   -4%
intersection               710 ns  702 ns   -1%    4.0 KB  3.5 KB  -13%      2 B     1 B   -50%
map                         85 μs   81 μs   -5%    1.0 MB  960 KB   -9%    1.1 KB  912 B   -16%
member                      50 μs   48 μs   -2%      0 B     0 B             0 B     0 B
null.intersection:false    706 ns  699 ns   +0%    4.0 KB  3.5 KB  -13%      2 B     1 B   -50%
null.intersection:true     425 ns  420 ns   -1%      0 B    12 B             0 B     0 B
partition                   24 μs   22 μs   -4%     72 KB   71 KB   -1%     79 B    69 B   -12%
spanAntitone:dense         118 ns  112 ns   -4%    581 B   527 B    -9%      0 B     0 B
spanAntitone:sparse        137 ns  120 ns  -12%    772 B   677 B   -12%      0 B     0 B
split:dense                 36 ns   32 ns  -10%    375 B   319 B   -14%      0 B     0 B
split:sparse                49 ns   48 ns   -1%    486 B   399 B   -17%      0 B     0 B
splitMember:dense           38 ns   35 ns   -7%    383 B   326 B   -14%      0 B     0 B
splitMember:sparse          56 ns   44 ns  -20%    494 B   406 B   -17%      0 B     0 B
union                      691 ns  662 ns   -4%    4.0 KB  3.5 KB  -12%      2 B     1 B   -50%
unions                     694 ns  677 ns   -2%    4.1 KB  3.6 KB  -12%      2 B     1 B   -50%

set-operations-intset

Name                           Time - - - - - - - -    Allocated - - - - -     Copied - - - - - - -
                                    A       B     %         A       B     %         A       B     %
difference-block_nn             12 μs   11 μs   -1%     65 KB   54 KB  -16%    530 B   388 B   -26%
difference-block_nn_swap        12 μs   12 μs   +0%     65 KB   54 KB  -16%    552 B   394 B   -28%
difference-block_ns            1.6 μs  1.5 μs   -3%     11 KB  8.8 KB  -17%     17 B    11 B   -35%
difference-block_sn_swap       1.3 μs  1.3 μs   +1%    6.5 KB  5.5 KB  -15%      7 B     5 B   -28%
difference-common_nn            18 μs   17 μs   -3%     98 KB   85 KB  -12%    1.2 KB  925 B   -23%
difference-common_nn_swap       11 μs  9.7 μs  -15%      0 B     0 B             0 B     0 B
difference-common_ns            18 μs   18 μs   -3%     97 KB   85 KB  -12%    1.2 KB  932 B   -23%
difference-common_nt           6.4 μs  6.0 μs   -6%     47 KB   39 KB  -17%    291 B   202 B   -30%
difference-common_sn_swap       11 μs  9.7 μs  -15%      0 B     0 B             0 B     0 B
difference-common_tn_swap      3.2 μs  2.8 μs  -11%      0 B     0 B             0 B     0 B
difference-disj_nn              47 ns   46 ns   -2%    255 B   214 B   -16%      0 B     0 B
difference-disj_nn_swap         51 ns   49 ns   -3%    334 B   278 B   -16%      0 B     0 B
difference-disj_ns              43 ns   41 ns   -4%    255 B   214 B   -16%      0 B     0 B
difference-disj_nt              38 ns   35 ns   -8%    255 B   214 B   -16%      0 B     0 B
difference-disj_sn_swap         44 ns   43 ns   -2%    255 B   214 B   -16%      0 B     0 B
difference-disj_tn_swap         30 ns   34 ns  +14%    135 B   119 B   -11%      0 B     0 B
difference-mix_nn               38 μs   36 μs   -5%    193 KB  169 KB  -12%    4.9 KB  3.7 KB  -23%
difference-mix_nn_swap          38 μs   36 μs   -4%    193 KB  169 KB  -12%    4.8 KB  3.7 KB  -22%
difference-mix_ns               20 μs   19 μs   -4%    107 KB   94 KB  -12%    1.4 KB  1.1 KB  -22%
difference-mix_nt              6.5 μs  6.1 μs   -6%     47 KB   39 KB  -17%    290 B   204 B   -29%
difference-mix_sn_swap          20 μs   19 μs   -5%    107 KB   94 KB  -12%    1.4 KB  1.1 KB  -23%
difference-mix_tn_swap         4.4 μs  4.1 μs   -7%     20 KB   17 KB  -12%     54 B    39 B   -27%
intersection-block_nn          7.2 μs  6.9 μs   -5%      0 B     0 B             0 B     0 B
intersection-block_nn_swap     7.0 μs  6.9 μs   -1%      0 B     0 B             0 B     0 B
intersection-block_ns          891 ns  879 ns   -1%     25 B    25 B    +0%      0 B     0 B
intersection-block_sn_swap     962 ns  864 ns  -10%      0 B    25 B             0 B     0 B
intersection-common_nn          18 μs   17 μs   -2%     97 KB   85 KB  -12%    1.2 KB  904 B   -24%
intersection-common_nn_swap     18 μs   18 μs   +0%     97 KB   85 KB  -11%    1.2 KB  931 B   -21%
intersection-common_ns          18 μs   18 μs   +0%     97 KB   85 KB  -12%    1.1 KB  920 B   -21%
intersection-common_nt         4.6 μs  4.4 μs   -2%     19 KB   17 KB  -11%     50 B    38 B   -24%
intersection-common_sn_swap     18 μs   17 μs   -1%     97 KB   85 KB  -12%    1.2 KB  898 B   -24%
intersection-common_tn_swap    4.3 μs  4.3 μs   +0%     19 KB   17 KB  -11%     50 B    40 B   -19%
intersection-disj_nn            33 ns   30 ns   -9%     31 B    31 B    +0%      0 B     0 B
intersection-disj_nn_swap       32 ns   32 ns   +0%     31 B    31 B    +0%      0 B     0 B
intersection-disj_ns            30 ns   26 ns  -13%     31 B    31 B    +0%      0 B     0 B
intersection-disj_nt            26 ns   23 ns   -8%     31 B    31 B    +0%      0 B     0 B
intersection-disj_sn_swap       29 ns   28 ns   -3%     31 B    31 B    +0%      0 B     0 B
intersection-disj_tn_swap       25 ns   24 ns   -5%     31 B    31 B    +0%      0 B     0 B
intersection-mix_nn             22 μs   21 μs   -2%      0 B     0 B             0 B     0 B
intersection-mix_nn_swap        22 μs   21 μs   -1%      0 B     0 B             0 B     0 B
intersection-mix_ns             12 μs   12 μs   +0%      0 B     0 B             0 B     0 B
intersection-mix_nt            3.2 μs  3.3 μs   +1%      0 B     0 B             0 B     0 B
intersection-mix_sn_swap        12 μs   12 μs   +0%      0 B     0 B             0 B     0 B
intersection-mix_tn_swap       3.1 μs  3.0 μs   -3%      0 B     0 B             0 B     0 B
union-block_nn                  14 μs   13 μs   -4%     93 KB   77 KB  -16%    1.1 KB  776 B   -30%
union-block_nn_swap             14 μs   13 μs   -6%     93 KB   77 KB  -16%    1.1 KB  777 B   -29%
union-block_ns                 1.8 μs  1.7 μs   -7%     13 KB   11 KB  -17%     27 B    18 B   -33%
union-block_sn_swap            1.9 μs  1.7 μs   -9%     13 KB   11 KB  -17%     27 B    18 B   -33%
union-common_nn                 17 μs   17 μs   +1%     97 KB   85 KB  -12%    1.2 KB  965 B   -20%
union-common_nn_swap            17 μs   17 μs   +0%     97 KB   85 KB  -12%    1.2 KB  934 B   -24%
union-common_ns                 17 μs   18 μs   +1%     97 KB   85 KB  -11%    1.2 KB  969 B   -20%
union-common_nt                6.4 μs  5.7 μs   -9%     47 KB   39 KB  -16%    309 B   203 B   -34%
union-common_sn_swap            17 μs   17 μs   +0%     97 KB   85 KB  -12%    1.2 KB  942 B   -22%
union-common_tn_swap           5.9 μs  5.3 μs   -9%     47 KB   39 KB  -16%    301 B   206 B   -31%
union-disj_nn                   62 ns   47 ns  -24%    534 B   438 B   -17%      0 B     0 B
union-disj_nn_swap              61 ns   54 ns  -10%    534 B   438 B   -17%      0 B     0 B
union-disj_ns                   56 ns   41 ns  -26%    454 B   375 B   -17%      0 B     0 B
union-disj_nt                   44 ns   32 ns  -28%    334 B   279 B   -16%      0 B     0 B
union-disj_sn_swap              56 ns   48 ns  -14%    454 B   375 B   -17%      0 B     0 B
union-disj_tn_swap              47 ns   38 ns  -20%    334 B   279 B   -16%      0 B     0 B
union-mix_nn                    36 μs   37 μs   +1%    193 KB  171 KB  -11%    4.7 KB  3.7 KB  -21%
union-mix_nn_swap               36 μs   37 μs   +3%    193 KB  171 KB  -11%    4.7 KB  3.7 KB  -21%
union-mix_ns                    19 μs   20 μs   +2%    107 KB   93 KB  -12%    1.4 KB  1.1 KB  -21%
union-mix_nt                   6.4 μs  5.8 μs   -9%     47 KB   39 KB  -16%    290 B   205 B   -29%
union-mix_sn_swap               19 μs   19 μs   +1%    107 KB   94 KB  -12%    1.4 KB  1.1 KB  -23%
union-mix_tn_swap              5.9 μs  5.5 μs   -6%     47 KB   39 KB  -16%    289 B   204 B   -29%
treeowl commented 3 months ago

This isn't related to your changes, but it looks like the IntSet validity test doesn't ensure that each Tip contains at least one element.

meooow25 commented 3 months ago

@treeowl, anything I can do here?

treeowl commented 3 months ago

I'm dealing with a family medical emergency this week. I hope to give this a final review shortly.

meooow25 commented 1 month ago

@treeowl I hope you are able to find time for this. I would really like to finish this work.

treeowl commented 1 month ago

Yes, I'm so sorry about the delay. Problems continue on my end, but that's no reason to hold you up. Merged! Will you try the IntSet now?

meooow25 commented 1 month ago

Thank you!

Will you try the IntSet now?

This was the IntSet PR. Are you thinking of something else?

treeowl commented 1 month ago

No, I'm just incredibly tired from what's been going on at home.

On Tue, Jun 4, 2024, 9:43 AM Soumik Sarkar @.***> wrote:

Thank you!

Will you try the IntSet now?

This was the IntSet PR. Are you thinking of something else?

— Reply to this email directly, view it on GitHub https://github.com/haskell/containers/pull/998#issuecomment-2147573834, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAOOF7IFP3VXBJOJRKYWDLLZFXAAHAVCNFSM6AAAAABF2HCEOWVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNBXGU3TGOBTGQ . You are receiving this because you modified the open/close state.Message ID: @.***>

meooow25 commented 1 month ago

I hope things get better 🙂