by making the combine function they use a good consumer and producer in terms of fusion. This gets rid of the intermediate list always created today, and also fuses with the input list if possible.
Set
fromAscList: OK (0.18s)
21.3 μs ± 1.4 μs, 128 KB allocated, 1.4 KB copied, 8.0 MB peak memory, 57% less than baseline
fromAscList:fusion: OK (0.89s)
13.3 μs ± 221 ns, 144 KB allocated, 1.9 KB copied, 8.0 MB peak memory, 85% less than baseline
fromDescList: OK (0.18s)
20.9 μs ± 1.6 μs, 128 KB allocated, 1.5 KB copied, 8.0 MB peak memory, 59% less than baseline
fromDescList:fusion: OK (0.23s)
13.7 μs ± 737 ns, 144 KB allocated, 2.0 KB copied, 8.0 MB peak memory, 84% less than baseline
Map
fromAscList: OK (0.22s)
25.5 μs ± 2.0 μs, 152 KB allocated, 2.1 KB copied, 10 MB peak memory, 39% less than baseline
fromAscList:fusion: OK (0.30s)
18.0 μs ± 677 ns, 232 KB allocated, 4.3 KB copied, 10 MB peak memory, 71% less than baseline
fromDescList: OK (0.21s)
25.1 μs ± 1.4 μs, 152 KB allocated, 2.1 KB copied, 10 MB peak memory, 38% less than baseline
fromDescList:fusion: OK (0.20s)
50.2 μs ± 3.1 μs, 577 KB allocated, 11 KB copied, 10 MB peak memory, 20% less than baseline
Some observations:
There is no allocations penalty (unlike #950), likely because we rid of the intermediate list and avoid a lot of allocations there.
Map's fromDescList:fusion improves but not as much as fromAscList:fusion. The cause seems be that fusion with [n,n-1..1] does not work out as well as [1..n], but I don't know why and didn't dig into it much.
Following up on https://github.com/haskell/containers/issues/949#issuecomment-1574528332
This PR is based off #962, so marking as draft.
This PR improves
Set.fromAscList
,Set.fromDescList
Map.fromAscList
,Map.fromDescList
,Map.fromAscListWith
,Map.fromDescListWith
,Map.fromAscListWithKey
,Map.fromDescListWithKey
by making the combine function they use a good consumer and producer in terms of fusion. This gets rid of the intermediate list always created today, and also fuses with the input list if possible.
Benchmarks on GHC 9.2.5:
Before
``` Set fromAscList: OK (0.20s) 50.1 μs ± 2.9 μs, 240 KB allocated, 2.4 KB copied, 8.0 MB peak memory fromAscList:fusion: OK (0.19s) 89.5 μs ± 5.7 μs, 528 KB allocated, 6.8 KB copied, 8.0 MB peak memory fromDescList: OK (0.22s) 51.2 μs ± 2.9 μs, 240 KB allocated, 2.4 KB copied, 8.0 MB peak memory fromDescList:fusion: OK (0.18s) 88.9 μs ± 5.7 μs, 559 KB allocated, 7.3 KB copied, 8.0 MB peak memory Map fromAscList: OK (0.18s) 41.9 μs ± 3.6 μs, 343 KB allocated, 5.2 KB copied, 10 MB peak memory fromAscList:fusion: OK (0.13s) 63.0 μs ± 6.0 μs, 792 KB allocated, 15 KB copied, 10 MB peak memory fromDescList: OK (0.17s) 40.5 μs ± 3.9 μs, 343 KB allocated, 5.5 KB copied, 10 MB peak memory fromDescList:fusion: OK (0.25s) 63.1 μs ± 5.3 μs, 824 KB allocated, 16 KB copied, 10 MB peak memory ```After
Some observations:
fromDescList:fusion
improves but not as much asfromAscList:fusion
. The cause seems be that fusion with[n,n-1..1]
does not work out as well as[1..n]
, but I don't know why and didn't dig into it much.