Open meooow25 opened 6 months ago
Think this could technically be represented with existing constructors
If the lowest bit in the Mask
is set, that means there's two elements in the full sub-tree sharing all but the last bit. If a higher bit in the Mask
is set, the lower ones are ignored so maybe one could have
isFullTree = mask .&. 1 == 1
fullTree prefix mask = Bin prefix (mask .|. 1) Tip Tip
This saves a constructor entry at the cost of 2 Tip
in the full sub-tree case (which is still better than an actually full tree) but might require more bit fiddling to account for the possibly-set lower bit in the existing operations.
Ah sure, it could be done by assigning the case to some currently invalid configuration. But what is the benefit of keeping the number of constructors at 3?
We can even reduce it to 2 today if we want, by replacing Nil
with, say, Tip 0 0
. Again, I don't know if or why that would be good.
Replacing Nil
with Tip 0 0
is an interesting idea. I wonder how that would perform.
But what is the benefit of keeping the number of constructors at 3?
My thinking was it might decrease the amount of branch prediction needed in some cases, (assuming the bit fiddling doesn't eclipse benefits), but it also depends on what constructor matching gets compiled to and I don't know how ghc behaves there.
I found an implementation of this: https://hackage.haskell.org/package/intset
Haven't looked into how it compares performance-wise yet.
I wonder if it will be useful to add another constructor to
IntSet
:Full Prefix
. This indicates that this is a full tree with a shared prefix.I expect this to show good improvements for use cases where contiguous runs of
Int
s are likely. It will also reduce space usage in such cases. For instance, to store[0..n]
we will need only $O(\log n)$ memory instead of $O(n)$.But this would add small overheads in various places, so it will likely not be a clear win in all use cases.
Also, to be clear, this would be small added costs and not a change in the worst case bounds.
For sure we would want to test something like this out on some real world use cases (like GHC, maybe others?) to see how it affects things. Just documenting this idea for now.