Closed meooow25 closed 4 months ago
it will involve changing every function,
does it? transparent conversion between old and proposed new representation seems to me like the perfect use case for a pattern synonym - that GHC should be able to inline away.
I am skeptical of the overall effect on containers (certainly the original author(s) must have discussed this idea) but at least it could serve as a GHC test case.
does it? transparent conversion between old and proposed new representation seems to me like the perfect use case for a pattern synonym - that GHC should be able to inline away.
Thanks, that should help incrementally test and benchmark it.
at least it could serve as a GHC test case.
Do you mean checking how it affects GHC tests?
Documenting some ideas: An alternative to the code above is making nomatch
comparison based
i2w :: Int -> Word
i2w = fromIntegral
nomatch k pm = i2w k < i2w p || i2w (pm+m) < i2w k
where
p = pm .&. (pm-1)
m = pm - p
In fact it can even be done today
nomatch k p m = i2w k < i2w p || i2w (p+m+m) < i2w k
I wonder if this is better than the current version, which, expanded, looks like
nomatch k p m = (k .&. (-m `xor` m)) /= p
Definitely worth trying, if there really are enough bits! Give a pattern synonym a go; that should at least validate the semantics in a hurry. My experience with pattern synonyms is a little bit old. When I was messing around with them a bunch, I kept running into performance problems where they wouldn't inline well (IIRC, join point issues again). But now there are INLINE
pragmas available for them, which might help.
... performance problems with pattern synonyms
that's what I meant: the proposed transformation may serve as a test case for GHC's inliner.
Cool, I'll test this out, maybe some time in the next couple of weeks.
I'm hoping there is zero or positive overall effect.
I assume this will make the merges (and everything similar to them) worse. As mentioned nomatch
becomes slower, and shorter
has to retrieve each mask through n .&. (-n)
.
I wonder what the slowdown would be and whether bothering with it is even worth the meager 8 byte savings on each branch.
I have tested out the changes on IntMap
and I'm seeing modest improvements (5-15%) in lots of functions.
I'll prepare a PR. But for that, I would like to know if I should keep Bin
as a pattern synonym for backwards compatibility (and rename the constructor to something else). I'm inclined to go with "no", because
Data.IntMap.Internal
on Hackage. Seven, according to Serokell search: link. Not even all of them are using Bin
.Bin
, it is likely to get better performance for some operation. But a pattern synonym Bin
that has to do some work would perform worse, and be counter to the author's intentions.What do you think?
I think keeping it as a pattern synonym would probably be a good thing, but we can decide later.
This isn't particularly novel, so maybe it has been proposed before. Let me know if it has. I only found #340 in my search, which is a bigger idea.
Current definition
Today,
Bin
forIntMap
andIntSet
is represented as https://github.com/haskell/containers/blob/3c13e0b03bde9aab4ce1ea7b23be1c713b89d32b/containers/src/Data/IntMap/Internal.hs#L355-L362Potential new definition
The prefix and the mask can be merged so that we save one word per
Bin
.The mask bit is always zero in the prefix, so this isn't throwing away any information. The lowest set bit of the new int is the current mask and the rest of it is the current prefix.
Current branching on
Bin
Branching on
Bin
is currently done like this: https://github.com/haskell/containers/blob/3c13e0b03bde9aab4ce1ea7b23be1c713b89d32b/containers/src/Data/IntMap/Internal.hs#L813-L817 https://github.com/haskell/containers/blob/3c13e0b03bde9aab4ce1ea7b23be1c713b89d32b/containers/src/Data/IntMap/Internal.hs#L3527-L3528 https://github.com/haskell/containers/blob/3c13e0b03bde9aab4ce1ea7b23be1c713b89d32b/containers/src/Data/IntMap/Internal.hs#L3519-L3520New branching on
Bin
Performance impact
I don't know yet, I need to make the change and benchmark. And it will involve changing every function, so it will be take a while. Memory is certainly saved.
nomatch
gets a little more expensive, butleft
is cheaper thanzero
, so I'm hoping there is zero or positive overall effect.What do you think? Is it worth checking out how this will fare? And is there any bad consequence of this representation, that I didn't think of?