haskell-unordered-containers / unordered-containers

Efficient hashing-based container types
BSD 3-Clause "New" or "Revised" License
221 stars 99 forks source link

Unnecessary re-hashing of keys in `instance Hashable (HashMap k v)` #402

Open sjakobi opened 2 years ago

sjakobi commented 2 years ago

https://github.com/haskell-unordered-containers/unordered-containers/blob/b6bde46eae679111c3f088964f192290cc08b6c8/Data/HashMap/Internal.hs#L508-L530

For Leaf nodes, we skip hashing the keys. For Collision nodes we hash the keys. I think we should skip hashing the latter too.

sjakobi commented 2 years ago

It might be useful to hash the length of the Collision arrays though. For e.g. ByteString, these are hashed too.

treeowl commented 2 years ago

Good point. Another option is to get a fold that includes the hash and use that. Less efficient for collisions, but we assume those are rare.

sjakobi commented 2 years ago

Another option is to get a fold that includes the hash and use that. Less efficient for collisions, but we assume those are rare.

I don't understand what kind of fold would work here in combination with the necessary sorting of hashes?