rust-lang / hashbrown

Rust port of Google's SwissTable hash map
https://rust-lang.github.io/hashbrown
Apache License 2.0
2.38k stars 275 forks source link

Add function to support pre-hashed keys #491

Closed jojo0severo closed 9 months ago

jojo0severo commented 9 months ago

Motivation

I'm working on a monad type with expensive method resolution (hash being one of them). To speed up the hashmap insertion I pre-hash my inner value when it is known, build my monad and insert into the HashMap using this new method with the pre-generated hash. It gave me a speed up of about 15%.

Implementation

The implementation is straightforward, using the same code path as insert but with a given hash. The use of mem::swap is necessary to ensure that the old value (if any) is dropped and proved to be faster on my benchmarks than ptr::write.

Amanieu commented 9 months ago

Have you considered using the HashTable type instead? It is specifically designed for this kind of use case.

There is also the raw entry API which provides similar functionality, but that is deprecated in favor of HashTable.

jojo0severo commented 9 months ago

Have you considered using the HashTable type instead? It is specifically designed for this kind of use case.

There is also the raw entry API which provides similar functionality, but that is deprecated in favor of HashTable.

I found the HashTable too low-level for what I'm doing, this Pull Request is just a suggestion, if considered not actually usefull I will just use my forked version keeping up-to-date with the library.

Edit 1.: the only method that I needed to have the pre-generated hash was the insert since I know its type beforehand.

Amanieu commented 9 months ago

This method is unsuitable because we don't want to expose custom hash values in HashMap/HashSet. This could cause unsoundness because it would allow constructing a HashMap<usize, usize> which has inconsistent behavior. This is relevant because no types with custom Hash impl are used: you should be able to trust that maps of standard types behave predictably.

This is why all functionality related to custom hashing was moved to HashTable. Even though it has a low-level API, you can easily build a wrapper on top of it for your use case.

jojo0severo commented 9 months ago

This method is unsuitable because we don't want to expose custom hash values in HashMap/HashSet. This could cause unsoundness because it would allow constructing a HashMap<usize, usize> which has inconsistent behavior. This is relevant because no types with custom Hash impl are used: you should be able to trust that maps of standard types behave predictably.

This is why all functionality related to custom hashing was moved to HashTable. Even though it has a low-level API, you can easily build a wrapper on top of it for your use case.

No problem, thanks for spending time looking into the PR 😄