rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/

Add PolymurHash #267

Closed orlp closed 1 year ago

orlp commented 1 year ago

Hi, I just released PolymurHash (https://github.com/orlp/polymur-hash) and I'd love it if it was included in your benchmark.

Inclusion should be really easy: it's a header-only C library without any dependencies or configuration. The only caveat is that it should be initialized with a seed separately in Hash_init, with the tweak parameter used for what SMHasher calls the seed in the actual hash function.

I can make a pull request if you prefer.
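The integration pattern described above (seed the parameters once in an init hook, then pass SMHasher's per-call seed as the tweak) could be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyParams`, `toy_init_params`, and `toy_hash` are hypothetical stand-ins, not PolymurHash's real functions or algorithm, and the `(key, len, seed, out)` entry-point shape is assumed to match what SMHasher expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the library's parameter struct. */
typedef struct { uint64_t k; uint64_t s; } ToyParams;

/* Stand-in for one-time seeding. NOT PolymurHash's real derivation. */
void toy_init_params(ToyParams *p, uint64_t seed) {
    p->k = (seed * UINT64_C(0x9e3779b97f4a7c15)) | 1; /* force an odd multiplier */
    p->s = seed ^ UINT64_C(0xd6e8feb86659fd93);
}

/* Stand-in keyed hash: the tweak perturbs the start state on every call. */
uint64_t toy_hash(const uint8_t *buf, size_t len,
                  const ToyParams *p, uint64_t tweak) {
    uint64_t h = p->s + tweak;
    for (size_t i = 0; i < len; i++) {
        h = (h ^ buf[i]) * p->k;
    }
    return h ^ (h >> 32);
}

/* Parameters are derived once and shared by all later calls. */
static ToyParams g_params;

/* Plays the role of the one-time init hook (Hash_init in the thread). */
void toy_hash_init(void) {
    toy_init_params(&g_params, UINT64_C(0xfedbca9876543210));
}

/* Assumed SMHasher-style entry point: the harness seed becomes the tweak. */
void toy_hash_test(const void *key, int len, uint32_t seed, void *out) {
    uint64_t h = toy_hash((const uint8_t *)key, (size_t)len,
                          &g_params, (uint64_t)seed);
    memcpy(out, &h, sizeof h);
}

/* Self-check: same seed gives the same hash, a different seed changes it. */
void check_toy_wrapper(void) {
    uint64_t a, b, c;
    toy_hash_init();
    toy_hash_test("abc", 3, 1, &a);
    toy_hash_test("abc", 3, 1, &b);
    toy_hash_test("abc", 3, 2, &c);
    assert(a == b);
    assert(a != c);
}
```

The point of the split is that the comparatively expensive parameter setup runs once, while the per-call seed only costs an extra argument.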

rurban commented 1 year ago

Please check the polymur branch

orlp commented 1 year ago

That was quick! Looks good, just one nitpick on the code: instead of 0xfedbca9876543210 | seed, I'd suggest 0xfedbca9876543210 ^ seed in polymur_seed_init, so that no bits of the seed get lost. With |, any seed bit that is already set in the constant has no effect.
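A minimal sketch of why the operator matters, using the constant quoted above. The helper names `seed_or` and `seed_xor` are made up for this illustration and are not part of SMHasher or PolymurHash.

```c
#include <assert.h>
#include <stdint.h>

#define BASE UINT64_C(0xfedbca9876543210)

/* OR: seed bits that are already set in BASE cannot change the result. */
uint64_t seed_or(uint64_t seed) { return BASE | seed; }

/* XOR: every seed bit flips exactly one output bit, so the map is 1:1. */
uint64_t seed_xor(uint64_t seed) { return BASE ^ seed; }

/* Self-check: bit 4 is set in BASE (low byte 0x10), so seeds 0x00 and
 * 0x10 collide under OR but remain distinct under XOR. */
void check_seed_mixing(void) {
    assert(seed_or(0x00) == seed_or(0x10));
    assert(seed_xor(0x00) != seed_xor(0x10));
}
```

Because XOR with a fixed constant is a bijection on 64-bit values, distinct seeds always yield distinct mixed values, whereas OR collapses every pair of seeds that differ only in bits the constant already has set.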

Also note that, depending on the seed, there is a ~7% chance that PolymurHash fails the Sparse 16-bit key test. I believe this is a flaw in the test, as even a perfect random oracle does this; see https://github.com/rurban/smhasher/issues/114#issuecomment-1587631635 . So you may want to re-open that issue.

And perhaps PolymurHash could be added to your list of "fastest hash functions on x86_64 without quality problems". I think its speed qualifies it, since it is faster than some of the entries on that list, and Polymur gives much better guarantees than all of them.

rurban commented 1 year ago

added