Nicoshev / rapidhash

Very fast, high quality, platform-independent hashing algorithm.

More documentation? #1

Closed trapexit closed 4 months ago

trapexit commented 4 months ago

It would be nice to see explicit comparisons to wyhash and what was done to offer "improved speed, quality and compatibility."

Nicoshev commented 4 months ago

Hey @trapexit, thanks a lot for the inquiry.

As a first quick response, I wanted to leave the landing README as clean as possible.

As a detailed response, let's go point by point:

Regarding speed: I thought that pointing to the SMHasher and SMHasher3 performance measurements would be better than uploading a custom benchmark, as they are neutral third parties. Their results show a throughput increase of around 6% over wyhash, similar to what I measured myself. The gain comes mostly from loop unrolling.
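To illustrate why unrolling can help, here is a minimal sketch in C. It is not rapidhash's actual loop: `toy_fold_rolled`, `toy_fold_unrolled`, and the constant `TOY_K` are hypothetical names, and the fold is a plain multiply-and-add rather than a real hash. The rolled version carries one dependency chain through `acc`; the unrolled version keeps four independent accumulators, so the CPU can overlap the multiplications.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary odd constant, chosen only for this illustration. */
#define TOY_K 0x9E3779B97F4A7C15ULL

/* Alignment-safe 8-byte read. */
static uint64_t read64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Rolled: every iteration depends on the previous value of acc. */
static uint64_t toy_fold_rolled(const unsigned char *p, size_t len)
{
    uint64_t acc = 0;
    while (len >= 8) {
        acc += read64(p) * TOY_K;
        p += 8;
        len -= 8;
    }
    return acc; /* trailing bytes (< 8) are ignored in this toy */
}

/* Unrolled: four independent lanes per 32-byte block. */
static uint64_t toy_fold_unrolled(const unsigned char *p, size_t len)
{
    uint64_t a0 = 0, a1 = 0, a2 = 0, a3 = 0;
    while (len >= 32) {
        a0 += read64(p)      * TOY_K;
        a1 += read64(p + 8)  * TOY_K;
        a2 += read64(p + 16) * TOY_K;
        a3 += read64(p + 24) * TOY_K;
        p += 32;
        len -= 32;
    }
    while (len >= 8) { /* leftover whole words */
        a0 += read64(p) * TOY_K;
        p += 8;
        len -= 8;
    }
    return a0 + a1 + a2 + a3;
}
```

Because addition modulo 2^64 is associative and commutative, both toy functions return the same value for the same input. In a real hash the lanes are mixed differently, so changing the unrolling generally changes the output as well.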

Regarding compatibility: Support for MSVC was added. More specifically, 64-bit multiplication on ARM64 is now done using the proper intrinsics.
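As a rough sketch of what compiler-specific 64-bit multiplication can look like, the C snippet below computes a full 64x64 to 128-bit product. The helper name `mul_64x64_128` is hypothetical and the code is not copied from rapidhash. It assumes `__uint128_t` on GCC/Clang, the MSVC intrinsics `_umul128` (x64) and `__umulh` (ARM64), and a portable fallback built from 32-bit halves.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Hypothetical helper: writes the low and high 64 bits of a * b. */
static void mul_64x64_128(uint64_t a, uint64_t b, uint64_t *lo, uint64_t *hi)
{
#if defined(__SIZEOF_INT128__)
    /* GCC/Clang on 64-bit targets: native 128-bit integer type. */
    __uint128_t r = (__uint128_t)a * b;
    *lo = (uint64_t)r;
    *hi = (uint64_t)(r >> 64);
#elif defined(_MSC_VER) && defined(_M_X64)
    /* MSVC x64: one intrinsic returns the low half and stores the high half. */
    *lo = _umul128(a, b, hi);
#elif defined(_MSC_VER) && defined(_M_ARM64)
    /* MSVC ARM64: plain multiply for the low half, __umulh for the high half. */
    *lo = a * b;
    *hi = __umulh(a, b);
#else
    /* Portable fallback: schoolbook multiplication on 32-bit halves. */
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
#endif
}
```

For example, multiplying `UINT64_MAX` by itself should yield a low half of `1` and a high half of `UINT64_MAX - 1`, since (2^64 - 1)^2 = 2^128 - 2^65 + 1.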

Regarding quality: rapidhash is listed as a 'passing hash' in SMHasher3 because it has no failing tests, whereas wyhash has 15. I'll run the collision-based study on wyhash and see how it compares under the same conditions. Some of the changes that improved quality were XORing the seed with the input length, decomposing the second loop into an explicit nested if that generates more entropy, and aligning memory reads differently for lengths up to 16 bytes.
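The idea of folding the length into the seed can be sketched with a toy example in C. This is not rapidhash's code: `toy_hash_plain`, `toy_hash_len`, and `mix64` are hypothetical names, and `mix64` is the well-known splitmix64 finalizer standing in for a real mixing step. The toy only looks at the first 8 bytes, zero-padded, which is exactly the situation where inputs that differ only in trailing zero bytes would otherwise collide.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* splitmix64 finalizer: a bijective 64-bit mixer, used here as a stand-in. */
static uint64_t mix64(uint64_t z)
{
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

/* Load up to 8 bytes, zero-padded, into one 64-bit word. */
static uint64_t load_padded(const unsigned char *p, size_t len)
{
    uint64_t w = 0;
    memcpy(&w, p, len < 8 ? len : 8);
    return w;
}

/* Toy hash that ignores the length: "\0" and "\0\0" collide. */
static uint64_t toy_hash_plain(const unsigned char *p, size_t len, uint64_t seed)
{
    return mix64(seed ^ load_padded(p, len));
}

/* Toy hash that XORs the length into the seed before mixing the data. */
static uint64_t toy_hash_len(const unsigned char *p, size_t len, uint64_t seed)
{
    uint64_t state = mix64(seed ^ (uint64_t)len);
    return mix64(state ^ load_padded(p, len));
}
```

With the same seed, `toy_hash_plain` returns identical values for a one-byte and a two-byte all-zero input, while `toy_hash_len` separates them because the two lengths produce different starting states.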

trapexit commented 4 months ago

Thank you for the details. I would suggest adding at least some of this to the main README. As an existing user of wyhash, I am hesitant to use a derivative without evidence of the claimed improvements. Linking to SMHasher is certainly good third-party evidence, but the changes themselves, why they were made, and how they improved things could be better explained.

Nicoshev commented 4 months ago

Thanks for the feedback 🤗. Please ask any other questions as needed. Feel free to 'star' the repo to give support ^^

Nicoshev commented 4 months ago

Just ran the same quality experiment using wyhash. Its average of 8.06 collisions found is slightly higher than rapidhash's 7.72.
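The thread does not describe how the collisions were counted, so the following C sketch is only one plausible way to do it, with hypothetical names (`count_collisions`, `cmp_u64`): hash every key, sort the resulting 64-bit values, and count each value that repeats its predecessor.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Three-way comparison of two uint64_t values for qsort. */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a;
    uint64_t y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/*
 * Sorts `hashes` in place and returns the number of entries that duplicate
 * an earlier entry, i.e. n minus the number of distinct values.
 */
static size_t count_collisions(uint64_t *hashes, size_t n)
{
    size_t collisions = 0;
    if (n < 2)
        return 0;
    qsort(hashes, n, sizeof hashes[0], cmp_u64);
    for (size_t i = 1; i < n; i++) {
        if (hashes[i] == hashes[i - 1])
            collisions++;
    }
    return collisions;
}
```

Filling the array with the outputs of two different hash functions over the same key set, and averaging the counts over several key sets or seeds, would give figures comparable in spirit to the ones quoted above.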