cplusplus / papers

ISO/IEC JTC1 SC22 WG21 paper scheduling and management
653 stars 18 forks source link

P0814 hash_combine() Again #253

Open jensmaurer opened 5 years ago

jensmaurer commented 5 years ago

P0814R2 hash_combine() Again (Nicolai Josuttis)

https://issues.isocpp.org/show_bug.cgi?id=21

jensmaurer commented 5 years ago

Titus Winters: In wording review according to bugzilla.

TheThief commented 3 years ago

Having investigated hash combining algorithms recently, boost's hash_combine (used as an example in this paper) is old and not amazing.

When std::hash is weak (and it's often the identity function for single inputs), it essentially falls to hash_combine to be a hash function. As such, I would propose implementers using something stronger like murmur3's hash function instead (or something like it), at least on any system with a reasonably fast multiply. murmur3 reduces quite small if you limit it to a known size of input (e.g. a 32/64 bit seed hash and a single 32 or 64-bit hash as the "data") and you can skip the "tail" (because there wouldn't be one) and the xor by len (because it becomes a guaranteed flip of a single bit, which does nothing useful for hash strength) in that case altogether. You can also skip the operations on "k1" etc as that's hashing the data block, which is in theory already done before hash_combine() is called. It leaves a very small but effective combine function.

To quote a couple of analyses I found (both on stack overflow, but no less valid for being so):

https://stackoverflow.com/a/50978188 "There are 2948667289 distinct results of boost::hash_combine(x,0), but there should be 4294967296."

https://stackoverflow.com/a/57556517 "A multiplication with an unsigned integer ("Knuth's multiplicative method") is much better [than an xor shift], cascading more strongly and flipping more output bits with a probability of 0.5. Combining the two gives the following output [mostly green diagram indicating good bit mixing] A second application of multiplication and xorshift will yield the following [almost completely green diagram indicating exceptional bit mixing]" "This is in fact also the recipe rrxmrrxmsx_0 and murmur hash are using"

jensmaurer commented 3 years ago

@TheThief, note that this issue tracker is for administration of papers only.

If you wish to opine on the contents of a paper, please use the appropriate reflector or contact your NB representative.

TheThief commented 3 years ago

My apologies!

I assume the correct place is the bug link in the first comment? But that doesn't work any more.

jensmaurer commented 3 years ago

Please read https://isocpp.org/std

TheThief commented 3 years ago

I have tried, but I can't work out how I can make a public comment on this paper specifically. Sorry. I'll just drop it.

tahonermann commented 3 years ago

@TheThief, I recommend that you post your feedback to the std-proposals mailing list.

brycelelbach commented 2 years ago

After a Library Evolution mailing list discussion, we're going to revisit this topic and consider it for the International Standard instead of a Technical Specification.