rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/
Other
1.85k stars 178 forks source link

One more PRVHASH variant: prvhash82 #134

Closed avaneev closed 4 years ago

avaneev commented 4 years ago

I've just released prvhash82 function - it is the same algorithm, but extended to 128-bit math, produces hashes in 64-bit length increments. It's twice as fast as prvhash42, the only drawback is it's not very portable and not very good for short messages (bad biases for 16-bit sparse and combinations). Otherwise it shows exponentially better bit biases in most tests. Maybe you could add prvhash82_64 and prvhash82_128 tests? https://github.com/avaneev/prvhash

rurban commented 4 years ago

I don't see the speed advantage yet. Even slower than prvhash42 for me, at least on a Ryzen 3. On Intel it's better, but about the same slowness as prvhash42

82_128 fails TwoBytes

I'll rather not add these in its current state

avaneev commented 4 years ago

There is a 70-100% speed advantage with both Intel C++ Compiler and LLVM, on Ryzen 3 as well.

avaneev commented 4 years ago

OK, no proble, Sparse 16-bit and taking 2-byte hashes for 128-bit hash length is not a good idea at all for the test.

avaneev commented 4 years ago

there's no collisions, but bias may be enormous

avaneev commented 4 years ago

that's why passwords currently have to be at least 8 chars long

avaneev commented 4 years ago

You could at least skip bias tests when you pass 16-bit keys to the hashing function, only check for collisions.

rurban commented 4 years ago

See branch prvhash82

avaneev commented 4 years ago

Thank you. Well, something is indeed wrong. I'll try to fix the problem with prvhash82. Which compiler are you using which gives such poor int128 results?

avaneev commented 4 years ago

I have found an obvious bug in prvhash82, not related to the core function itself. I've also found an elegant solution to finalize hashing - "impossible byte". So, now both prvhash82 and prvhash42 should be re-tested, at version 2.6 now. Both functions now pass 32-,64-,128- and 256-bit tests. They are now also several instructions shorter. Thanks for your efforts with SMHasher! It helped me much to debug the hashing code.

avaneev commented 4 years ago

I've just released verison 2.7 - it has twice the speed of version 2.6. v2.7 displays a bit worse biases overall as it's working at the edge of its possibilities now. I do not know how biases are important for irreversible function though.

rurban commented 4 years ago

Added 2.8, good work

avaneev commented 4 years ago

Thanks! Biases can further be reduced by adding hl42 to int c (in prvhash42), and hl84 (in prvhash82). Not a problem of the method.

avaneev commented 4 years ago

There seems to be some issue with prvhash42_32 speed stats - it should be faster than prvhash42_64.

avaneev commented 4 years ago

It's also pretty strange cycl./map stats didn't change much from previous 2x slower variant.