rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/
Other
1.84k stars 177 forks source link

Good mixers and their evaluation #221

Closed dumblob closed 2 years ago

dumblob commented 2 years ago

Just stumbled upon https://github.com/martinus/better-faster-stronger-mixer and thought the shmhasher community might be interested in that (imperfect but good enough) evaluation of mixers (not whole hash functions).

I'm closing this though for lack of actionable items. Though I'd be interested in a discussion about mixers. So if you have anything to share, please post it here!

Thanks!

rurban commented 2 years ago

Well, it should be added to the README. Good overview

avaneev commented 2 years ago

Yes, this is a correct approach to designing a hash function. This is what I've basically did with my hashes in the course of development. However, it's hard to evaluate mixer's PRNG period if it's designed for 64-bit or larger state from the ground up, 2^40 period evaluation is not a perfect check. And this is partially confirmed by xxHash's massive collision tester - wyhash fails it.

I've evaluated my hashes on at least 2 sizes - 16-bit and 32-bit, a mixer should provide 32KB and 2GB PRNG periods for these, meaning it uses the whole combinatorial capacity of the state variables. For example, it's easy to downscale komihash to 32-bit variables.

avaneev commented 2 years ago

A more robust approach is to retain mixer's state and "mix-in" data after every PRNG output. This would also test collision-resistance, or dependence of output on mixer's prior state.

avaneev commented 2 years ago

So, there are two ways to test mixers: continuous and discrete. Discrete one, when a mixer starts from some fixed state on every PRNG output, however, requires differing number of mixing rounds. For example, komihash requires 2 mixing rounds while prvhash requires 3. Depends on how a mixer was designed. But for sure, a good mixer should work in both testing ways.