rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/
Other
1.85k stars 178 forks source link

Add hmakholm/smhasher LongNeighborTest #58

Open rurban opened 5 years ago

rurban commented 5 years ago

Good test esp. for long blocks (Spooky), and the separate byte-wise processing of unaligned rest parts. https://github.com/hmakholm/smhasher/blob/master/src/LongNeighborTest.md

This test uses a meet-in-the-middle approach to look for hash collisions between messages with low Hamming distances.

Developing the test was prompted by the discovery that SpookyHash V2 has lots of 3-bit collisions due to insufficient diffusion between the processing of the last full input block and the partial block at the end of the input. The goal of the test is to discover similar trouble in other hashes.

Because Spooky has an unusually large block size of 96 bytes (=768 bits), and it uses a completely different algorithm for messages shorter than 2 blocks, this problem only shows up for messages of length 281 to 287 bytes, and then again from 377 to 383 bytes, etc. Notably, messages of any "nice" length are not affected, which means that we need to test for all possible message lenghts.

Just need permission from Semmle Ltd

rurban commented 5 years ago

See the branch: https://github.com/rurban/smhasher/tree/LongNeighborTest I've added --verbose support.