rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/

[Question] Is the hash function called once for each 262144-byte key? #257

Closed alpominth closed 1 year ago

alpominth commented 1 year ago

This text left me with a question.

[...] alignments 0-7 with 262144-byte keys.

/\ Is the hash invoked once for each 262144-byte piece of data?

I don't know how to read C code, which is why I'm asking.

alpominth commented 1 year ago

This may seem like a newbie question, but @rurban said that the 262144-byte keys are read sequentially and in their entirety, which left me with a question:

Take a fast hash function tested by SMhasher, for example Blake3: it reaches 1288.84 MiB/s on a modern processor.

Assuming the output is what is measured, 1288.84 MiB/s equals ~1351446691 bytes per second, which means ~42232709 Blake3 outputs per second, each output being 32 bytes (256 bits).

But if we multiply that number of Blake3 invocations by the 262144-byte keys hashed sequentially as input, we get: 42232709 * 262144 = 11071051268096 bytes read as input (11.07 terabytes).

How can a tested hash function possibly read its input at 11.07 terabytes per second?
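
The arithmetic in this comment can be reproduced with a short script. This is only a sketch of the calculation, under the comment's own assumption that the reported figure counts output bytes. The constant names are hypothetical and nothing here is taken from SMhasher's code.

```python
# Reproduces the numbers from the comment above.
# Assumption being tested: the 1288.84 MiB/s figure counts OUTPUT bytes.
MIB = 1024 * 1024          # bytes per MiB
KEY_BYTES = 262144         # size of each key fed to the hash (256 KiB)
DIGEST_BYTES = 32          # Blake3 output size (256 bits)
REPORTED_MIB_PER_S = 1288.84

# Convert the reported speed to whole bytes per second.
bytes_per_sec = int(REPORTED_MIB_PER_S * MIB)

# If those bytes were digests, this many hash calls would happen per second.
outputs_per_sec = bytes_per_sec // DIGEST_BYTES

# Each call reads one full key, so this much input would be read per second.
implied_input_bytes_per_sec = outputs_per_sec * KEY_BYTES

print(bytes_per_sec)                       # 1351446691
print(outputs_per_sec)                     # 42232709
print(implied_input_bytes_per_sec)         # 11071051268096
print(implied_input_bytes_per_sec / 1e12)  # roughly 11.07 (terabytes per second)
```

An input rate of about 11 TB/s is far beyond what a single processor can read, which suggests the assumption itself (that the output is what gets measured) is the part that does not hold.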

alpominth commented 1 year ago

I took a look at this piece of code: https://github.com/rurban/smhasher/blob/82e7844c75d9c288f4833b1c1309ee4f3d629c3f/SpeedTest.cpp#L271

As far as I can see, the tests measure the input speed.

Thanks.
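
For readers who also cannot read the C++ source, here is a minimal sketch of what "measuring the input speed" means. It is not SMhasher's code: it uses plain wall-clock timing, and `hashlib.blake2b` stands in for Blake3 only because Blake3 is not in the Python standard library. The function names `input_throughput_mib_s` and `calls_per_second` are hypothetical.

```python
import hashlib
import time

MIB = 1024 * 1024   # bytes per MiB
KEY_BYTES = 262144  # key size from the quoted text (256 KiB)


def input_throughput_mib_s(hash_fn, key, calls):
    """Hash the same key `calls` times and return input MiB per second."""
    start = time.perf_counter()
    for _ in range(calls):
        hash_fn(key)  # one invocation consumes the whole key
    elapsed = time.perf_counter() - start
    # Throughput counts the bytes READ (key size times calls), not digest bytes.
    return (len(key) * calls) / elapsed / MIB


def calls_per_second(mib_per_s, key_bytes=KEY_BYTES):
    """Hash invocations per second implied by an input speed and key size."""
    return mib_per_s * MIB / key_bytes


key = bytes(KEY_BYTES)  # 262144 zero bytes as a dummy key
rate = input_throughput_mib_s(lambda k: hashlib.blake2b(k).digest(), key, calls=20)
print(f"{rate:.2f} MiB/s of input (stand-in hash, this machine)")
print(f"{calls_per_second(1288.84):.2f} calls per second at 1288.84 MiB/s")
```

Read this way, 1288.84 MiB/s with 262144-byte keys corresponds to roughly 5155 hash invocations per second, one per key, rather than tens of millions.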