ksahlin / strobemers

A repository for generating strobemers and evalaution
71 stars 11 forks source link

Upgrade XXhash to XXH3 add wyhash #9

Closed rob-p closed 2 years ago

rob-p commented 2 years ago

Hi @ksahlin,

Since the concat method works (not necessarily equally well) with any hash, I added wyhash, which also passes all randomization benchmarks but is much faster than XXHash. I also updated the XXH64 calls to XXH3 — since this is the newer and faster version of that hash function.

Sorry for the code duplication; I did this while also trying to manage the kids eating breakfast ;P.

--Rob

ksahlin commented 2 years ago

Thanks a lot! Excited to see how many improvements can be made :) Just wanted to check one thing before I merge.

About line 154; XXH3_64bits(&x, sizeof(uint64_t));. Is there a reason you did not update it with a seed as for line 453; XXH3_64bits_withSeed(&strobeconcat, sizeof(int128), 0) or just a bug? :)

From what I remember when implementing this, XXHASH uses different seeds upon call, so seeds extracted from queries would not match the ones from the reference if not setting seed explicitly.

rob-p commented 2 years ago

I think the plain variant is equivalent to calling with a seed of 0 (https://chromium.googlesource.com/external/github.com/Cyan4973/xxHash/+/refs/tags/v0.7.2/xxhash.h#425), but it would probably be cleaner to have a uniform call at both places. I can make the change when I get back to a computer or you can go ahead and make the modification.

rob-p commented 2 years ago

Ok, in the latest commit the withSeed variant is used in all places.