rurban / smhasher

Hash function quality and speed tests
https://rurban.github.io/smhasher/
Other
1.84k stars 178 forks source link

Add NMHASH32 #190

Closed gzm55 closed 3 years ago

gzm55 commented 3 years ago

https://github.com/gzm55/hash-garage/blob/master/nmhash.h

This is a new simple 32bit hash. a) friendly to low profile hardware, using only narrow multiplications (16 x 16 -> 16), and 32bit addition, shift and xor b) easy to vectorize, the scalar version of core loop should be auto vectorized by gcc for x64 and achieve about 80% performance of the manually vectorization. c) learn many idea from Cyan4973/xxHash, wangyi-fudan/wyhash and skeeto/hash-prospector

dumblob commented 3 years ago

Some other good ideas also in blog posts of mx3 hash :wink:.

rurban commented 3 years ago

Nice. Do you plan to add several megabytes to the repo, or can I just git submodule it asis?

gzm55 commented 3 years ago

Nice. Do you plan to add several megabytes to the repo, or can I just git submodule it asis?

submodule is ok. this repo may contain other hash/mixer i experimented in the future, but they cannot be too large.

rurban commented 3 years ago

nmhash.h:490 -Wshift-count-overflow

gzm55 commented 3 years ago

nmhash.h:490 -Wshift-count-overflow

should fix all warning now, and do not need '/std:c++latest' for msvc

rurban commented 3 years ago

My qualification:

Excellent quality and due to the SSE2/AVX2 optimizations very fast for large keys. But due to being 32bit only, it's mostly usable for hash tables, and there the most important short key speed lacks, because it's too much code. It's half as slow as the best, xxh3 and wyhash.

gzm55 commented 3 years ago

yes, due to limit to 16bit (half 32bit) multiplication and 32bit linear op, it has to multiply twice per 32bit and do more shift/add.

wyhash and xxh3 take the advantage of the more powerful instructions, while on some low profile hardwares, the full/half 64bit MUL would be expensive.

gzm55 commented 3 years ago

Some other good ideas also in blog posts of mx3 hash 😉.

@dumblob i've read most of your mixer posts, and this hash indeed borrow some idea when constructing and measuring the core mixer.

When mix the bits with reversible integer operations, mul-style op are still more efficiency than shift-add and shift-xor. As a pure 32bit hash, not a low part of a 64bit/128bit hash, the mul should also be lean, like xxhash32 and wyhash32.

gzm55 commented 3 years ago

@rurban why the #include "nmhash.h" is enclosed in extern "C" {...} which should be already done in the nmhash.h

rurban commented 3 years ago

I was not sure if everything was wrapped. So it's clearer that's C code. I it moved to Hashes.o so I can apply the std=c++latest property for MSVC. Better to seperate the C objects

gzm55 commented 3 years ago

std=c++latest

OK, and in the latest commit, std=c++latest is not required now for msvc

rurban commented 3 years ago

Yeah, looks good

rurban commented 3 years ago

Nope. MSVC on appveyor with MSVC 2017 is now broken.

https://ci.appveyor.com/project/rurban/smhasher/builds/38656782/job/8sgrxg8sdl23s84w

See https://dev.to/yumetodo/list-of-mscver-and-mscfullver-8nd

I'll try some thing in my msvc branch, esp.

#   if !defined(_MSC_VER) || _MSC_VER > 1927

Yes, this worked. See the PR https://github.com/gzm55/hash-garage/pull/1