martinus / robin-hood-hashing

Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
https://gitter.im/martinus/robin-hood-hashing
MIT License
1.5k stars 142 forks source link

Fast CRC32 based hash for hash_bytes and hash_int #91

Closed martinus closed 3 years ago

martinus commented 3 years ago
martinus commented 3 years ago

After lots of experimenting, I've created a CRC32-hardware based hash_bytes algorithm that is quite a bit faster than any other hash algorithms that I know of:

robin_hood und complexityN

The hash implementation favors speed over hash quality, so it is definitely quite a bit worse than any of the contender. The only quality criteria is to be good enough for hashmap.

ktprime commented 3 years ago

you need a full test on different os/compiler/cpu. on my cpu AMD 1700, _mm_crc32_u64 is just a liitle fast than FNV-A.

ktprime commented 3 years ago

I have do some bench your newest hash function with rand string and rand size(4-64).


clang 9.0.0 (tags/RELEASE_900/final) on llvm/gcc 4.2.1 __cplusplus = 201402 arm64 OS = linux, cpu = Kunpeng 920

stdhash time use = 419 ms wyhash time use = 264 ms martin time use = 351 ms phmap time use = 361 ms

stdhash time use = 623 ms wyhash time use = 441 ms martin time use = 620 ms phmap time use = 619 ms

stdhash time use = 591 ms wyhash time use = 380 ms martin time use = 589 ms phmap time use = 591 ms

stdhash time use = 624 ms wyhash time use = 449 ms martin time use = 625 ms phmap time use = 623 ms


clang 8.0.0 (tags/RELEASE_800/final) on llvm/gcc 4.2.1 __cplusplus = 201703 x86-64 OS = linux, cpu = Intel(R) Xeon(R) CPU E5620 @ 2.40GHz

stdhash time use = 676 ms wyhash time use = 290 ms martin time use = 262 ms phmap time use = 396 ms

stdhash time use = 486 ms wyhash time use = 329 ms martin time use = 291 ms phmap time use = 465 ms

stdhash time use = 424 ms wyhash time use = 304 ms martin time use = 273 ms phmap time use = 410 ms

stdhash time use = 480 ms wyhash time use = 337 ms martin time use = 300 ms phmap time use = 443 ms


clang 10.0.0 on llvm/ms vc++ 1927 __cplusplus = 201703 x86-64 OS = Win, cpu = AMD Ryzen 7 1700 Eight-Core Processor

stdhash time use = 632 ms wyhash time use = 274 ms martin time use = 345 ms phmap time use = 630 ms

stdhash time use = 634 ms wyhash time use = 277 ms martin time use = 346 ms phmap time use = 628 ms

stdhash time use = 637 ms wyhash time use = 277 ms martin time use = 357 ms phmap time use = 633 ms

stdhash time use = 634 ms wyhash time use = 275 ms martin time use = 346 ms phmap time use = 628 ms


gcc 9.3.0 __cplusplus = 201703 x86-64 OS = linux, cpu = Intel(R) Xeon(R) CPU E5620 @ 2.40GHz

stdhash time use = 497 ms wyhash time use = 257 ms martin time use = 233 ms phmap time use = 436 ms

stdhash time use = 498 ms wyhash time use = 289 ms martin time use = 247 ms phmap time use = 438 ms

stdhash time use = 484 ms wyhash time use = 296 ms martin time use = 251 ms phmap time use = 428 ms

stdhash time use = 464 ms wyhash time use = 297 ms martin time use = 252 ms phmap time use = 445 ms


gcc 7.5.0 __cplusplus = 201402 arm64 OS = linux, cpu = Kunpeng 920

stdhash time use = 624 ms wyhash time use = 444 ms martin time use = 537 ms phmap time use = 534 ms

stdhash time use = 665 ms wyhash time use = 420 ms martin time use = 613 ms phmap time use = 625 ms

stdhash time use = 622 ms wyhash time use = 408 ms martin time use = 581 ms phmap time use = 597 ms

stdhash time use = 613 ms wyhash time use = 436 ms martin time use = 591 ms phmap time use = 607 ms


gcc 10.2.0 __cplusplus = 201703 x86-64 OS = Win, cpu = AMD Ryzen 7 1700 Eight-Core Processor

stdhash time use = 339 ms wyhash time use = 228 ms martin time use = 297 ms phmap time use = 338 ms

stdhash time use = 336 ms wyhash time use = 225 ms martin time use = 295 ms phmap time use = 335 ms

stdhash time use = 338 ms wyhash time use = 226 ms martin time use = 290 ms phmap time use = 336 ms

stdhash time use = 339 ms wyhash time use = 224 ms martin time use = 296 ms phmap time use = 336 ms


ms vc++ 1927 x86-64 OS = Win, cpu = AMD Ryzen 7 1700 Eight-Core Processor

stdhash time use = 635 ms wyhash time use = 336 ms martin time use = 346 ms phmap time use = 648 ms

stdhash time use = 652 ms wyhash time use = 346 ms martin time use = 348 ms phmap time use = 632 ms

stdhash time use = 651 ms wyhash time use = 343 ms martin time use = 346 ms phmap time use = 630 ms

stdhash time use = 625 ms wyhash time use = 344 ms martin time use = 345 ms phmap time use = 633 ms

martinus commented 3 years ago

Thanks a lot for the benchmarks! I only have access to an Intel i7-8700, so this is very appreciated.

What are the 4 benchmark results, is these from 4 different evaluations?

ktprime commented 3 years ago

//bench code like this.

static void buildRandString(int size, std::vector& rndstring, int str_min, int str_max) { std::mt19937_64 srng; srng.seed(time(0)); for (int i = 0; i < size; i++) rndstring.emplace_back(get_random_alphanum_string(srng() % (str_max - str_min) + str_min)); }

static void testHashString(int size, int str_min, int str_max) { std::vector rndstring; rndstring.reserve(size + 8);

char os_info[2048]; printInfo(os_info);
long sum = 0;
for (int i = 0; i < 4; i++)
{
    rndstring.clear();
    buildRandString(size + i, rndstring, str_min, str_max);
    int64_t start = 0;
    int t_find = 0;

    start = getTime();
    for (auto& v : rndstring)
        sum += std::hash<std::string>()(v);
    t_find = (getTime() - start) / 1000;
    printf("stdhash time use = %4d ms\n", t_find);

    start = getTime();
    for (auto& v : rndstring)
        sum += wyhash(v.data(), v.size(), 1);
    t_find = (getTime() - start) / 1000;
    printf("wyhash  time use = %4d ms\n", t_find);

    start = getTime();
    for (auto& v : rndstring)
        sum += robin_hood::hash<std::string>()(v);
    t_find = (getTime() - start) / 1000;
    printf("martin  time use = %4d ms\n", t_find);

    start = getTime();
    for (auto& v : rndstring)
        sum += phmap::Hash<std::string>()(v);
    t_find = (getTime() - start) / 1000;
    printf("phmap   time use = %4d ms\n\n", t_find);
}
printf("sum = %ld\n", sum);

}

testHashString(12345678, 4, 64);

martinus commented 3 years ago

won't do, I don't want to fix intrinsics and compatibility bugs any more.