rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.24k stars 12.57k forks source link

compare compiler performance with `hashbrown` #55514

Closed nikomatsakis closed 5 years ago

nikomatsakis commented 5 years ago

This https://github.com/Amanieu/hashbrown crate looks nice!

We should do a performance comparison.

cc @Amanieu

Mark-Simulacrum commented 5 years ago

Should be fairly easy -- well, except I think we now don't have the FxHash implementations in-tree so that might be a bit harder...

I think this is a great issue for someone to take on though, if they want to swap out the FxHash implementation and post a PR we can run perf on it!

Centril commented 5 years ago

Better yet... if it is a fully drop-in replacement for HashMap, could we just replace the standard library's implementation with the one in hashbrown?

shepmaster commented 5 years ago

a fully drop-in replacement

depends on if "drop in" means "adheres to every expected requirement of HashMap" or just "API compatible".

shepmaster commented 5 years ago

It also lists

Uses SIMD to speed up lookups

It'd be very good to performance test it on all the architectures we care about people running the compiler on. If we replace the standard library implementation , then expand that to every tier 1 platform.

Amanieu commented 5 years ago

@shepmaster "drop-in" means that I literally copied the HashMap API from libstd and ripped out the internals. The only significant change API-wise is that the default hasher is changed.

Centril commented 5 years ago

@Amanieu are there any guarantees not covered in the signatures that don't fit other than the default hasher? (we would retain the standard library's default hasher...)

Amanieu commented 5 years ago

There is one small detail due to the way re-hashing works: if the Hash impl panics during rehashing then any elements that haven't been rehashed yet will be dropped from the table. There is unfortunately no way around this since the full hashes are not stored in the table.

Centril commented 5 years ago

@Amanieu ah; that seems like sane behavior tho (or at least I don't think it's weird behavior...) and very subtle / unlikely to happen. I'm having a hard time imagining a situation where the the Hash impl would not panic on insertion but then panic on rehash... maybe for some type with interior mutability perhaps (where the Hash impl alters the value of the type / using some sort of global side effect, e.g. reading from file... but that's probably going to be an incorrect implementation...)? Perhaps I'm not creative enough... =P

scottmcm commented 5 years ago

Is there anything that needs to be accounted for in https://github.com/rust-lang/rust/pull/54043 to ensure that such a replacement would be possible in the future?

Amanieu commented 5 years ago

@scottmcm There is no issue there, the raw entry API can easily be ported to hashbrown.

Zoxc commented 5 years ago

We'd also want a way to avoid double lookups when interning like my PR did. The raw_entry API is supposed to enable that for the libstd's HashMap.

It would be a good idea to pick up the refactoring to use a intern method in my PR. It could just be an utility function in rustc instead of being a method on the hash map.

Amanieu commented 5 years ago

I'm having some trouble replacing rustc_data_structures::FxHashMap: hashbrown uses core::arch for SSE2 support, but core::arch is missing in stage0.

Zoxc commented 5 years ago

You can replace it only for stage1+, using #[cfg(not(stage0))] and #[cfg(stage0)]

Amanieu commented 5 years ago

Is there a way to specify a dependency only for stage1+ in Cargo.toml? I already have the #[cfg], but Cargo is still trying to compile the crate since it is listed as a dependency of rustc_data_structures.

Zoxc commented 5 years ago

I don't know of a way to do that.

@gnzlbg Do you know of a workaround here? Or why arch is not in stage0?

gnzlbg commented 5 years ago

Or why arch is not in stage0?

arch depends on compiler and LLVM internals, so it is not in stage0 because it often cannot be built with beta. Does that make sense?

Do you know of a workaround here?

@Amanieu do you have a way to manually enable the usage of arch in your library, e.g., via a cargo feature ? If so you could maybe try disabling the usage of arch for stage0, and enabling it in stage1. The alternative would be to try to do what you mention: use the std Hash{Map,Set} in stage0, and only use hashbrown in stage1, but I don't know how to do that =/

Is there a way to use the arch module from beta while building stage0 instead of the arch module from the std lib being built? That might be another option.


Looking at hashbrown, that looks a lot like the new swiss tables data-structures in abseil (https://abseil.io/blog/20180927-swisstables), looks like a really cool project!

shepmaster commented 5 years ago

that looks a lot like the new swiss tables data-structures

Yes, the project description suggests that's not an accident:

image

Amanieu commented 5 years ago

I ended up just making a separate branch for it: https://github.com/Amanieu/hashbrown/tree/stage0

shepmaster commented 5 years ago

@Amanieu If your library cannot compile without arch, does that mean that mean it doesn't support non-x86 platforms?

Amanieu commented 5 years ago

No, it's just that it tries to use arch if the sse2 feature is available on x86.

eddyb commented 5 years ago

cc @Gankro

nikomatsakis commented 5 years ago

@Amanieu are you pursuing this integration further? I think if you can get a branch that works, you just need to open a PR and we can run the initial perf measurements.

michaelwoerister commented 5 years ago

cc @rust-lang/wg-compiler-performance

nagisa commented 5 years ago

Discussed during the compiler meeting. It would be nice to have a faster hashmap for the compiler, but it would also be interesting to see what are the options for including the algorithm into the standard library in the first place (the compiler would automatically reap the benefits without any extra effort necessary).

There were questions about what exactly makes these hashmaps faster/better than those in libstd. What are the pros/cons etc.

Amanieu commented 5 years ago

I started work on a branch which replaces FxHashMap in rustc_data_structures: https://github.com/Amanieu/rust/commit/264261979b9b4d4b723a04c3392dc23ba5d71b03

It doesn't compile yet, and I am currently busy with other things so I can't work on this.

Simply replacing the standard hash map in libstd as @nagisa suggests might be an option, however there are several factors that we need to take into account:

Gankra commented 5 years ago

I definitely wouldn't humor replacing the implementation in libstd without way more benchmarking, but rustc is basically a giant hashmap benchmark if I recall correctly, so doing a swap in should be pretty useful info?

I don't understand why this crate is supposed to be API compatible but isn't trivial to swap in and test? whoops found it; yuck

Gankra commented 5 years ago

Also if I recall correctly, I explicitly made sure that requiring rehashing on insert was API compatible with the new raw_entry API (it requires K: Hash even if you're providing a custom hash function to get the entry). Someone should make sure that wasn't dropped in the new PR which I haven't been following.

jonhoo commented 5 years ago

@Gankro pinged @fintelia about it!

fintelia commented 5 years ago

Yeah all the new functions also have the K: Hash bound place, so the new raw_entry API should allow rehashing at any point. In fact, since HashMap::new() and similar all require that same bound, you can't even create a HashMap without hashable keys!

nnethercote commented 5 years ago

rustc is basically a giant hashmap benchmark if I recall correctly

Two years ago I described rustc's execution as "mostly malloc/free and hash tables". That is less true than it used to be, but hash tables are still used a lot.

If hashbrown doesn't store key hashes it should use less memory, right?

eddyb commented 5 years ago

It might be interesting to see if we could start caching hashes e.g. in Ty, to avoid the cost of hashing a pointer, and maybe even allow the hashes to be persisted.

Amanieu commented 5 years ago

I found a potential breaking change while porting hashbrown to std. The Entry<'a, K, V> type needs to be changed to Entry<'a, K, V, S>. This is needed because VacantEntry::insert needs to be able to re-hash the table contents when it needs to grow the table.

Amanieu commented 5 years ago

cc #56241

tkaitchuck commented 5 years ago

I notice that the benchmarks listed are with maps using integers as keys. How does it do with medium length strings? (5-20 character) This is a very common for hashmap keys, and hashing time would likely dominate, making re-hashing potentially expensive.

Voultapher commented 5 years ago

Some time ago I started working on a hashmap benchmark and comparison project https://github.com/Voultapher/hashmap-compare. It's still very WIP. Among other things its missing result analysis and visualization.

Looking at the raw output of std::collections::HashMap with fx_hasher and hashbrown::HashMap shows hashbrown only being faster under certain circumstances, and slower in others, especially for small hash map sizes.

https://gist.github.com/Voultapher/126dda90923c4c5b24135a9ecff403f3

ghost commented 5 years ago

@Voultapher Can we perhaps get the difference between hash tables printed in the cargo benchcmp format? It'd help to see the difference between fnv and hashbrown side-by-side.

Unless I'm missing something, it looks like hashbrown is worryingly slower only in the tiny case with integer keys and values?

@Amanieu Can we do anything to speed up small hash tables (~10 entries)?

Amanieu commented 5 years ago

What is happening here is that the benchmarks include the time taken to grow the table. Unlike the std hashmap, hashbrown starts at a size of 1 element and double in size as needed. On the other hand the standard hashmap immediately starts with 32 elements.

Of course, this is something that can be tweaked. The intention was to avoid excessive memory usage when having many tables with few elements.

ghost commented 5 years ago

Of course, this is something that can be tweaked. The intention was to avoid excessive memory usage when having many tables with few elements.

Good to hear this will be easy to fix then! We'll probably want to have faster growth for small hash tables without compromising too much on memory usage. Maybe something like 0 -> 4 -> 16 -> 32 -> 64 ...

Amanieu commented 5 years ago

@Voultapher If you want, try changing HashMap::new() to HashMap::with_capacity(32). This should give a better performance comparison.

SimonSapin commented 5 years ago

Unlike the std hashmap, hashbrown starts at a size of 1 element and double in size as needed. On the other hand the standard hashmap immediately starts with 32 elements.

Perhaps this behavior could be adaptive based on size_of::<(K, V)>()?

Voultapher commented 5 years ago

@Voultapher If you want, try changing HashMap::new() to HashMap::with_capacity(32). This should give a better performance comparison.

CONFIGURATION: hashbrown_map,reserve_hm is essentially that.

shepmaster commented 5 years ago

hashbrown starts at a size of 1 element

Do I understand correctly that this means that HashMap::new would now allocate memory? As I understand it, that's likely to cause performance degradation in the overall system. It would also cause a break with the current documentation of HashMap:

it will not allocate until it is first inserted into

Amanieu commented 5 years ago

@shepmaster No, both the old and new HashMap do not allocate any memory when created. However once an element is inserted, hashbrown will gradually grow the capacity (2,4,8,16, ...) whereas the current HashMap will immediately start at 32 buckets.

gnzlbg commented 5 years ago

FWIW I prefer @Amanieu 's approach of not starting at 32 elements. If a user finds this to be a performance issue, the solution is easy: use ::with_capacity(32).

However, if a user needs hash maps that are smaller than 32 elements (e.g. due to memory consumption concerns), and the hash map always start at 32 elements, the solution isn't really that easy anymore.

This might worsen the performance of some applications for which a 32 element start was paying off, but AFAICT these applications might perform even better by reserving some other number of elements, and relying on a specific growth policy of ::new() + insert sounds like a performance bug to me.

Gankra commented 5 years ago

I agree to a point. But if this is super common we're just creating a performance trap for most of our users to make the system more flexible for the minority. Also we could just as easily write the map with the following logic (in fact I would expect it to have this form):

if is_empty() { alloc(32) } else { realloc(cap * 2) }

And users who want slow growing can with_capacity(1).

That said I think it's fine to land with the either growing logic and refine it in followups with realworld data.

arthurprs commented 5 years ago

I notice that the benchmarks listed are with maps using integers as keys. How does it do with medium length strings? (5-20 character) This is a very common for hashmap keys, and hashing time would likely dominate, making re-hashing potentially expensive.

We probably want to take a look at this sooner rather than later, specially considering that Rust encourages/defaults to Sip13.

Most (All?) benchmarks so far are focusing on very fast hashers and integer keys.

ghost commented 5 years ago

Here are @Voultapher's results in cargo benchcmp format (fx vs hashbrown):

cmp

``` name fx_hash ns/iter hb_map ns/iter diff ns/iter diff % speedup tests::big::clone 6,507,451 4,412,083 -2,095,368 -32.20% x 1.47 tests::big::copy_element_wise 9,405,607 8,101,581 -1,304,026 -13.86% x 1.16 tests::big::fill_only 5,362,458 3,664,157 -1,698,301 -31.67% x 1.46 tests::big::insert_random 7,773,228 6,236,963 -1,536,265 -19.76% x 1.25 tests::big::lookup_all 5,816,665 4,389,039 -1,427,626 -24.54% x 1.33 tests::big::lookup_missing 7,505,441 4,685,889 -2,819,552 -37.57% x 1.60 tests::big::lookup_one 5,367,760 3,664,826 -1,702,934 -31.73% x 1.46 tests::big::lookup_random 6,578,814 5,066,531 -1,512,283 -22.99% x 1.30 tests::big::random_gen 760,432 746,856 -13,576 -1.79% x 1.02 tests::big::traversal 5,466,632 3,768,946 -1,697,686 -31.06% x 1.45 tests::large::clone 366,667 285,431 -81,236 -22.16% x 1.28 tests::large::copy_element_wise 1,011,849 675,146 -336,703 -33.28% x 1.50 tests::large::fill_only 331,824 250,706 -81,118 -24.45% x 1.32 tests::large::insert_random 641,432 559,385 -82,047 -12.79% x 1.15 tests::large::lookup_all 361,762 286,708 -75,054 -20.75% x 1.26 tests::large::lookup_missing 506,489 408,327 -98,162 -19.38% x 1.24 tests::large::lookup_one 333,107 251,079 -82,028 -24.63% x 1.33 tests::large::lookup_random 475,511 401,555 -73,956 -15.55% x 1.18 tests::large::random_gen 114,740 112,922 -1,818 -1.58% x 1.02 tests::large::traversal 344,705 263,406 -81,299 -23.59% x 1.31 tests::medium::clone 33,535 30,858 -2,677 -7.98% x 1.09 tests::medium::copy_element_wise 62,100 60,295 -1,805 -2.91% x 1.03 tests::medium::fill_only 30,040 27,596 -2,444 -8.14% x 1.09 tests::medium::insert_random 45,116 41,869 -3,247 -7.20% x 1.08 tests::medium::lookup_all 32,223 31,341 -882 -2.74% x 1.03 tests::medium::lookup_missing 35,241 34,807 -434 -1.23% x 1.01 tests::medium::lookup_one 30,081 27,993 -2,088 -6.94% x 1.07 tests::medium::lookup_random 37,347 35,581 -1,766 -4.73% x 1.05 tests::medium::random_gen 4,144 4,030 -114 -2.75% x 1.03 tests::medium::traversal 31,348 28,976 -2,372 -7.57% x 1.08 tests::small::clone 2,328 2,717 389 16.71% x 0.86 tests::small::copy_element_wise 4,582 4,992 410 8.95% x 0.92 tests::small::fill_only 2,171 2,370 199 9.17% x 0.92 tests::small::insert_random 4,397 4,598 201 4.57% x 0.96 tests::small::lookup_all 2,387 2,701 314 13.15% x 0.88 tests::small::lookup_missing 3,825 3,441 -384 -10.04% x 1.11 tests::small::lookup_one 2,169 2,372 203 9.36% x 0.91 tests::small::lookup_random 3,235 3,524 289 8.93% x 0.92 tests::small::random_gen 744 733 -11 -1.48% x 1.02 tests::small::traversal 2,245 2,448 203 9.04% x 0.92 tests::tiny::clone 247 429 182 73.68% x 0.58 tests::tiny::copy_element_wise 428 753 325 75.93% x 0.57 tests::tiny::fill_only 201 376 175 87.06% x 0.53 tests::tiny::insert_random 503 691 188 37.38% x 0.73 tests::tiny::lookup_all 217 405 188 86.64% x 0.54 tests::tiny::lookup_missing 322 534 212 65.84% x 0.60 tests::tiny::lookup_one 204 384 180 88.24% x 0.53 tests::tiny::lookup_random 355 542 187 52.68% x 0.65 tests::tiny::random_gen 128 128 0 0.00% x 1.00 tests::tiny::traversal 222 377 155 69.82% x 0.59 ```

cmp_reserve

``` name fx_hash_reserve ns/iter hb_map_reserve ns/iter diff ns/iter diff % speedup tests::big::clone 4,754,128 2,956,167 -1,797,961 -37.82% x 1.61 tests::big::copy_element_wise 6,386,798 3,952,174 -2,434,624 -38.12% x 1.62 tests::big::fill_only 3,160,160 1,962,929 -1,197,231 -37.89% x 1.61 tests::big::insert_random 5,534,253 4,527,364 -1,006,889 -18.19% x 1.22 tests::big::lookup_all 3,527,120 2,684,407 -842,713 -23.89% x 1.31 tests::big::lookup_missing 5,306,770 2,979,452 -2,327,318 -43.86% x 1.78 tests::big::lookup_one 3,170,956 1,961,437 -1,209,519 -38.14% x 1.62 tests::big::lookup_random 4,313,456 3,359,152 -954,304 -22.12% x 1.28 tests::big::random_gen 757,628 749,122 -8,506 -1.12% x 1.01 tests::big::traversal 3,256,776 2,066,055 -1,190,721 -36.56% x 1.58 tests::large::clone 259,178 145,420 -113,758 -43.89% x 1.78 tests::large::copy_element_wise 408,293 240,134 -168,159 -41.19% x 1.70 tests::large::fill_only 222,180 109,206 -112,974 -50.85% x 2.03 tests::large::insert_random 536,214 420,179 -116,035 -21.64% x 1.28 tests::large::lookup_all 253,566 145,440 -108,126 -42.64% x 1.74 tests::large::lookup_missing 401,967 263,920 -138,047 -34.34% x 1.52 tests::large::lookup_one 222,645 109,125 -113,520 -50.99% x 2.04 tests::large::lookup_random 372,254 259,761 -112,493 -30.22% x 1.43 tests::large::random_gen 114,193 114,292 99 0.09% x 1.00 tests::large::traversal 235,224 120,938 -114,286 -48.59% x 1.94 tests::medium::clone 20,261 13,350 -6,911 -34.11% x 1.52 tests::medium::copy_element_wise 35,255 23,405 -11,850 -33.61% x 1.51 tests::medium::fill_only 16,409 10,165 -6,244 -38.05% x 1.61 tests::medium::insert_random 30,916 24,358 -6,558 -21.21% x 1.27 tests::medium::lookup_all 18,645 14,055 -4,590 -24.62% x 1.33 tests::medium::lookup_missing 22,384 17,169 -5,215 -23.30% x 1.30 tests::medium::lookup_one 16,279 8,361 -7,918 -48.64% x 1.95 tests::medium::lookup_random 23,889 15,973 -7,916 -33.14% x 1.50 tests::medium::random_gen 4,155 4,028 -127 -3.06% x 1.03 tests::medium::traversal 17,858 9,831 -8,027 -44.95% x 1.82 tests::small::clone 1,845 1,280 -565 -30.62% x 1.44 tests::small::copy_element_wise 3,478 2,399 -1,079 -31.02% x 1.45 tests::small::fill_only 1,689 935 -754 -44.64% x 1.81 tests::small::insert_random 3,905 3,135 -770 -19.72% x 1.25 tests::small::lookup_all 1,940 1,279 -661 -34.07% x 1.52 tests::small::lookup_missing 3,360 1,946 -1,414 -42.08% x 1.73 tests::small::lookup_one 1,699 935 -764 -44.97% x 1.82 tests::small::lookup_random 2,772 2,021 -751 -27.09% x 1.37 tests::small::random_gen 740 734 -6 -0.81% x 1.01 tests::small::traversal 1,797 1,001 -796 -44.30% x 1.80 tests::tiny::clone 247 215 -32 -12.96% x 1.15 tests::tiny::copy_element_wise 460 338 -122 -26.52% x 1.36 tests::tiny::fill_only 203 170 -33 -16.26% x 1.19 tests::tiny::insert_random 517 482 -35 -6.77% x 1.07 tests::tiny::lookup_all 234 206 -28 -11.97% x 1.14 tests::tiny::lookup_missing 340 325 -15 -4.41% x 1.05 tests::tiny::lookup_one 209 175 -34 -16.27% x 1.19 tests::tiny::lookup_random 360 332 -28 -7.78% x 1.08 tests::tiny::random_gen 127 129 2 1.57% x 0.98 tests::tiny::traversal 232 177 -55 -23.71% x 1.31 ```

cmp_string_key

``` name fx_hash_string_key ns/iter hb_map_string_key ns/iter diff ns/iter diff % speedup tests::big::clone 41,640,749 37,231,056 -4,409,693 -10.59% x 1.12 tests::big::copy_element_wise 50,113,165 44,930,224 -5,182,941 -10.34% x 1.12 tests::big::fill_only 31,153,347 22,566,193 -8,587,154 -27.56% x 1.38 tests::big::insert_random 54,839,289 41,885,814 -12,953,475 -23.62% x 1.31 tests::big::lookup_all 44,268,429 34,459,033 -9,809,396 -22.16% x 1.28 tests::big::lookup_missing 42,657,665 32,314,927 -10,342,738 -24.25% x 1.32 tests::big::lookup_one 31,245,774 22,369,691 -8,876,083 -28.41% x 1.40 tests::big::lookup_random 52,480,971 41,227,666 -11,253,305 -21.44% x 1.27 tests::big::random_gen 747,276 752,287 5,011 0.67% x 0.99 tests::big::traversal 31,487,239 22,135,437 -9,351,802 -29.70% x 1.42 tests::large::clone 3,539,273 2,176,247 -1,363,026 -38.51% x 1.63 tests::large::copy_element_wise 4,053,978 2,606,888 -1,447,090 -35.70% x 1.56 tests::large::fill_only 2,931,351 1,650,922 -1,280,429 -43.68% x 1.78 tests::large::insert_random 4,456,544 2,809,220 -1,647,324 -36.96% x 1.59 tests::large::lookup_all 3,938,244 2,454,814 -1,483,430 -37.67% x 1.60 tests::large::lookup_missing 4,069,325 2,541,896 -1,527,429 -37.54% x 1.60 tests::large::lookup_one 2,960,518 1,657,446 -1,303,072 -44.02% x 1.79 tests::large::lookup_random 4,210,819 2,715,891 -1,494,928 -35.50% x 1.55 tests::large::random_gen 114,645 114,111 -534 -0.47% x 1.00 tests::large::traversal 2,971,583 1,662,944 -1,308,639 -44.04% x 1.79 tests::medium::clone 275,756 201,411 -74,345 -26.96% x 1.37 tests::medium::copy_element_wise 336,514 218,863 -117,651 -34.96% x 1.54 tests::medium::fill_only 209,900 152,374 -57,526 -27.41% x 1.38 tests::medium::insert_random 315,836 241,154 -74,682 -23.65% x 1.31 tests::medium::lookup_all 287,676 227,982 -59,694 -20.75% x 1.26 tests::medium::lookup_missing 287,370 224,582 -62,788 -21.85% x 1.28 tests::medium::lookup_one 211,345 154,994 -56,351 -26.66% x 1.36 tests::medium::lookup_random 296,956 236,349 -60,607 -20.41% x 1.26 tests::medium::random_gen 4,022 4,100 78 1.94% x 0.98 tests::medium::traversal 210,894 156,688 -54,206 -25.70% x 1.35 tests::small::clone 21,570 18,787 -2,783 -12.90% x 1.15 tests::small::copy_element_wise 25,194 18,151 -7,043 -27.96% x 1.39 tests::small::fill_only 16,074 13,989 -2,085 -12.97% x 1.15 tests::small::insert_random 27,688 22,734 -4,954 -17.89% x 1.22 tests::small::lookup_all 25,408 20,805 -4,603 -18.12% x 1.22 tests::small::lookup_missing 25,629 21,534 -4,095 -15.98% x 1.19 tests::small::lookup_one 16,161 14,087 -2,074 -12.83% x 1.15 tests::small::lookup_random 26,615 21,665 -4,950 -18.60% x 1.23 tests::small::random_gen 732 737 5 0.68% x 0.99 tests::small::traversal 16,949 13,982 -2,967 -17.51% x 1.21 tests::tiny::clone 1,794 1,653 -141 -7.86% x 1.09 tests::tiny::copy_element_wise 1,686 1,939 253 15.01% x 0.87 tests::tiny::fill_only 1,172 1,322 150 12.80% x 0.89 tests::tiny::insert_random 2,304 2,185 -119 -5.16% x 1.05 tests::tiny::lookup_all 1,842 1,914 72 3.91% x 0.96 tests::tiny::lookup_missing 2,045 2,168 123 6.01% x 0.94 tests::tiny::lookup_one 1,236 1,320 84 6.80% x 0.94 tests::tiny::lookup_random 2,023 2,114 91 4.50% x 0.96 tests::tiny::random_gen 128 128 0 0.00% x 1.00 tests::tiny::traversal 1,162 1,332 170 14.63% x 0.87 ```

cmp_string_value

``` name fx_hash_string_value ns/iter hb_map_string_value ns/iter diff ns/iter diff % speedup tests::big::clone 28,788,708 28,156,111 -632,597 -2.20% x 1.02 tests::big::copy_element_wise 39,861,523 25,823,906 -14,037,617 -35.22% x 1.54 tests::big::fill_only 20,341,366 15,939,241 -4,402,125 -21.64% x 1.28 tests::big::insert_random 38,308,105 34,449,390 -3,858,715 -10.07% x 1.11 tests::big::lookup_all 21,341,971 17,606,206 -3,735,765 -17.50% x 1.21 tests::big::lookup_missing 22,491,193 16,692,648 -5,798,545 -25.78% x 1.35 tests::big::lookup_one 19,936,818 15,639,018 -4,297,800 -21.56% x 1.27 tests::big::lookup_random 22,219,383 18,204,456 -4,014,927 -18.07% x 1.22 tests::big::random_gen 754,522 757,921 3,399 0.45% x 1.00 tests::big::traversal 19,329,821 16,025,152 -3,304,669 -17.10% x 1.21 tests::large::clone 2,296,144 1,914,849 -381,295 -16.61% x 1.20 tests::large::copy_element_wise 6,338,369 1,806,096 -4,532,273 -71.51% x 3.51 tests::large::fill_only 1,669,212 1,324,006 -345,206 -20.68% x 1.26 tests::large::insert_random 2,672,227 2,383,659 -288,568 -10.80% x 1.12 tests::large::lookup_all 1,702,581 1,377,023 -325,558 -19.12% x 1.24 tests::large::lookup_missing 1,850,643 1,474,750 -375,893 -20.31% x 1.25 tests::large::lookup_one 1,668,583 1,319,367 -349,216 -20.93% x 1.26 tests::large::lookup_random 1,819,394 1,490,359 -329,035 -18.08% x 1.22 tests::large::random_gen 113,680 113,100 -580 -0.51% x 1.01 tests::large::traversal 1,665,136 1,760,164 95,028 5.71% x 0.95 tests::medium::clone 216,942 169,949 -46,993 -21.66% x 1.28 tests::medium::copy_element_wise 265,627 158,921 -106,706 -40.17% x 1.67 tests::medium::fill_only 160,125 119,402 -40,723 -25.43% x 1.34 tests::medium::insert_random 247,924 201,818 -46,106 -18.60% x 1.23 tests::medium::lookup_all 168,105 122,051 -46,054 -27.40% x 1.38 tests::medium::lookup_missing 174,783 126,260 -48,523 -27.76% x 1.38 tests::medium::lookup_one 165,979 119,581 -46,398 -27.95% x 1.39 tests::medium::lookup_random 172,994 129,014 -43,980 -25.42% x 1.34 tests::medium::random_gen 4,050 4,106 56 1.38% x 0.99 tests::medium::traversal 165,668 120,305 -45,363 -27.38% x 1.38 tests::small::clone 18,834 17,412 -1,422 -7.55% x 1.08 tests::small::copy_element_wise 21,553 14,849 -6,704 -31.10% x 1.45 tests::small::fill_only 13,707 11,626 -2,081 -15.18% x 1.18 tests::small::insert_random 22,988 20,431 -2,557 -11.12% x 1.13 tests::small::lookup_all 13,917 11,934 -1,983 -14.25% x 1.17 tests::small::lookup_missing 15,470 12,674 -2,796 -18.07% x 1.22 tests::small::lookup_one 13,677 11,738 -1,939 -14.18% x 1.17 tests::small::lookup_random 14,795 13,108 -1,687 -11.40% x 1.13 tests::small::random_gen 742 735 -7 -0.94% x 1.01 tests::small::traversal 13,865 11,920 -1,945 -14.03% x 1.16 tests::tiny::clone 1,772 1,379 -393 -22.18% x 1.28 tests::tiny::copy_element_wise 1,611 1,382 -229 -14.21% x 1.17 tests::tiny::fill_only 1,172 1,019 -153 -13.05% x 1.15 tests::tiny::insert_random 2,291 1,894 -397 -17.33% x 1.21 tests::tiny::lookup_all 1,150 1,033 -117 -10.17% x 1.11 tests::tiny::lookup_missing 1,325 1,170 -155 -11.70% x 1.13 tests::tiny::lookup_one 1,213 1,055 -158 -13.03% x 1.15 tests::tiny::lookup_random 1,357 1,207 -150 -11.05% x 1.12 tests::tiny::random_gen 130 130 0 0.00% x 1.00 tests::tiny::traversal 1,222 1,006 -216 -17.68% x 1.21 ```

cmp_string_key_string_value

``` name fx_hash_string_key_string_value ns/iter hb_map_string_key_string_value ns/iter diff ns/iter diff % speedup tests::big::clone 74,760,564 72,572,547 -2,188,017 -2.93% x 1.03 tests::big::copy_element_wise 74,554,969 78,397,734 3,842,765 5.15% x 0.95 tests::big::fill_only 50,268,783 43,268,444 -7,000,339 -13.93% x 1.16 tests::big::insert_random 89,918,488 80,498,378 -9,420,110 -10.48% x 1.12 tests::big::lookup_all 64,704,076 58,354,002 -6,350,074 -9.81% x 1.11 tests::big::lookup_missing 61,851,802 55,440,549 -6,411,253 -10.37% x 1.12 tests::big::lookup_one 50,035,424 43,687,211 -6,348,213 -12.69% x 1.15 tests::big::lookup_random 74,474,298 69,774,102 -4,700,196 -6.31% x 1.07 tests::big::random_gen 748,833 749,853 1,020 0.14% x 1.00 tests::big::traversal 52,973,648 44,830,385 -8,143,263 -15.37% x 1.18 tests::large::clone 4,975,082 3,619,651 -1,355,431 -27.24% x 1.37 tests::large::copy_element_wise 5,270,695 3,591,497 -1,679,198 -31.86% x 1.47 tests::large::fill_only 3,833,030 2,536,580 -1,296,450 -33.82% x 1.51 tests::large::insert_random 6,218,582 4,559,051 -1,659,531 -26.69% x 1.36 tests::large::lookup_all 4,842,136 3,382,923 -1,459,213 -30.14% x 1.43 tests::large::lookup_missing 4,878,035 3,444,655 -1,433,380 -29.38% x 1.42 tests::large::lookup_one 3,848,832 2,552,879 -1,295,953 -33.67% x 1.51 tests::large::lookup_random 5,138,614 3,686,649 -1,451,965 -28.26% x 1.39 tests::large::random_gen 114,083 113,766 -317 -0.28% x 1.00 tests::large::traversal 4,070,807 2,635,759 -1,435,048 -35.25% x 1.54 tests::medium::clone 437,972 335,931 -102,041 -23.30% x 1.30 tests::medium::copy_element_wise 487,463 308,724 -178,739 -36.67% x 1.58 tests::medium::fill_only 321,994 235,168 -86,826 -26.97% x 1.37 tests::medium::insert_random 500,262 392,281 -107,981 -21.58% x 1.28 tests::medium::lookup_all 402,767 312,075 -90,692 -22.52% x 1.29 tests::medium::lookup_missing 403,073 306,051 -97,022 -24.07% x 1.32 tests::medium::lookup_one 322,310 236,379 -85,931 -26.66% x 1.36 tests::medium::lookup_random 413,442 323,796 -89,646 -21.68% x 1.28 tests::medium::random_gen 4,183 4,067 -116 -2.77% x 1.03 tests::medium::traversal 341,593 243,975 -97,618 -28.58% x 1.40 tests::small::clone 36,063 31,253 -4,810 -13.34% x 1.15 tests::small::copy_element_wise 36,840 26,096 -10,744 -29.16% x 1.41 tests::small::fill_only 25,279 21,553 -3,726 -14.74% x 1.17 tests::small::insert_random 44,857 36,801 -8,056 -17.96% x 1.22 tests::small::lookup_all 34,446 28,472 -5,974 -17.34% x 1.21 tests::small::lookup_missing 36,286 29,268 -7,018 -19.34% x 1.24 tests::small::lookup_one 25,894 21,539 -4,355 -16.82% x 1.20 tests::small::lookup_random 36,469 29,525 -6,944 -19.04% x 1.24 tests::small::random_gen 734 734 0 0.00% x 1.00 tests::small::traversal 28,355 22,041 -6,314 -22.27% x 1.29 tests::tiny::clone 3,051 2,420 -631 -20.68% x 1.26 tests::tiny::copy_element_wise 2,699 2,270 -429 -15.89% x 1.19 tests::tiny::fill_only 1,989 1,730 -259 -13.02% x 1.15 tests::tiny::insert_random 3,781 3,245 -536 -14.18% x 1.17 tests::tiny::lookup_all 2,741 2,424 -317 -11.57% x 1.13 tests::tiny::lookup_missing 2,890 2,591 -299 -10.35% x 1.12 tests::tiny::lookup_one 2,206 1,839 -367 -16.64% x 1.20 tests::tiny::lookup_random 2,927 2,565 -362 -12.37% x 1.14 tests::tiny::random_gen 127 130 3 2.36% x 0.98 tests::tiny::traversal 2,211 1,753 -458 -20.71% x 1.26 ```

cmp_string_pad_string_key

``` name fx_hash_string_pad_string_key ns/iter hb_map_string_pad_string_key ns/iter diff ns/iter diff % speedup tests::big::clone 41,272,108 39,915,804 -1,356,304 -3.29% x 1.03 tests::big::copy_element_wise 47,931,060 47,404,488 -526,572 -1.10% x 1.01 tests::big::fill_only 33,041,000 25,065,857 -7,975,143 -24.14% x 1.32 tests::big::insert_random 59,294,617 48,454,355 -10,840,262 -18.28% x 1.22 tests::big::lookup_all 50,607,791 42,484,663 -8,123,128 -16.05% x 1.19 tests::big::lookup_missing 46,996,493 37,527,247 -9,469,246 -20.15% x 1.25 tests::big::lookup_one 33,313,558 24,501,042 -8,812,516 -26.45% x 1.36 tests::big::lookup_random 56,667,651 48,438,774 -8,228,877 -14.52% x 1.17 tests::big::random_gen 747,711 752,593 4,882 0.65% x 0.99 tests::big::traversal 33,510,878 24,669,835 -8,841,043 -26.38% x 1.36 tests::large::clone 3,127,811 2,494,388 -633,423 -20.25% x 1.25 tests::large::copy_element_wise 4,233,900 3,043,392 -1,190,508 -28.12% x 1.39 tests::large::fill_only 2,446,574 1,938,230 -508,344 -20.78% x 1.26 tests::large::insert_random 4,096,627 3,428,168 -668,459 -16.32% x 1.19 tests::large::lookup_all 3,681,006 3,091,176 -589,830 -16.02% x 1.19 tests::large::lookup_missing 3,744,526 3,071,961 -672,565 -17.96% x 1.22 tests::large::lookup_one 2,455,870 1,926,694 -529,176 -21.55% x 1.27 tests::large::lookup_random 3,865,338 3,343,224 -522,114 -13.51% x 1.16 tests::large::random_gen 113,971 113,770 -201 -0.18% x 1.00 tests::large::traversal 2,464,109 1,930,872 -533,237 -21.64% x 1.28 tests::medium::clone 290,379 233,367 -57,012 -19.63% x 1.24 tests::medium::copy_element_wise 338,738 278,766 -59,972 -17.70% x 1.22 tests::medium::fill_only 229,274 184,658 -44,616 -19.46% x 1.24 tests::medium::insert_random 372,288 308,971 -63,317 -17.01% x 1.20 tests::medium::lookup_all 343,882 299,799 -44,083 -12.82% x 1.15 tests::medium::lookup_missing 339,559 290,781 -48,778 -14.37% x 1.17 tests::medium::lookup_one 230,504 187,103 -43,401 -18.83% x 1.23 tests::medium::lookup_random 351,303 306,155 -45,148 -12.85% x 1.15 tests::medium::random_gen 4,091 4,125 34 0.83% x 0.99 tests::medium::traversal 229,764 187,875 -41,889 -18.23% x 1.22 tests::small::clone 24,523 22,276 -2,247 -9.16% x 1.10 tests::small::copy_element_wise 26,569 23,911 -2,658 -10.00% x 1.11 tests::small::fill_only 19,684 17,451 -2,233 -11.34% x 1.13 tests::small::insert_random 35,241 29,824 -5,417 -15.37% x 1.18 tests::small::lookup_all 31,017 27,736 -3,281 -10.58% x 1.12 tests::small::lookup_missing 31,904 28,895 -3,009 -9.43% x 1.10 tests::small::lookup_one 19,634 17,228 -2,406 -12.25% x 1.14 tests::small::lookup_random 32,051 28,830 -3,221 -10.05% x 1.11 tests::small::random_gen 734 741 7 0.95% x 0.99 tests::small::traversal 19,894 17,342 -2,552 -12.83% x 1.15 tests::tiny::clone 2,149 1,854 -295 -13.73% x 1.16 tests::tiny::copy_element_wise 1,941 2,182 241 12.42% x 0.89 tests::tiny::fill_only 1,474 1,491 17 1.15% x 0.99 tests::tiny::insert_random 2,831 2,759 -72 -2.54% x 1.03 tests::tiny::lookup_all 2,378 2,517 139 5.85% x 0.94 tests::tiny::lookup_missing 2,576 2,626 50 1.94% x 0.98 tests::tiny::lookup_one 1,549 1,597 48 3.10% x 0.97 tests::tiny::lookup_random 2,570 2,679 109 4.24% x 0.96 tests::tiny::random_gen 128 129 1 0.78% x 0.99 tests::tiny::traversal 1,522 1,532 10 0.66% x 0.99 ```

cmp_string_pad_string_value

``` name fx_hash_string_pad_string_value ns/iter hb_map_string_pad_string_value ns/iter diff ns/iter diff % speedup tests::big::clone 31,074,259 30,330,383 -743,876 -2.39% x 1.02 tests::big::copy_element_wise 42,554,324 28,954,902 -13,599,422 -31.96% x 1.47 tests::big::fill_only 24,946,656 18,672,641 -6,274,015 -25.15% x 1.34 tests::big::insert_random 46,814,633 41,961,645 -4,852,988 -10.37% x 1.12 tests::big::lookup_all 24,548,355 20,463,636 -4,084,719 -16.64% x 1.20 tests::big::lookup_missing 26,458,442 19,633,151 -6,825,291 -25.80% x 1.35 tests::big::lookup_one 25,317,900 18,534,411 -6,783,489 -26.79% x 1.37 tests::big::lookup_random 24,982,509 20,742,252 -4,240,257 -16.97% x 1.20 tests::big::random_gen 748,150 768,670 20,520 2.74% x 0.97 tests::big::traversal 24,895,937 18,558,928 -6,337,009 -25.45% x 1.34 tests::large::clone 2,546,891 2,225,389 -321,502 -12.62% x 1.14 tests::large::copy_element_wise 6,602,670 2,090,669 -4,512,001 -68.34% x 3.16 tests::large::fill_only 1,911,649 1,573,061 -338,588 -17.71% x 1.22 tests::large::insert_random 3,213,892 2,934,292 -279,600 -8.70% x 1.10 tests::large::lookup_all 1,956,880 1,638,664 -318,216 -16.26% x 1.19 tests::large::lookup_missing 2,102,272 1,746,804 -355,468 -16.91% x 1.20 tests::large::lookup_one 1,917,841 1,590,708 -327,133 -17.06% x 1.21 tests::large::lookup_random 2,070,013 1,740,617 -329,396 -15.91% x 1.19 tests::large::random_gen 113,749 114,620 871 0.77% x 0.99 tests::large::traversal 1,919,731 1,594,213 -325,518 -16.96% x 1.20 tests::medium::clone 245,905 197,544 -48,361 -19.67% x 1.24 tests::medium::copy_element_wise 292,915 185,138 -107,777 -36.79% x 1.58 tests::medium::fill_only 188,804 142,932 -45,872 -24.30% x 1.32 tests::medium::insert_random 300,795 255,091 -45,704 -15.19% x 1.18 tests::medium::lookup_all 193,627 147,859 -45,768 -23.64% x 1.31 tests::medium::lookup_missing 199,230 151,758 -47,472 -23.83% x 1.31 tests::medium::lookup_one 190,595 145,656 -44,939 -23.58% x 1.31 tests::medium::lookup_random 198,991 151,738 -47,253 -23.75% x 1.31 tests::medium::random_gen 4,125 4,225 100 2.42% x 0.98 tests::medium::traversal 189,972 145,092 -44,880 -23.62% x 1.31 tests::small::clone 21,990 20,150 -1,840 -8.37% x 1.09 tests::small::copy_element_wise 24,618 18,084 -6,534 -26.54% x 1.36 tests::small::fill_only 16,559 14,775 -1,784 -10.77% x 1.12 tests::small::insert_random 28,461 26,172 -2,289 -8.04% x 1.09 tests::small::lookup_all 17,859 15,087 -2,772 -15.52% x 1.18 tests::small::lookup_missing 19,293 15,817 -3,476 -18.02% x 1.22 tests::small::lookup_one 17,866 14,861 -3,005 -16.82% x 1.20 tests::small::lookup_random 17,916 16,016 -1,900 -10.61% x 1.12 tests::small::random_gen 735 749 14 1.90% x 0.98 tests::small::traversal 17,061 14,718 -2,343 -13.73% x 1.16 tests::tiny::clone 1,948 1,664 -284 -14.58% x 1.17 tests::tiny::copy_element_wise 1,732 1,674 -58 -3.35% x 1.03 tests::tiny::fill_only 1,256 1,284 28 2.23% x 0.98 tests::tiny::insert_random 2,576 2,450 -126 -4.89% x 1.05 tests::tiny::lookup_all 1,307 1,337 30 2.30% x 0.98 tests::tiny::lookup_missing 1,460 1,464 4 0.27% x 1.00 tests::tiny::lookup_one 1,342 1,311 -31 -2.31% x 1.02 tests::tiny::lookup_random 1,477 1,497 20 1.35% x 0.99 tests::tiny::random_gen 129 128 -1 -0.78% x 1.01 tests::tiny::traversal 1,373 1,297 -76 -5.54% x 1.06 ```

cmp_string_pad_string_key_string_value

``` name fx_hash_string_pad_string_key_string_value ns/iter hb_map_string_pad_string_key_string_value ns/iter diff ns/iter diff % speedup tests::big::clone 68,565,017 72,475,248 3,910,231 5.70% x 0.95 tests::big::copy_element_wise 65,421,891 75,619,984 10,198,093 15.59% x 0.87 tests::big::fill_only 49,130,177 45,038,534 -4,091,643 -8.33% x 1.09 tests::big::insert_random 93,900,269 84,954,632 -8,945,637 -9.53% x 1.11 tests::big::lookup_all 68,206,790 64,794,378 -3,412,412 -5.00% x 1.05 tests::big::lookup_missing 62,777,908 57,459,080 -5,318,828 -8.47% x 1.09 tests::big::lookup_one 49,071,165 45,036,272 -4,034,893 -8.22% x 1.09 tests::big::lookup_random 75,054,379 71,623,767 -3,430,612 -4.57% x 1.05 tests::big::random_gen 764,731 749,908 -14,823 -1.94% x 1.02 tests::big::traversal 51,439,727 45,364,081 -6,075,646 -11.81% x 1.13 tests::large::clone 5,004,835 4,213,762 -791,073 -15.81% x 1.19 tests::large::copy_element_wise 5,447,282 4,220,486 -1,226,796 -22.52% x 1.29 tests::large::fill_only 3,775,658 3,097,579 -678,079 -17.96% x 1.22 tests::large::insert_random 6,499,488 5,580,395 -919,093 -14.14% x 1.16 tests::large::lookup_all 5,040,883 4,317,939 -722,944 -14.34% x 1.17 tests::large::lookup_missing 5,042,517 4,222,034 -820,483 -16.27% x 1.19 tests::large::lookup_one 3,780,753 3,113,636 -667,117 -17.65% x 1.21 tests::large::lookup_random 5,258,342 4,543,783 -714,559 -13.59% x 1.16 tests::large::random_gen 114,543 114,491 -52 -0.05% x 1.00 tests::large::traversal 3,992,131 3,183,968 -808,163 -20.24% x 1.25 tests::medium::clone 476,912 394,278 -82,634 -17.33% x 1.21 tests::medium::copy_element_wise 518,023 375,221 -142,802 -27.57% x 1.38 tests::medium::fill_only 364,070 294,470 -69,600 -19.12% x 1.24 tests::medium::insert_random 597,360 505,532 -91,828 -15.37% x 1.18 tests::medium::lookup_all 480,594 403,716 -76,878 -16.00% x 1.19 tests::medium::lookup_missing 478,492 394,552 -83,940 -17.54% x 1.21 tests::medium::lookup_one 365,828 294,687 -71,141 -19.45% x 1.24 tests::medium::lookup_random 492,067 413,467 -78,600 -15.97% x 1.19 tests::medium::random_gen 4,281 4,056 -225 -5.26% x 1.06 tests::medium::traversal 385,080 305,391 -79,689 -20.69% x 1.26 tests::small::clone 40,645 37,194 -3,451 -8.49% x 1.09 tests::small::copy_element_wise 41,909 33,237 -8,672 -20.69% x 1.26 tests::small::fill_only 31,423 27,422 -4,001 -12.73% x 1.15 tests::small::insert_random 55,027 48,787 -6,240 -11.34% x 1.13 tests::small::lookup_all 42,130 37,757 -4,373 -10.38% x 1.12 tests::small::lookup_missing 43,490 38,896 -4,594 -10.56% x 1.12 tests::small::lookup_one 31,155 27,891 -3,264 -10.48% x 1.12 tests::small::lookup_random 43,896 39,166 -4,730 -10.78% x 1.12 tests::small::random_gen 743 735 -8 -1.08% x 1.01 tests::small::traversal 33,762 28,222 -5,540 -16.41% x 1.20 tests::tiny::clone 3,634 3,012 -622 -17.12% x 1.21 tests::tiny::copy_element_wise 3,195 2,929 -266 -8.33% x 1.09 tests::tiny::fill_only 2,490 2,328 -162 -6.51% x 1.07 tests::tiny::insert_random 4,668 4,311 -357 -7.65% x 1.08 tests::tiny::lookup_all 3,453 3,309 -144 -4.17% x 1.04 tests::tiny::lookup_missing 3,601 3,425 -176 -4.89% x 1.05 tests::tiny::lookup_one 2,632 2,429 -203 -7.71% x 1.08 tests::tiny::lookup_random 3,601 3,453 -148 -4.11% x 1.04 tests::tiny::random_gen 128 127 -1 -0.78% x 1.01 tests::tiny::traversal 2,700 2,342 -358 -13.26% x 1.15 ```

arthurprs commented 5 years ago

To be clear I'm not saying it's going to be slower or anything like that. We just need to look into the worst case.

michaelwoerister commented 5 years ago

These results look pretty good!

Is there a reason why the new hashmap impl can't cache hash values?

Gankra commented 5 years ago

One of the major improvements of hashbrown over our current design is to truncate the hash array to be byte entries instead of 8 bytes. This significantly improves cache locality of the hash search under high load factors, reduces the memory footprint of all our hashtables, and allows 4, 8, or 16 hashes to be compared at once (on 32-bit, 64-bit, and sse2 platforms respectively).

This comes at the cost of only storing 7 bits of the hash (you still need one bit for empty/full). So in theory we could avoid rehashing up to a capacity of 128, but past that we need to rehash to retrieve the higher bits.