Closed · ketanv3 closed this 9 months ago
1 org.opensearch.index.translog.RemoteFSTranslogTests.testMetadataFileDeletion
Merging #8788 (58d3394) into main (a4024e7) will increase coverage by 0.14%. The diff coverage is 90.14%.
```diff
@@             Coverage Diff              @@
##               main    #8788      +/-   ##
============================================
+ Coverage     71.01%   71.15%   +0.14%
- Complexity    57420    57497      +77
============================================
  Files          4778     4779       +1
  Lines        270922   270995      +73
  Branches      39585    39589       +4
============================================
+ Hits         192403   192840     +437
+ Misses        62338    61938     -400
- Partials      16181    16217      +36
```
Files Changed | Coverage Δ |
---|---|
...g/opensearch/common/util/ReorganizingLongHash.java | 93.65% <75.00%> (-1.67%) :arrow_down: |
.../java/org/opensearch/common/util/BytesRefHash.java | 89.18% <89.70%> (+1.31%) :arrow_up: |
...rc/main/java/org/opensearch/common/hash/T1ha1.java | 90.76% <90.76%> (ø) |
...n/src/main/java/org/opensearch/common/Numbers.java | 89.69% <100.00%> (+1.14%) :arrow_up: |
...ggregations/bucket/terms/BytesKeyedBucketOrds.java | 87.50% <100.00%> (ø) |
.../aggregations/bucket/terms/SignificanceLookup.java | 96.49% <100.00%> (ø) |
...ations/bucket/terms/StringRareTermsAggregator.java | 92.42% <100.00%> (ø) |
1 org.opensearch.remotestore.RemoteStoreIT.testStaleCommitDeletionWithInvokeFlush
@backslasht @dblock Need your inputs!
BytesRefHash instead of keeping two implementations (like before). Regardless of our choice between (A) and (B), both are better across the board over the existing implementation. What do you think?
@ketanv3 - Would the benefit of CPU cache lines be applicable when a large number of keys are used in ReorganizingBytesRefHash, and also assuming the keys arrive out of order? Considering that is the worst case, would it still outperform CompactBytesRefHash?
@ketanv3 - Would the benefit of CPU cache lines be applicable when a large number of keys are used in ReorganizingLongHash, and also assuming the keys arrive out of order? Considering that is the worst case, would it still outperform CompactBytesRefHash?
The recency criterion (i.e. correlated adds) isn't considered for BytesRefHash; it made more sense for LongHash, where similar timestamps across consecutive hits made that optimization possible. Only the PSL criterion is considered, which makes CPU caches more effective. Performance should be similar whether keys arrive in or out of order.
Theoretically, ReorganizingBytesRefHash should perform better than CompactBytesRefHash for lookups, at the slight expense of inserts. But another factor to consider is the difference in byte packing:

1. CompactBytesRefHash: [32 fingerprint, 32 ordinal] bits.
2. ReorganizingBytesRefHash: [1 discard, 15 PSL, 16 fingerprint, 32 ordinal] bits.

Since the number of fingerprint bits is higher in (1) (over 65K times more), it's not a 1:1 comparison between the two. With the current byte-packing scheme, performance is balanced between the two. But we can experiment with [1 discard, 7 PSL, 24 fingerprint, 32 ordinal] bits to see if this helps us speed up lookups even more.
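To make the packing concrete, here's a minimal sketch of how a slot laid out as [1 discard, 15 PSL, 16 fingerprint, 32 ordinal] could be encoded into and decoded from a single long. The class and method names are illustrative only, not the actual ones used in this PR.

```java
// Hypothetical packing of a 64-bit table slot as
// [1 discard | 15 PSL | 16 fingerprint | 32 ordinal], from the most to the least significant bit.
final class SlotPacking {
    private static final long ORDINAL_MASK = 0xFFFFFFFFL;  // lower 32 bits
    private static final long FINGERPRINT_MASK = 0xFFFFL;  // 16 bits
    private static final long PSL_MASK = 0x7FFFL;          // 15 bits

    static long pack(boolean discard, long psl, long fingerprint, long ordinal) {
        return ((discard ? 1L : 0L) << 63)
            | ((psl & PSL_MASK) << 48)
            | ((fingerprint & FINGERPRINT_MASK) << 32)
            | (ordinal & ORDINAL_MASK);
    }

    static long ordinal(long slot)     { return slot & ORDINAL_MASK; }
    static long fingerprint(long slot) { return (slot >>> 32) & FINGERPRINT_MASK; }
    static long psl(long slot)         { return (slot >>> 48) & PSL_MASK; }
    static boolean discard(long slot)  { return (slot >>> 63) != 0; }
}
```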
@backslasht Here's some follow-up:
I wrote an alternative benchmark that highlights the overhead of "new inserts" and the improvement it brings to "subsequent lookups". I ran it for large table sizes (10K to 100K) interleaved across 50 hash tables.
On average, pure inserts were 8.68% slower and pure lookups were 3.47% faster when re-organizing. Inserts were about 30% more expensive than lookups, so a key needs to be repeated at least 4 times for the lookup improvements to compensate for the insertion overhead.
Note: Take these numbers with a grain of salt; I faced a lot of variance while benchmarking these large tables.
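As a rough back-of-envelope check on how the "at least 4 times" figure follows from these numbers (assuming the averages above hold): if a lookup costs 1 unit, an insert costs about 1.3 units, so the re-organization overhead per insert is roughly 0.087 × 1.3 ≈ 0.11 units, while each subsequent lookup saves about 0.035 units; 0.11 / 0.035 ≈ 3.3, which rounds up to about 4 repeated lookups per key to break even.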
In hindsight, this overhead wasn't noticeable with ReorganizingLongHash as the performance was being compared to the baseline implementation. Instead, here our comparison is against CompactBytesRefHash, which itself has several enhancements on top of the baseline implementation (better locality, reduced branching, fingerprinting, etc.). Would be interesting to see how an equivalent CompactLongHash stacks up (and possibly replace LongHash).
Thanks @ketanv3 for the experiments.
I think we can go with CompactBytesRefHash for the reasons below.
Given the number of operations for each key is going to be 2 (1 insert and 1 lookup), the benefit CompactBytesRefHash provides during insertion is marginally better than the benefit ReorganizingBytesRefHash provides during the lookup. Also, implementation-wise, CompactBytesRefHash is relatively simpler when compared to ReorganizingBytesRefHash.
> Given the number of operations for each key is going to be 2 (1 insert and 1 lookup) ...
@backslasht
This isn't fully true; let me explain why. Several hits may produce the same key, so during aggregation, the BytesRefHash::add(key) method will simply return the ordinal of the key when it was first added (i.e. a lookup). In other words, the number of buckets should be much less than the number of hits (i.e. lookups >> inserts) for re-organization to be effective.
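To make the lookups-vs-inserts point concrete, here's a rough sketch of the bucketing loop during aggregation. The `OrdinalHash` interface and the assumption that `add(...)` returns the bucket ordinal for both new and repeated keys are simplifications for this example; the actual OpenSearch API distinguishes new keys from existing ones.

```java
import org.apache.lucene.util.BytesRef;

// Illustrative only: many hits map to few buckets, so after the first
// occurrence of a key every further add() is effectively a lookup.
final class BucketingSketch {
    interface OrdinalHash {          // stand-in for the hash table under discussion
        long add(BytesRef key);      // assumed to return the bucket ordinal in all cases
    }

    static long[] countPerBucket(Iterable<BytesRef> hitKeys, OrdinalHash hash, int maxBuckets) {
        long[] counts = new long[maxBuckets];   // sketch assumes ordinals stay below maxBuckets
        for (BytesRef key : hitKeys) {          // one iteration per hit
            long ordinal = hash.add(key);       // insert on first sight, lookup afterwards (simplified)
            counts[(int) ordinal]++;            // lookups >> inserts whenever keys repeat
        }
        return counts;
    }
}
```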
To summarise:
~This presents me with an idea: what if we re-organize only when the table grows? It would give us a permutation that sits between simple linear probing and Robin Hood hashing.~ On second thought, this won't help us either. Growing the table cuts the load factor in half, thus breaking down the clusters and reducing the PSL. A more complicated approach would be to measure "how many times was a key inserted too far away from its home slot" and then use that heuristic to trigger re-organization. But all this to gain only a little. Let's stick with CompactBytesRefHash! :)
Thanks for the detailed explanation @ketanv3
I agree, it boils down to the size of the table and the number of repeated keys where ReorganizingBytesRefHash performs marginally better. Looking at the additional complexity and the very marginal benefits ReorganizingBytesRefHash brings, it makes sense to stick to CompactBytesRefHash and revisit ReorganizingBytesRefHash in the future when a compelling use case arises.
@dblock Would like to know your thoughts too.
> @dblock Would like to know your thoughts too.
I am a bit out of my depth here - would love to see you and @backslasht agree on what should be a mergeable state of this PR first. If you need someone to break a tie, I can give it a try for sure!
@backslasht @dblock Ready for review!
@reta @backslasht
Here's a detailed performance analysis of portable, fast, non-cryptographic hash functions from well-known libraries: Zero-Allocation-Hashing, xxhash-java, and hash4j.
Winner: xxh3; assuming the internal usage of Unsafe is removed.
Winner: FarmHash and wyhash
Winner: Hash4J
This is a relatively new hash function featuring both high large-block hashing performance and a high hashing throughput for small inputs, mainly designed for hash-tables. It is close to the class of hash functions like wyhash, so identical performance is not surprising.
The raw performance results can be misleading, especially on small inputs. It may seem that the baseline is 2x faster than FarmHash for 1 byte keys, but it would only result in an overhead of 4 ms per 1 million keys hashed. Here's how the latency improvements scale with input size.
Latency improvement to hash 1 million keys wrt input size:
I found FarmHash, wyhash, and komihash to have suitable performance for use in a hash-table. They are simple in construction, instruction-cache friendly, portable, and fast on small inputs, and second best on large inputs. They are well-known hash functions that produce an avalanche effect, so I haven't tested their quality (i.e. collisions in the hash table).
> Here's a detailed performance analysis of portable, fast, non-cryptographic hash functions from well-known libraries: Zero-Allocation-Hashing, xxhash-java, and hash4j.
Thanks @ketanv3, very comprehensive and educational analysis. I am wondering if you still have the measured data to check the GC allocation activity (if you don't, not a big deal). Regarding hash4j/komihash, OpenHFT/FarmHash, and OpenHFT/wyhash, I think we would need to collect a bit more data for these three regarding GC allocations (OpenHFT is notoriously good at not generating too much garbage). What do you think?
Hi @reta, here's a comparison of "gc.alloc.rate.norm" at different input lengths. The baseline/lz4 hash functions have significantly higher allocations, but between OpenHFT and hash4j it's a tie. That said, the absolute values for all of them are low enough that GC is not a concern.
Key length (bytes) | 1 | 10 | 100 | 1000 | 10,000 | 100,000 | 1,000,000 |
---|---|---|---|---|---|---|---|
BASELINE_MURMUR3 | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.001 B/op | 0.008 B/op | 6.987 B/op |
LZ4_XXHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.001 B/op | 0.011 B/op | 9.350 B/op |
OPENHFT_UNSAFE_XXHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁶ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.001 B/op | 0.016 B/op |
OPENHFT_VH_XXHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.002 B/op | 0.021 B/op |
OPENHFT_UNSAFE_XXH3 | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁶ B/op | ≈ 10⁻⁴ B/op | ≈ 10⁻⁴ B/op | 0.004 B/op | 0.030 B/op |
OPENHFT_VH_XXH3 | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.001 B/op | 0.006 B/op | 0.052 B/op |
OPENHFT_UNSAFE_FARMHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁶ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.002 B/op | 0.021 B/op |
OPENHFT_VH_FARMHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.003 B/op | 0.025 B/op |
OPENHFT_UNSAFE_WYHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁶ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.003 B/op | 0.025 B/op |
OPENHFT_VH_WYHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | ≈ 10⁻³ B/op | 0.004 B/op | 0.036 B/op |
HASH4J_WYHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.002 B/op | 0.019 B/op |
OPENHFT_UNSAFE_METROHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁶ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.002 B/op | 0.016 B/op |
OPENHFT_VH_METROHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.002 B/op | 0.021 B/op |
HASH4J_KOMIHASH | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁵ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.003 B/op | 0.028 B/op |
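For context on where numbers like these typically come from: JMH's built-in GC profiler (`-prof gc`) reports `gc.alloc.rate.norm` per benchmark invocation. The benchmark below is a hypothetical sketch of such a setup, not the actual harness used for these measurements; the class, parameter values, and placeholder hash are all assumptions.

```java
import java.util.concurrent.ThreadLocalRandom;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

// Run with the GC profiler to get gc.alloc.rate.norm, e.g.:
//   java -jar benchmarks.jar HashAllocationBenchmark -prof gc
@State(Scope.Thread)
public class HashAllocationBenchmark {
    @Param({"1", "10", "100", "1000"})
    int keyLength;                      // bytes per key, swept by JMH

    byte[] key;

    @Setup
    public void setup() {
        key = new byte[keyLength];
        ThreadLocalRandom.current().nextBytes(key);
    }

    @Benchmark
    public long hash() {
        return someHash64(key);         // placeholder for the hash function under test
    }

    // Placeholder so the sketch compiles; swap in the real hash function.
    private static long someHash64(byte[] input) {
        long h = 0;
        for (byte b : input) {
            h = h * 31 + b;
        }
        return h;
    }
}
```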
Thanks @ketanv3 for the detailed benchmarking exercise, this is very insightful.
Overall, I see FarmHash, Wyhash, and Komihash having similar performance and heap allocation.
I'm leaning towards Komihash for reasons below:
> I'm leaning towards Komihash for reasons below:
Side with @backslasht here, thanks @ketanv3
@backslasht @reta I'm more inclined towards FarmHash for the following reasons:
There's another fast hash function out there: https://github.com/erthink/t1ha. Unfortunately, there's no Java port available. Seems like it too depends on fast unsigned multiplication of two longs (https://github.com/OpenHFT/Zero-Allocation-Hashing/issues/23).
FYI, I'm running benchmarks with the short-listed hash functions when used in a hash table. It should holistically answer for both quality and performance. Early results favour FarmHash, but I'll let it complete fully. :)
Hi @reta @backslasht, here's the performance within the hash table.
I ran the experiments with a subset of arbitrarily chosen key lengths with growing table sizes. As expected, since all hash functions have an avalanche effect, the impact of collisions was negligible and the performance was mainly driven by the performance of the hash function.
For keys less than 30 bytes, FarmHash has the best performance.
For keys between 30 and 100 bytes, FarmHash performs better on small tables but wyhash performs better on large tables.
For keys larger than 100 bytes, wyhash has the best performance followed by komihash. The difference was significant compared to FarmHash.
I'm impressed by the performance of wyhash/komihash on large inputs, and I expect their performance to get better on small inputs once we raise the minimum version to JDK 18+, where unsignedMultiplyHigh can be replaced with Math.unsignedMultiplyHigh, which has x86_64 and aarch64 intrinsics.
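For reference, a common fallback on pre-JDK-18 runtimes is to derive the unsigned high half from the signed Math.multiplyHigh (available since JDK 9). This is a generic sketch of that identity, not necessarily the exact helper used in this PR:

```java
final class UMulHigh {
    // Unsigned 128-bit multiply, keeping only the high 64 bits.
    // On JDK 18+ this is simply Math.unsignedMultiplyHigh(x, y), which is intrinsified.
    static long unsignedMultiplyHigh(long x, long y) {
        // Signed high half, plus corrections for operands whose top (sign) bit is set.
        return Math.multiplyHigh(x, y) + ((x >> 63) & y) + ((y >> 63) & x);
    }
}
```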
Thanks @ketanv3 for the experiment.
Based on the results, it is clear that FarmHash is out of the running. Between komihash and wyhash, I'm still leaning towards komihash, but I'm open to wyhash as well. I'll let you decide between them.
@reta @backslasht
Out of curiosity, I decided to write my own Java port of the t1ha hash function. The results are shockingly good!
It's faster all around—faster than FarmHash on small inputs and wyhash/komihash/xxh3/MetroHash on long inputs. Unlike FarmHash where performance drops in steps, here it's more gradual as the input length increases.
Here's a summary of "gc.alloc.rate.norm" at different input lengths. One of the best again.
Key length (bytes) | 1 | 10 | 100 | 1000 | 10,000 | 100,000 | 1,000,000 |
---|---|---|---|---|---|---|---|
T1HA | ≈ 10⁻⁶ B/op | ≈ 10⁻⁶ B/op | 10⁻⁶ B/op | ≈ 10⁻⁵ B/op | ≈ 10⁻⁴ B/op | 0.002 B/op | 0.015 B/op |
Performance at arbitrarily chosen key lengths with growing table sizes. t1ha performed best in all tests.
The above results seem too good to be true; so are there any trade-offs? Yes. To overcome language and performance limitations, my implementation differs slightly from the reference implementation in C++, so the returned values will vary.
But does it matter? Probably not. For our use-case, all we need is a fast hash function that achieves avalanche. Here's how my implementation compares to other well-known hash functions:
For context, smhasher considers a hash function to pass the avalanche test if the bias is less than 1% (here). It can be seen how poorly the inbuilt Arrays.hashCode fares.

Hash function | Sum of squared errors | Bias | Diffusion
---|---|---|---
XXH3 | 0.64 | 0.0065% | 99.9935%
FarmHash | 0.87 | 0.0052% | 99.9948%
wyhash | 0.64 | 0.0015% | 99.9985%
komihash | 0.65 | 0.0028% | 99.9972%
t1ha | 0.64 | 0.0056% | 99.9944%
Arrays.hashCode | 2852.51 | 16.2308% | 83.7692%
Here's a comparison between Arrays.hashCode and t1ha's avalanche diagram. It shows the probability of an output bit flipping (y-axis) when a single input bit (x-axis) is flipped. Ideally, we want to be close to 0.5 to achieve avalanche.
I repeated the experiment with 24, 32, 40, 48, 56, 64, 72, 80, 96, 112, 128, 160, 512, and 1024-bit keys (as defined in smhasher) and achieved avalanche every time. Let's roll this out?
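For anyone wanting to reproduce such an avalanche measurement, here's a minimal sketch that estimates, for each input bit, the probability of each output bit flipping; it assumes a generic 64-bit hash passed in as a ToLongFunction, so the shape of the actual tests in this PR may differ.

```java
import java.util.SplittableRandom;
import java.util.function.ToLongFunction;

// Estimates P(output bit j flips | input bit i flips) for a 64-bit hash function.
// Ideal avalanche behaviour keeps every probability close to 0.5.
final class AvalancheSketch {
    static double[][] flipProbabilities(ToLongFunction<byte[]> hash64, int keyBytes, int trials) {
        SplittableRandom random = new SplittableRandom(42);
        int inputBits = keyBytes * 8;
        double[][] p = new double[inputBits][64];
        for (int t = 0; t < trials; t++) {
            byte[] key = new byte[keyBytes];
            random.nextBytes(key);
            long base = hash64.applyAsLong(key);
            for (int i = 0; i < inputBits; i++) {
                key[i / 8] ^= (byte) (1 << (i % 8));       // flip one input bit
                long diff = base ^ hash64.applyAsLong(key);
                key[i / 8] ^= (byte) (1 << (i % 8));       // restore it
                for (int j = 0; j < 64; j++) {
                    p[i][j] += (diff >>> j) & 1L;          // count output-bit flips
                }
            }
        }
        for (double[] row : p) {
            for (int j = 0; j < 64; j++) {
                row[j] /= trials;                          // convert counts to probabilities
            }
        }
        return p;
    }
}
```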
@ketanv3 - The results look great! Any considerations on the correctness and maintainability of the Java port?
> Any considerations on the correctness and maintainability of the java port?
I don't expect the returned values to match the reference implementation in C++ due to slight differences in my implementation to overcome the language limitations. We'll be explicit about these caveats in the implementation so it's not mistaken for the original implementation.
We'll have tests to verify the quality of the returned values, using the sum of squared errors and bias metrics. It's worth pointing out that, in a hash table, improvements to a hash function's quality have diminishing returns. A 64-bit hash value will be modded with the table size (which is usually pretty small), and the birthday paradox ensures that collisions happen no matter how good the hash function is, as long as it's not completely flawed.
Update: I have verified the correctness using the tests and expected results defined in the reference implementation.
> I don't expect the returned values to match the reference implementation in C++ due to slight differences in my implementation to overcome the language limitations. We'll be explicit about these caveats in the implementation so it's not mistaken for the original implementation.
I feel uneasy about this statement: I'm definitely not an expert on hash functions, but the implementors spend a lot of time developing those, so I would feel much more comfortable if our implementation matched the reference (even when the charts look good).
> I'm impressed by the performance of wyhash/komihash on large inputs, and I expect their performance to get better on small inputs once we enforce the minimum version to JDK 18+ where unsignedMultiplyHigh can be replaced with Math.unsignedMultiplyHigh which has https://github.com/openjdk/jdk/pull/5933 and https://github.com/openjdk/jdk/pull/8840 intrinsics.
Does your t1ha implementation rely on that as well? What JDK version did we use to run these tests?
@reta I understand your concern; I too would have preferred to match the reference implementation if our use-case was fingerprinting, where we would have used all 64 bits of the hash value. But in a hash table, where the hash value is modded with the table size, we lose a lot of entropy anyway and collisions are unavoidable (for dynamic workloads). Creating a perfect hash function is impractical, and resolving collisions in an open-addressed table is often cheaper.
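To put rough numbers on that (a back-of-envelope estimate assuming an ideal hash and, say, a table of 2^20 ≈ 1M slots): the probability that n keys produce at least one slot collision is about 1 − exp(−n(n−1)/(2·2^20)), which already exceeds 50% at only n ≈ 1200 keys. So cheap collision resolution matters more than squeezing out the last bit of hash quality.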
Here's how it's explained by experts in this field:
The only requirement is that the hash function isn't fundamentally flawed, which I have verified using the avalanche tests. I have also included them as part of the unit tests.
Yes, t1ha also requires unsignedMultiplyHigh and that's where my implementation differs. I'll push my code, that'll make things more clear. The results I've shared on this PR are using OpenJDK 17.
Here's the benchmark when matching the output of the reference implementation. It's slightly slower but still faster than all other hash functions, albeit slower than FarmHash on some small inputs. Collision-wise, it didn't make much of a difference, so this reference-matching variant was simply slower when used in a hash table.
> Yes, t1ha also requires unsignedMultiplyHigh and that's where my implementation differs. I'll push my code, that'll make things more clear. The results I've shared on this PR are using OpenJDK 17.
:+1: Thanks @ketanv3
Compatibility status:
> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]
Compatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git]
BUILD SUCCESSFUL in 31m 1s
Compatibility status:
> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/reporting.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
BUILD SUCCESSFUL in 32m 52s
1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testDropPrimaryDuringReplication
Checks if related components are compatible with change 4871e25
Incompatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
Checks if related components are compatible with change 4871e25
Incompatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
Gradle Check (Jenkins) Run Completed with:
- RESULT: FAILURE ❌
- URL: https://build.ci.opensearch.org/job/gradle-check/22871/
- CommitID: bc4fdd2

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Unrelated flaky test: https://github.com/opensearch-project/OpenSearch/issues/9115
Checks if related components are compatible with change 4871e25
Incompatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
Checks if related components are compatible with change 8d153ab
Incompatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
Previous check failed as #9389 was just merged to the main branch with updated version checks. Such commits break BWC for any other PRs that don't have the latest changes from the main branch. Rebasing and retrying.
1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testIndexDeletionDuringSnapshotCreationInQueue
Checks if related components are compatible with change 8d153ab
Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/asynchronous-search.git]
Compatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git]
Checks if related components are compatible with change 8255050
Incompatible components: [https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]
Looks good to me. Thanks @ketanv3 for all the experiments and the PR.
Second that. I think I understood the idea behind BytesRefHash (it is too clever :D), but it looks pretty good; the numbers in particular are convincing. Thanks @ketanv3
Description
Performance improvements for BytesRefHash, which include:
Related Issues
Meta issue: #8710
JMH Benchmarks
These results indicate the time to perform 1 million add(...) operations with a varying number of unique keys. These operations were repeated and interleaved between 20 hash tables in order to make caches less effective. The following implementations are being compared:

Peak improvement
Whether or not to minimize the longest probe sequence length
The following implementations are being compared:
(A) is simpler and marginally faster for insert-heavy workloads. It also packs 32 bits of fingerprint information, which is 65,536 times that of (B), but false positives are already rare with (B). On the other hand, (B) provides marginally faster worst-case lookups due to better use of CPU cache lines.
Benchmarks show no appreciable performance difference between (A) and (B) since the latency is dominated by the time to copy the key.
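As a back-of-envelope check on the fingerprint trade-off (assuming roughly uniform fingerprints): with 16 fingerprint bits, a different key that lands on the same slot still matches the stored fingerprint with probability about 2^-16 ≈ 0.0015%, costing one extra full key comparison; with 32 bits that drops to about 2^-32. Either way, false positives are rare enough that the wider fingerprint buys little in practice.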
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.