Closed peroxyacyl closed 2 years ago
Another benchmark on Intel Xeon Platinum 8151 CPU @ 3.40GHz
master vs seed=0
name old time/op new time/op delta
Fixed/0-2 3.19ns ± 0% 5.74ns ± 1% +80.11% (p=0.000 n=99+98)
Fixed/1-2 3.64ns ± 0% 5.58ns ± 2% +53.60% (p=0.000 n=100+99)
Fixed/2-2 3.43ns ± 0% 5.87ns ± 0% +70.99% (p=0.000 n=100+100)
Fixed/3-2 3.27ns ± 0% 5.45ns ± 0% +66.74% (p=0.000 n=100+98)
Fixed/4-2 3.88ns ± 0% 5.61ns ± 1% +44.62% (p=0.000 n=100+100)
Fixed/8-2 3.88ns ± 0% 5.61ns ± 1% +44.63% (p=0.000 n=99+100)
Fixed/9-2 3.74ns ± 0% 5.06ns ± 1% +35.25% (p=0.000 n=99+100)
Fixed/16-2 3.74ns ± 0% 5.06ns ± 1% +35.24% (p=0.000 n=97+100)
Fixed/17-2 6.27ns ± 0% 7.91ns ± 1% +26.18% (p=0.000 n=100+91)
Fixed/32-2 6.27ns ± 0% 7.91ns ± 0% +26.13% (p=0.000 n=100+84)
Fixed/33-2 9.28ns ± 0% 11.07ns ± 2% +19.23% (p=0.000 n=100+100)
Fixed/64-2 9.28ns ± 0% 11.08ns ± 1% +19.32% (p=0.000 n=100+100)
Fixed/65-2 11.1ns ± 1% 13.6ns ± 1% +22.16% (p=0.000 n=101+100)
Fixed/96-2 11.1ns ± 1% 13.6ns ± 2% +22.42% (p=0.000 n=101+100)
Fixed/97-2 13.2ns ± 0% 16.3ns ± 3% +23.24% (p=0.000 n=99+100)
Fixed/128-2 13.2ns ± 0% 16.3ns ± 4% +23.61% (p=0.000 n=98+100)
Fixed/129-2 15.3ns ± 0% 16.6ns ± 0% +8.50% (p=0.000 n=100+100)
Fixed/240-2 28.8ns ± 0% 32.3ns ± 0% +12.15% (p=0.000 n=100+100)
Fixed/241-AVX2-2 19.5ns ± 0% 21.5ns ± 0% +10.49% (p=0.000 n=100+100)
Fixed/241-SSE2-2 22.2ns ± 0% 24.4ns ± 1% +10.12% (p=0.000 n=94+100)
Fixed/241-2 44.6ns ± 1% 44.6ns ± 0% -0.13% (p=0.002 n=99+99)
Fixed/512-AVX2-2 24.0ns ± 0% 26.1ns ± 0% +8.75% (p=0.000 n=100+100)
Fixed/512-SSE2-2 31.3ns ± 0% 34.3ns ± 1% +9.58% (p=0.000 n=100+100)
Fixed/512-2 71.6ns ± 0% 75.3ns ± 0% +5.13% (p=0.000 n=99+93)
Fixed/1024-AVX2-2 35.1ns ± 0% 37.7ns ± 0% +7.28% (p=0.000 n=92+99)
Fixed/1024-SSE2-2 50.2ns ± 0% 53.3ns ± 1% +6.15% (p=0.000 n=100+99)
Fixed/1024-2 133ns ± 0% 137ns ± 0% +3.01% (p=0.000 n=100+100)
Fixed/8192-AVX2-2 190ns ± 0% 192ns ± 0% +1.29% (p=0.000 n=99+99)
Fixed/8192-SSE2-2 337ns ± 0% 339ns ± 0% +0.59% (p=0.000 n=100+100)
Fixed/8192-2 1.04µs ± 0% 1.28µs ± 0% +23.20% (p=0.000 n=87+75)
Fixed/102400-AVX2-2 2.33µs ± 0% 2.33µs ± 0% +0.16% (p=0.000 n=100+100)
Fixed/102400-SSE2-2 4.30µs ± 0% 4.30µs ± 0% +0.08% (p=0.000 n=99+100)
Fixed/102400-2 12.9µs ± 0% 16.2µs ± 0% +25.83% (p=0.000 n=96+100)
master vs seed=42
name old time/op new time/op delta
Fixed/0-2 3.19ns ± 0% 5.26ns ± 0% +64.96% (p=0.000 n=99+97)
Fixed/1-2 3.64ns ± 0% 5.14ns ± 6% +41.52% (p=0.000 n=100+100)
Fixed/2-2 3.43ns ± 0% 4.84ns ± 0% +40.82% (p=0.000 n=100+98)
Fixed/3-2 3.27ns ± 0% 4.78ns ± 2% +46.20% (p=0.000 n=100+100)
Fixed/4-2 3.88ns ± 0% 5.52ns ± 1% +42.15% (p=0.000 n=100+100)
Fixed/8-2 3.88ns ± 0% 5.52ns ± 1% +42.14% (p=0.000 n=99+100)
Fixed/9-2 3.74ns ± 0% 5.06ns ± 1% +35.17% (p=0.000 n=99+100)
Fixed/16-2 3.74ns ± 0% 5.06ns ± 1% +35.13% (p=0.000 n=97+100)
Fixed/17-2 6.27ns ± 0% 7.48ns ± 0% +19.38% (p=0.000 n=100+95)
Fixed/32-2 6.27ns ± 0% 7.48ns ± 0% +19.36% (p=0.000 n=100+95)
Fixed/33-2 9.28ns ± 0% 10.30ns ± 0% +10.94% (p=0.000 n=100+83)
Fixed/64-2 9.28ns ± 0% 10.33ns ± 1% +11.29% (p=0.000 n=100+100)
Fixed/65-2 11.1ns ± 1% 12.6ns ± 1% +13.42% (p=0.000 n=101+100)
Fixed/96-2 11.1ns ± 1% 12.6ns ± 0% +13.11% (p=0.000 n=101+85)
Fixed/97-2 13.2ns ± 0% 14.6ns ± 1% +10.98% (p=0.000 n=99+100)
Fixed/128-2 13.2ns ± 0% 14.7ns ± 0% +11.02% (p=0.000 n=98+100)
Fixed/129-2 15.3ns ± 0% 16.8ns ± 0% +9.80% (p=0.000 n=100+100)
Fixed/240-2 28.8ns ± 0% 31.8ns ± 0% +10.27% (p=0.000 n=100+100)
Fixed/241-AVX2-2 19.5ns ± 0% 45.0ns ± 0% +130.77% (p=0.000 n=100+98)
Fixed/241-SSE2-2 22.2ns ± 0% 47.7ns ± 0% +114.86% (p=0.000 n=94+78)
Fixed/241-2 44.6ns ± 1% 68.4ns ± 0% +53.16% (p=0.000 n=99+99)
Fixed/512-AVX2-2 24.0ns ± 0% 49.7ns ± 0% +107.27% (p=0.000 n=100+100)
Fixed/512-SSE2-2 31.3ns ± 0% 59.7ns ± 0% +90.60% (p=0.000 n=100+99)
Fixed/512-2 71.6ns ± 0% 97.7ns ± 0% +36.41% (p=0.000 n=99+85)
Fixed/1024-AVX2-2 35.1ns ± 0% 61.4ns ± 0% +74.93% (p=0.000 n=92+89)
Fixed/1024-SSE2-2 50.2ns ± 0% 83.6ns ± 0% +66.44% (p=0.000 n=100+100)
Fixed/1024-2 133ns ± 0% 159ns ± 0% +19.55% (p=0.000 n=100+100)
Fixed/8192-AVX2-2 190ns ± 0% 216ns ± 0% +13.68% (p=0.000 n=99+100)
Fixed/8192-SSE2-2 337ns ± 0% 384ns ± 0% +13.95% (p=0.000 n=100+99)
Fixed/8192-2 1.04µs ± 0% 1.30µs ± 0% +25.22% (p=0.000 n=87+79)
Fixed/102400-AVX2-2 2.33µs ± 0% 2.37µs ± 0% +1.89% (p=0.000 n=100+100)
Fixed/102400-SSE2-2 4.30µs ± 0% 4.37µs ± 0% +1.79% (p=0.000 n=99+100)
Fixed/102400-2 12.9µs ± 0% 16.2µs ± 0% +26.00% (p=0.000 n=96+100)
@peroxyacyl I've redone the PR in #12 - duplicated the code and added it to the Hasher.
~There is no 128-bit hash support yet. Do you know where I can find the information to do `hashSmall128` and `hashMed128`?~
Seed support has been added and is available in v1.0.0.
As Cyan mentioned in his blog, xxh3 was initially developed for Bloom filters, which need a different seed for each hash. I am now building a Bloom filter in Go for my project, so I wrote this patch.
However, the performance gets slower due to:
I'm opening this PR mainly to say thanks for your work. Another option would be to host separate code paths, one for high performance and one for seed support.
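For context on why a per-call seed matters for the Bloom filter use case mentioned above, here is a minimal sketch. It is not the code from this patch: `seededHash` is a hypothetical stand-in built on the standard library's FNV-1a so the example stays self-contained, and `Bloom`, `NewBloom`, `Add`, and `Has` are illustrative names. In a real filter the stand-in would presumably be replaced by a seeded xxh3 call.

```go
package main

import (
	"encoding/binary"
	"hash/fnv"
)

// seededHash is a hypothetical stand-in for a seeded 64-bit hash.
// It feeds the seed into FNV-1a ahead of the data; it is NOT xxh3.
func seededHash(data []byte, seed uint64) uint64 {
	var s [8]byte
	binary.LittleEndian.PutUint64(s[:], seed)
	h := fnv.New64a()
	h.Write(s[:])
	h.Write(data)
	return h.Sum64()
}

// Bloom is a minimal Bloom filter with m bits and k probes per key.
// Each probe uses the same hash function with a different seed.
type Bloom struct {
	bits []uint64
	m    uint64
	k    uint64
}

// NewBloom allocates a filter; m and k must both be greater than zero.
func NewBloom(m, k uint64) *Bloom {
	return &Bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// Add sets one bit per seed in 0..k-1.
func (b *Bloom) Add(data []byte) {
	for seed := uint64(0); seed < b.k; seed++ {
		i := seededHash(data, seed) % b.m
		b.bits[i/64] |= 1 << (i % 64)
	}
}

// Has reports whether every probed bit is set. A false result is
// definitive; a true result may be a false positive.
func (b *Bloom) Has(data []byte) bool {
	for seed := uint64(0); seed < b.k; seed++ {
		i := seededHash(data, seed) % b.m
		if b.bits[i/64]&(1<<(i%64)) == 0 {
			return false
		}
	}
	return true
}
```

A caller would create a filter with something like `NewBloom(1024, 3)`, call `Add` for each key, and then query with `Has`. Because every lookup hashes the key `k` times with different seeds, the cost of the seeded path shows up directly in filter throughput, which is why the seed overhead in the benchmarks matters for this workload.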
Here are benchmarks on my Intel Core i7-4790K CPU @ 4.00GHz.
master vs seed = 0
master vs seed = 42