p5pRT opened 9 years ago
This is a bug report for perl from rurban@cpanel.net, generated with the help of perlbug 1.40 running under perl 5.21.9.
perl_hash_crc32: use the HW hasher as the default with SSE4.2.
It can be enabled with the following ccflags:
-march=native or -mcrc32 or -msse4.2
There is also untested AARCH64_FL_CRC support:
https://gcc.gnu.org/ml/gcc-patches/2014-01/msg01012.html
See https://github.com/rurban/perl-hash-stats#perl-hash-stats for the
analysis. iSCSI CRC32-C is by far the fastest hash function tested, and
it produces on average the fewest collisions, so it is also the best for the
average case.
```
                     collisions   cycles/hash
CRC32                     1.066         29.78
ONE_AT_A_TIME_HARD        1.092         83.75
```
See https://github.com/rurban/smhasher#smhasher for a more detailed
analysis of the hash function qualities. 'doc/crc32_hw' is far better
than the current 'doc/JenkinsOOAT'.
Worst case:
You cannot trust any of these fast hash functions for security at all.
Collision complexity attacks need to be guarded against in other ways. Without
our randomized hash seeds and iterators it is trivial to generate such attacks
against all of these hash functions, and without those measures CRC32 is even
more trivial to attack than siphash, city, jenkins or murmur.
Eventually we will need to switch to a better hash table scheme, one which will
be faster (more cache-friendly), have better key checks, and avoid
complexity attacks against our linear linked lists altogether. Sorted buckets,
double hashing, universal hashing, or adding a sleep after >10 collisions are
all better schemes that I am working on.
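For reference, below is a minimal sketch, not the actual patch, of the kind of string hash the SSE4.2 CRC32-C instruction enables. It assumes a compiler invoked with -msse4.2, -mcrc32 or -march=native so that the <nmmintrin.h> intrinsics are available; the seed/length mixing shown is illustrative only.

```c
#include <nmmintrin.h> /* _mm_crc32_u64 / _mm_crc32_u8 (SSE4.2 only) */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t
crc32c_hash(uint32_t seed, const unsigned char *str, size_t len)
{
    uint32_t crc = seed ^ (uint32_t)len; /* illustrative seed/len mixing */

    /* The CRC32-C instruction consumes 8 bytes at a time. */
    while (len >= 8) {
        uint64_t block;
        memcpy(&block, str, 8); /* safe unaligned load */
        crc = (uint32_t)_mm_crc32_u64(crc, block);
        str += 8;
        len -= 8;
    }
    /* Tail bytes one at a time. */
    while (len--)
        crc = _mm_crc32_u8(crc, *str++);
    return crc;
}
```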
To whatever committer reviews this: there are also a couple more hash algos, including a hardware AES hash, on an abandoned/stalled branch http://perl5.git.perl.org/perl.git/shortlog/refs/heads/yves/hash_wip_lastet, specifically commit http://perl5.git.perl.org/perl.git/commitdiff/694bbdcff696678ca9df69c15dea62a6e6e0df2a. Note that rurban's patch is a hardware CRC hash. Hardware CRC and hardware AES did not come out together in Intel CPUs, so they won't share the exact same CPP macros controlling their usage. Some thought might need to be given (maybe in Configure, IDK) to how to support hardware-specific code in the Perl core, and to whether both of these patches are using the "right" macros to control their availability. I am not expecting rurban to write such an infrastructure, but if someone else (maybe a committer) wants to, they should.
-- bulk88 ~ bulk88 at hotmail.com
The RT System itself - Status changed from 'new' to 'open'
On second thought, I won't feel comfortable having this enabled as the default now. Just make it available as a selectable option.
I'm rewriting the hash tables soon, and will not need HW CRC32-C. I'd prefer glibc double hashing (hsearch) with their short hashers, or even universal hashing for maximum security (i.e. 2 random hash functions).
And I haven't worked on security for the new hash table using CRC32-C, which is known to be easily attacked.
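For context, here is a minimal sketch of the double-hashing probe scheme mentioned above; glibc's hsearch works along these lines, but the names and details here are illustrative, not its actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Double hashing: a second hash value determines the probe stride, so two
 * keys that collide on the first hash still walk different probe sequences. */
static size_t
probe_slot(uint32_t h1, uint32_t h2, size_t attempt, size_t tabsize)
{
    /* tabsize is assumed prime; a stride in 1..tabsize-2 then visits
     * every slot before the sequence repeats. */
    size_t stride = (size_t)(h2 % (tabsize - 2)) + 1;
    return ((size_t)h1 + attempt * stride) % tabsize;
}
```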
On 16 March 2015 at 23:38, bulk88 via RT <perlbug-followup@perl.org> wrote:

> but there might need to be some thought given (maybe in Configure, IDK), on how to support hardware specific code in the Perl core

+1. Hardware-specific codepaths indeed need to be configurable and not 100% automagical, mostly for cross-development scenarios.
If it can be expressed in terms of a specific CPU-flag-oriented toggle, not a feature-specific toggle, then that would be desirable.
--with-cpu-avx, for instance.
-- Kent
*KENTNL* - https://metacpan.org/author/KENTNL
On 03/17/2015 06:04 PM, Kent Fredric via RT wrote:

> On 16 March 2015 at 23:38, bulk88 via RT <perlbug-followup@perl.org> wrote:
> > but there might need to be some thought given (maybe in Configure, IDK), on how to support hardware specific code in the Perl core
> +1. Hardware-specific codepaths indeed need to be configurable and not 100% automagical, mostly for cross-development scenarios.
> If it can be expressed in terms of a specific CPU-flag-oriented toggle, not a feature-specific toggle, then that would be desirable.
> --with-cpu-avx, for instance.
IMHO not needed. The current hardware-specific toggle, checking for __SSE4_2__ at compile time, would be the same as a hardware-specific check in Configure.
The only theoretical problem is with packagers who compile i386 with -mcrc32 or -msse4.2: when the target machine is a real i386 and not an i586, it will fail. But this is entirely a packaging bug we cannot prevent.
For cross-compilation it is current practice that the target config.sh is properly filled out.
A run-time check for SSE4.2/AArch64 CPU features is only needed in a JIT.
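A minimal sketch of the compile-time gating described here, reusing the crc32c_hash sketch earlier; the macro and fallback names are illustrative, not the actual hv_func.h definitions.

```c
/* Pick the hardware CRC32-C hash only when the compiler was invoked with
 * flags that define __SSE4_2__ (-msse4.2, -mcrc32, -march=native, ...);
 * otherwise fall back to a portable software hash. */
#if defined(__SSE4_2__)
#  define MY_PERL_HASH(hash, seed, str, len) \
       ((hash) = crc32c_hash((seed), (str), (len)))
#else
#  define MY_PERL_HASH(hash, seed, str, len) \
       ((hash) = one_at_a_time_hash((seed), (str), (len)))
#endif
```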
On Sat Mar 14 03:50:13 2015, rurban@cpanel.net wrote:

> perl_hash_crc32: use the HW hasher as the default with SSE4.2.
> It can be enabled with the following ccflags: -march=native or -mcrc32 or -msse4.2
I think these assertions:
```diff
@@ -504,6 +524,7 @@
 PERL_STATIC_INLINE U32
 S_perl_hash_one_at_a_time(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
     const unsigned char * const end = (const unsigned char *)str + len;
     U32 hash = *((U32*)seed) + (U32)len;
+    assert(hash);
     while (str < end) {
         hash += *str++;
         hash += (hash << 10);
```
are just plain wrong.
What happens when (U32)(*seed + len) just happens to be zero?
I don't think crc32 has enough hash seed length to be the default hashing algorithm.
Tony
On 03/30/2015 02:37 AM, Tony Cook via RT wrote:
> I think these assertions:
> [the assert(hash) hunk quoted above]
> are just plain wrong.
> What happens when (U32)(*seed + len) just happens to be zero?
Yes, this was only there for debugging, to catch the common zero-collapse problem. It should be #ifdef DEBUGGING.

> I don't think crc32 has enough hash seed length to be the default hashing algorithm.

Hash seed length? This is a global for all hash functions. What do you mean?
Having calculated the statistical average case, I think your sentiment is wrong. It is not only the fastest, it is also statistically the best.
You can argue about security via known attacks on the CRC scheme. Talking about security via the hash function alone, only the slower siphash makes sense.
On Fri Apr 03 02:50:43 2015, rurban@cpanel.net wrote:
> Yes, this was only there for debugging, to catch the common zero-collapse problem. It should be #ifdef DEBUGGING.

No, the problem is that (U32)(seed+len) will be zero in 1 in 2**32 cases[1] (consider a seed of 0xFFFFFFF0 and len = 0x10), so the assert() shouldn't be there at all.
assert() is only tested under -DDEBUGGING by default anyway.
> > I don't think crc32 has enough hash seed length to be the default hashing algorithm.
> Hash seed length? This is a global for all hash functions. What do you mean?
> Having calculated the statistical average case, I think your sentiment is wrong. It is not only the fastest, it is also statistically the best.
> You can argue about security via known attacks on the CRC scheme. Talking about security via the hash function alone, only the slower siphash makes sense.

The problem is that, even with hash order randomization, I expect a 32-bit seed to be fairly cheap to brute-force[2].
I'm not claiming that ONE_AT_A_TIME_HARD is cryptographically strong, but the 64-bit seed does provide some protection against more casual attackers.
I'd have no problem accepting a patch that added CRC32 hashes as an option, which would make it simpler for you to build your own perls with that hash.
I won't accept a patch that makes CRC32 the default hash algorithm.
Tony
[1] complicated by the distribution of len values, which tend to be short, but the seed should be fairly evenly distributed.
[2] the original 32-bit seeds were easily brute-forced on an Intel Q9660.
On Sun Aug 02 17:56:25 2015, tonyc wrote:
> No, the problem is that (U32)(seed+len) will be zero in 1 in 2**32 cases[1] (consider a seed of 0xFFFFFFF0 and len = 0x10), so the assert() shouldn't be there at all.

I preferred zero hashes to an assert, as they are invariant against \0 attacks. As you found out, this assert only fires during debugging.
> The problem is that, even with hash order randomization, I expect a 32-bit seed to be fairly cheap to brute-force[2].

I wrote a brute-forcer, and the faster the hash, the faster the brute-forcer. A 32-bit seed is done in under 4 seconds.

> I'm not claiming that ONE_AT_A_TIME_HARD is cryptographically strong, but the 64-bit seed does provide some protection against more casual attackers.

I see. 64-bit seeds are trivial to add to all hash funcs.

> I'd have no problem accepting a patch that added CRC32 hashes as an option, which would make it simpler for you to build your own perls with that hash.

I made it much simpler for me to use different hashes with a Configure -Dhash_func=name option, so this argument is bogus.
But I have switched to FNV1A now, which makes the fastest HT and doesn't have the security problems of CRC. It is explained at https://github.com/rurban/smhasher and https://github.com/rurban/perl-hash-stats. This patch was written a year ago and has long since been outdated.
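For reference, a minimal sketch of 32-bit FNV-1a, the function referred to above; the constants are the standard FNV offset basis and prime, and the XOR seeding is an illustrative assumption, not necessarily how rurban's branch wires in the seed.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t
fnv1a_32(uint32_t seed, const unsigned char *str, size_t len)
{
    uint32_t hash = 2166136261u ^ seed; /* FNV offset basis, mixed with the seed */
    while (len--) {
        hash ^= *str++;    /* fold in one byte */
        hash *= 16777619u; /* FNV prime */
    }
    return hash;
}
```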
> I won't accept a patch that makes CRC32 the default hash algorithm.

Because? It creates the best distribution? It is the fastest? You need to work on your argumentation a bit. You are still using one of the worst default hash functions of all tested. Slow *and* bad. Only DJB2 and OOAT are worse, but 8 others are better.

> [1] complicated by the distribution of len values, which tend to be short, but the seed should be fairly evenly distributed.
> [2] the original 32-bit seeds were easily brute-forced on an Intel Q9660.

Exactly. That's why the whole security theatre is bogus.
--
Reini Urban
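To make the brute-force point concrete, below is a sketch, under stated assumptions, of the kind of seed search being discussed: it presumes one key whose 32-bit hash value the attacker has somehow learned, and hash_fn stands in for whichever 32-bit-seeded hash is under attack (both names are illustrative).

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*hash_fn_t)(uint32_t seed, const unsigned char *key, size_t len);

/* Enumerate all 2**32 seeds until one reproduces the observed hash.
 * At roughly 30 cycles per hash that is on the order of a minute on a
 * single modern core, and it parallelizes trivially. */
static int
recover_seed(hash_fn_t hash_fn, const unsigned char *key, size_t len,
             uint32_t observed, uint32_t *seed_out)
{
    uint32_t seed = 0;
    do {
        if (hash_fn(seed, key, len) == observed) {
            *seed_out = seed;
            return 1;
        }
    } while (++seed != 0); /* wraps to 0 after covering the full space */
    return 0;
}
```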
On Tue Aug 25 07:41:42 2015, rurban wrote:
> I preferred zero hashes to an assert, as they are invariant against \0 attacks. As you found out, this assert only fires during debugging.

Asserts should only trigger when the code is in error. Since a zero (seed+len) can occur during normal operation, the asserts are inappropriate.
> Because? It creates the best distribution? It is the fastest? You need to work on your argumentation a bit. You are still using one of the worst default hash functions of all tested. Slow *and* bad. Only DJB2 and OOAT are worse, but 8 others are better.

Because it has only a 32-bit seed space, which makes it trivial to brute-force, as you confirmed above.
> Exactly. That's why the whole security theatre is bogus.
We use a 64-bit seed to make such brute-forcing harder (not impossible).
Tony
On 26 August 2015 at 01:58, Tony Cook via RT <perlbug-followup@perl.org> wrote:
> > Exactly. That's why the whole security theatre is bogus.
> We use a 64-bit seed to make such brute-forcing harder (not impossible).

It is not just that. The hash Reini was advocating has a trivial multi-collision attack. I have implemented a test for the multi-collision attack in my version of smhasher. It proves that you can break pretty much any hash based on the Intel CRC intrinsics without knowing how they work or what seed they are using. CRC is an incredibly bad base unit of processing for hashes; the exact properties that make it so desirable as a CRC make it incredibly undesirable as a hash. Using it via the Intel intrinsic _mm_crc32_u64() is especially irresponsible. That operation takes a 64-bit value and calculates a 32-bit CRC from it, which guarantees 2**32 possible collisions per block. You only need to look at about 160K random blocks before you find a collision. Writing the code to CRC 160K random blocks, sorting to find the dupes, etc., takes a developer like me about 10 minutes. Once you have that CRC block collision, you take the two blocks and treat them as the bits 0 and 1, then expand out an integer, bit for block, and presto: you have the ability to create an arbitrary set of collisions that will break any naive use of CRC, which is what most uses will be if people are using it for speed.
So the same set of blocks breaks all of the following functions:
```
crc32_hw          32 32  32  FAILED 45 of 129 tests.  Differential: 17-18,23,29,31-33,39, Cyclic: 46,48,50, TwoBytes: 52-54,61, CRC-MultiCollision: 79-88, Lowbits: 90-91, Highbits: 93-94, Highbits2: 96-97, HighBit: 99-100, LowBit: 102-103, Hi-Lo: 105-106, Text: 108,110-113, Seed: 119-125
crc32_hw1         32 32  32  FAILED 68 of 173 tests.  Differential: 17-41, Cyclic: 48,50,52, TwoBytes: 54-56,63, Crc-MultiCollision: 81-90, Murmur2-MultiCollision: 94-95,97-100, Lowbits: 132-133, Highbits: 135-136, Highbits2: 138-139, HiBit-Null: 141-142, LowBit-Null: 144-145, Hi-Lo: 147-148, Text: 150-152,154-155, Seed: 161-162,167
crc64_hw          64 64  64  FAILED 105 of 131 tests. Differential: 17-41, Cyclic: 42-52, TwoBytes: 53-63, Sparse: 64-80, CRC-MultiCollision: 81-90, Lowbits: 92-93, Highbits: 95-96, Highbits2: 98-99, HighBit: 101-102, LowBit: 104-105, Hi-Lo: 106-108, Text: 110-112,114-115, Zeroes: 117-118, Seed: 119-127, Effs: 129-130
metrohash128crc_1 64 64 128  FAILED 20 of 129 tests.  Differential: 17-24,39, CRC-MultiCollision: 79-88
metrohash128crc_2 64 64 128  FAILED 21 of 133 tests.  Differential: 17-24,38,41, CRC-MultiCollision: 81-90
metrohash64crc_1  64 64  64  FAILED 17 of 131 tests.  Differential: 17,19,21,41, Cyclic: 42,52, CRC-MultiCollision: 81-90
metrohash64crc_2  64 64  64  FAILED 17 of 131 tests.  Differential: 17,19,21,41, Cyclic: 42,52, CRC-MultiCollision: 81-90
t1ha_crc          64 64  64  FAILED 36 of 133 tests.  Differential: 17-41, CRC-MultiCollision: 81-90
```
The only hash functions that use CRC that do not get tripped up by this use CRC to perturb internal random state, not on input data.
You can see detailed test reports, summaries, graphs, etc. at
https://github.com/demerphq/smhasher
cheers, Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On 16 March 2017 at 21:02, demerphq <demerphq@gmail.com> wrote:

> CRC is an incredibly bad base unit of processing for hashes; the exact properties that make it so desirable as a CRC make it incredibly undesirable as a hash. Using it via the Intel intrinsic _mm_crc32_u64() is especially irresponsible. That operation takes a 64-bit value and calculates a 32-bit CRC from it. That guarantees 2**32 possible collisions per block. You only need to look at about 160K random blocks before you find a collision.

Can we expect to see -DPERL_HASH_FUNC_CRC in the future anyway, with sufficient warnings around its use in this regard?
Obviously this is a security risk if your Perl instance is exposed on a public interface.
But I'm very much not concerned about that threat with no public interface; I just want my Perl to run fast.
How likely is it I'm going to run 160k blocks locally in a non-threat scenario by accident?
How bad are the side effects of that going to be realistically?
To the extent it were feasible, I'd champion adding 2 top-level configure arguments that picked specific defaults:

-Dprofile=fast-but-insecure (no taint support maybe, a fast hash implementation, no stack protectors, etc.)
-Dprofile=secure-but-slow (expensive enhancements to taint support, a massively slow but impenetrable hash function, etc.)

(where the default would be neither of those things and would give the current P5P balance of compromises)

Cos it's frustrating, this ongoing conflict we have between speed and security, where maintainership ends up picking which of those things people will end up getting.
We don't have to force everyone to compromise, surely?!
NB: Currently using PERL_HASH_FUNC_SDBM because I /do not care/, but it's the fastest hash on my hardware.
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 16 March 2017 at 19:48, Kent Fredric <kentfredric@gmail.com> wrote:

> On 16 March 2017 at 21:02, demerphq <demerphq@gmail.com> wrote:
> > CRC is an incredibly bad base unit of processing for hashes; the exact properties that make it so desirable as a CRC make it incredibly undesirable as a hash. Using it via the Intel intrinsic _mm_crc32_u64() is especially irresponsible. That operation takes a 64-bit value and calculates a 32-bit CRC from it. That guarantees 2**32 possible collisions per block. You only need to look at about 160K random blocks before you find a collision.
> Can we expect to see -DPERL_HASH_FUNC_CRC in the future anyway, with sufficient warnings around its use in this regard?

Er, no. I don't think so.

> Obviously this is a security risk if your Perl instance is exposed on a public interface.
> But I'm very much not concerned about that threat with no public interface; I just want my Perl to run fast.
> How likely is it I'm going to run 160k blocks locally in a non-threat scenario by accident?

I don't think you understood the point. An attacker need only look at 160K blocks, on their own time, in their own space, to generate *two* colliding blocks. They then produce keys of the same length with different mixtures of the two blocks. IOW, expand an integer of a certain bit size such that each 0 bit is represented by block 0 and each 1 bit is represented by block 1. If you want 2**10 collisions you need to produce keys that are 10 blocks long.
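To make the construction concrete, here is a sketch assuming two 8-byte blocks with equal CRC32-C have already been found (by CRCing ~160K random blocks and sorting for duplicates); BLOCK_A and BLOCK_B below are placeholders for such a pair, not a real colliding pair.

```c
#include <stdint.h>
#include <string.h>

static const unsigned char BLOCK_A[8] = {0}; /* placeholder: first block of a CRC32-C colliding pair */
static const unsigned char BLOCK_B[8] = {1}; /* placeholder: second block with the same CRC32-C */

/* Expand the nbits bits of v into a key of nbits 8-byte blocks: a 0 bit
 * selects BLOCK_A, a 1 bit selects BLOCK_B. All 2**nbits such keys collide
 * under any hash that folds 8-byte blocks through CRC32-C linearly. */
static void
make_colliding_key(uint32_t v, int nbits, unsigned char *out /* nbits*8 bytes */)
{
    int i;
    for (i = 0; i < nbits; i++, out += 8)
        memcpy(out, ((v >> i) & 1) ? BLOCK_B : BLOCK_A, 8);
}
```

Calling this for v = 0 .. 2**nbits - 1 yields the arbitrary-size collision set described above.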
> How bad are the side effects of that going to be realistically?

Excessive collisions will noticeably affect performance.
> To the extent it were feasible, I'd champion adding 2 top-level configure arguments that picked specific defaults:
> -Dprofile=fast-but-insecure (no taint support maybe, a fast hash implementation, no stack protectors, etc.)
> -Dprofile=secure-but-slow (expensive enhancements to taint support, a massively slow but impenetrable hash function, etc.)
> (where the default would be neither of those things and would give the current P5P balance of compromises)
Currently you can select the hash function you want to build with.
> Cos it's frustrating, this ongoing conflict we have between speed and security, where maintainership ends up picking which of those things people will end up getting.

And where we lean towards secure over fast. I understand your concern.

> We don't have to force everyone to compromise, surely?!
> NB: Currently using PERL_HASH_FUNC_SDBM because I /do not care/, but it's the fastest hash on my hardware.

I find that quite surprising. Also unfortunate: I removed support for sdbm recently. sdbm is a "one-at-a-time" hash with terrible properties. I would be really surprised if one of the modern block-oriented alternatives is not markedly faster. Actually, I wouldn't be surprised if *SipHash13* was faster.
I hope to provide some better alternatives soon.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On 17 March 2017 at 09:20, demerphq <demerphq@gmail.com> wrote:

> [...] An attacker

The question is: how does this attacker mount their attack against my code?
If they're relying on some other exploit that allows executing arbitrary Perl locally, then I already have a problem much graver than denial of service through memory exhaustion.
The only path I can think of here would be a malicious user shipping CPAN code which used a handful of known-bad hash key values, and then inspiring me to run it.
But I'm not running any services or software that deals with third parties, so I'm at a loss as to what this attack surface looks like in reality.
I understand the theoretical problem, I just can't find any place it would be applied.
> And where we lean towards secure over fast. I understand your concern.

To be clear, it's not a "leaning" for me, it's a "use case".
For local-access-only, development-only tasks, fast is preferable, because I have the security angle handled by other things surrounding the parts of my system that touch the exploit paths. I deal predominantly with things where startup time and fast execution are important, because nothing is long-lived.
But if I *was* building a web service that wanted external access, you can bet I'd be opting for slow things, maybe things as slow as hash key tainting, because in that scenario Perl is "the front line" and thus needs to have its security beefed up. And things like startup performance are less of a concern there, because the cost gets amortized over the long life.
It's just that the tools at my disposal to take this to these extremes are a little underwhelming.
Hash key taint support is something nobody can presently opt in to, because the argument keeps being that it's too slow for general use.
But on the other side, fast-but-insecure hash algorithms aren't supported either, because they're too insecure.
Why can't it just be "give us both of these things and then fence them off so people can make decisions about it instead", while making neither the default and making sure you document how to / how not to use them?
You could very much argue that if I wanted those things I could spend time working out how to hack P5P myself, and then publish a fork for other people who care about these things... I just don't see that line of progress being helpful to anyone; the maximum utility is for this stuff to be in P5P, surrounded by black and yellow stripes.
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 16 March 2017 at 21:34, Kent Fredric <kentfredric@gmail.com> wrote:

> On 17 March 2017 at 09:20, demerphq <demerphq@gmail.com> wrote:
> > [...] An attacker
> The question is: how does this attacker mount their attack against my code?

By getting you to hash something that causes degenerate behavior in your code.
> CPAN code which used a handful of known-bad hash key values, and then inspiring me to run it.

Yes, I get you.

> But I'm not running any services or software that deals with third parties, so I'm at a loss as to what this attack surface looks like in reality.
I understand; in many ways the *security* aspect only applies to scenarios where there might actually be an attacker.

> I understand the theoretical problem, I just can't find any place it would be applied.

Well, I can definitely think of many use cases where "attack" is not an issue, and I understand a user would want the option to choose speed over any security risk.

> > And where we lean towards secure over fast. I understand your concern.
> To be clear, it's not a "leaning" for me, it's a "use case".
I am not sure we are on the same page here. What I mean is that Perl is a *general purpose programming language*. General use implies suitability for processing untrusted data. Therefore, given the balance of concerns such as performance, portability, security, etc., for the default we will probably always be inclined towards giving a bit more security if it doesn't cost too much performance.
Having said that, I think there are some different issues here that require some untangling.
First off, we want to find a happy medium between "secure enough" and "fast". We also have to consider portability, architecture (in)dependence, and things like that.
Another thing is that you seem to think that the trade-off is "speed" versus "security". IMO it really is not that simple. A good hash is both fast AND reasonably secure.
Also, just for the record, there is a distinction in my mind and usage between "secure" and "cryptographic grade". IMO "secure" implies that the hash function suitably hides the seed, and that it is free from obvious or known multi-collision attacks.
Also, I think it's worth considering that there is a relationship between "secure enough" and "good general purpose hashing function". Both demand that the hash behave like a random function. A hash that is insecure is basically just a hash that doesn't hash very well for some input set, ergo not a good general purpose hash function. So for me, it doesn't make sense in a general purpose context like a programming language to expose a hash function that hashes incredibly poorly for a given use case.
> For local-access-only, development-only tasks, fast is preferable, because I have the security angle handled by other things surrounding the parts of my system that touch the exploit paths.

Playing devil's advocate here: how can you say that when you already said you don't know how an exploit would work?

> I deal predominantly with things where startup time and fast execution are important, because nothing is long-lived.

Well, if speed is your concern then IMO using sdbm is a poor choice unless you are predominantly hashing strings under 3 characters long. Also, sdbm is really not a very good hash function. It does not mix the last bytes of the key properly; very few bits of the key actually affect the hash result. I would assume that for many tasks you are getting more collisions than you should.
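For reference, this is the classic sdbm loop being described, in its textbook form (perl's PERL_HASH_FUNC_SDBM variant also folded in the process-wide seed, which this sketch omits):

```c
#include <stddef.h>
#include <stdint.h>

/* sdbm: hash = hash * 65599 + byte, written in its usual shift form.
 * One multiply-add per input byte; the last bytes of a key receive few
 * mixing rounds, which is the weakness described above. */
static uint32_t
sdbm_hash(const unsigned char *str, size_t len)
{
    uint32_t hash = 0;
    while (len--)
        hash = *str++ + (hash << 6) + (hash << 16) - hash;
    return hash;
}
```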
> But if I *was* building a web service that wanted external access, you can bet I'd be opting for slow things, maybe things as slow as hash key tainting, because in that scenario Perl is "the front line" and thus needs to have its security beefed up. And things like startup performance are less of a concern there, because the cost gets amortized over the long life.

The front line can be something as simple as rpm, or a sysadmin management tool that looks for duplicate files, or something that hashes a dictionary, or whatever. There are far, far too many ways to trick someone into hashing keys that you control to achieve a degradation of service.
> It's just that the tools at my disposal to take this to these extremes are a little underwhelming.
> Hash key taint support is something nobody can presently opt in to, because the argument keeps being that it's too slow for general use.
> But on the other side, fast-but-insecure hash algorithms aren't supported either, because they're too insecure.

Well, no. Fast but insecure hash algorithms are rejected because insecure == bad at hashing general use case data. As the old saw goes, it's easy to be fast if you don't have to be correct.
> Why can't it just be "give us both of these things and then fence them off so people can make decisions about it instead", while making neither the default and making sure you document how to / how not to use them?

That is what I have always been trying to do, however I would not put it in such black and white terms. I believe that we should offer a number of options along the spectrum of secure and fast. But I also believe that the options we provide should /be good hashes/. I don't believe that a hash with a multi-collision attack is a good hash. There will be some user out there who *accidentally* stumbles into the degenerate hashing mode.
I mean, if you consider something much less controversial than hashing, consider our policies on sorting. We don't by default expose a sort algorithm with degenerate behaviour modes. We trade off being slightly faster in the common case in return for not having terrible worst cases. IMO we should take the same strategy with our hash function: we should not expose hash functions which have a terrible worst case. And hashes with known multi-collision attacks have a provably terrible worst case, ergo they are rejected out of hand.
Anyway, my feeling is that we should provide three options (compounded somewhat by having to support 32-bit and 64-bit).
We should provide a "believed to be really secure" hash function; SipHash24 is that. We should have a "good and fast" option, and we should have a middle ground, which I believe SipHash13 is. I believe that there are some quality candidates for good and fast, and I hope to get patches adding them to perl sooner rather than later.
But really, I wonder at choosing sdbm. From what I know, pretty much any block-oriented hash function is going to outperform a one-at-a-time once the string gets more than a few characters. Even SipHash24 outperforms sdbm once you get to 13/14-byte strings. Other block-oriented hashes cross sdbm's line much sooner, at 4 or 5 bytes. How comprehensive was your testing? How thorough was it?
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On 17 March 2017 at 21:28, demerphq <demerphq@gmail.com> wrote:

> But really, I wonder at choosing sdbm. From what I know, pretty much any block-oriented hash function is going to outperform a one-at-a-time once the string gets more than a few characters. Even SipHash24 outperforms sdbm once you get to 13/14-byte strings. Other block-oriented hashes cross sdbm's line much sooner, at 4 or 5 bytes. How comprehensive was your testing? How thorough was it?

It's been a long time now, so it's hard to say.
I recall I used Benchmark::Perl::CoreHashes at some point, and recall seeing it tell me one of the siphashes would be faster, but then I built a gaggle of perls with different hash functions and compared how well they loaded Dist::Zilla (reams and reams of Moose stuff).
SDBM was substantially quicker at that use case for me, so I just used that one.
I'll clearly have to re-evaluate, though, because now that I'm still setting the -D flag for it, I expect it's silently doing nothing: https://rt.cpan.org/Public/Bug/Display.html?id=119566 . INSTALL still mentions SDBM being a thing, but 236a70292a4ef354958701000e8897894141eb26 says the SDBM, DJB2, SUPERFAST and MURMUR3 hashes and a few OAAT hashes are gone.
(I suspect a perk of Reini's approach to Configure is that it can tell you easily about invalid arguments.)
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 17 March 2017 at 10:28, Kent Fredric <kentfredric@gmail.com> wrote:

> On 17 March 2017 at 21:28, demerphq <demerphq@gmail.com> wrote:
> > But really, I wonder at choosing sdbm. [...] How comprehensive was your testing? How thorough was it?
> It's been a long time now, so it's hard to say.
> I recall I used Benchmark::Perl::CoreHashes at some point, and recall seeing it tell me one of the siphashes would be faster, but then I built a gaggle of perls with different hash functions and compared how well they loaded Dist::Zilla (reams and reams of Moose stuff).
> SDBM was substantially quicker at that use case for me, so I just used that one.
Interesting. It's a pity you don't have a record of better numbers.

> I'll clearly have to re-evaluate, though, because now that I'm still setting the -D flag for it, I expect it's silently doing nothing: https://rt.cpan.org/Public/Bug/Display.html?id=119566 . INSTALL still mentions SDBM being a thing, but 236a70292a4ef354958701000e8897894141eb26 says the SDBM, DJB2, SUPERFAST and MURMUR3 hashes and a few OAAT hashes are gone.
I will review INSTALL. I removed a bunch of the hashes as maintaining hv_func was getting out of hand.
> (I suspect a perk of Reini's approach to Configure is that it can tell you easily about invalid arguments.)
Nod.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On 17 March 2017 at 23:07, demerphq <demerphq@gmail.com> wrote:

> Interesting. It's a pity you don't have a record of better numbers.

The really odd thing now is how this happens: I updated the patches of Benchmark::Perl::CoreHashes to map to the current hashes in 5.25.11.
```
# time for PERL_HASH_FUNC_SIPHASH is 4.51946496963501
# time for PERL_HASH_FUNC_ONE_AT_A_TIME_HARD is 2.94429302215576
# time for PERL_HASH_FUNC_SIPHASH13 is 3.59934282302856
# time for PERL_HASH_FUNC_HYBRID_OAATHU_SIPHASH13 is 3.2349910736084
```
^ These are "time taken for 5 runs of some loop entirely in C", and the times are very stable (especially how they compare to each other: a seemingly constant ordering and difference).
I believe these say that, of the hash algos in core, for the test case it provided, OAATH is the fastest.
Granted, this benchmark is _not_ factoring in collision cost; it's just measuring the raw op-time of the hashes, so a bad hash algorithm that just returned "4" every time would fare incredibly well in this test.
But the good news is I did record the outcome of a previous run in an RT ticket somewhere :D
```
# time for PERL_HASH_FUNC_SIPHASH is 12.3072559833527             <-- gosh what happened here.
# time for PERL_HASH_FUNC_SDBM is 1.7642650604248                 <--- ha, maybe I remembered this wrong.
# time for PERL_HASH_FUNC_DJB2 is 2.15215301513672
# time for PERL_HASH_FUNC_SUPERFAST is 3.48679804801941
# time for PERL_HASH_FUNC_MURMUR3 is 5.11463499069214
# time for PERL_HASH_FUNC_ONE_AT_A_TIME is 4.16085195541382
# time for PERL_HASH_FUNC_ONE_AT_A_TIME_HARD is 7.88369512557983
# time for PERL_HASH_FUNC_ONE_AT_A_TIME_OLD is 3.99120020866394
# time for PERL_HASH_FUNC_MURMUR_HASH_64A is 3.86607193946838
# time for PERL_HASH_FUNC_MURMUR_HASH_64B is 2.71661996841431
```
Maybe I'm just gambling on most hashes not having even as many as 20 keys, and so hoping that the odds of collisions are low enough to justify a weak hash.
I mean... a hash algorithm that always returns 4 is fine if your hash table never needs to handle more than one key, right? :P (not entirely serious there)
It's really weird how much the baseline speed has changed, though. Maybe that's a stack-protected build or something, or maybe my compiler got better. Can't think of many things that would radically change the hash performance of a single algorithm.
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 17 March 2017 at 11:22, Kent Fredric <kentfredric@gmail.com> wrote:

> On 17 March 2017 at 23:07, demerphq <demerphq@gmail.com> wrote:
> > Interesting. It's a pity you don't have a record of better numbers.
> The really odd thing now is how this happens: I updated the patches of Benchmark::Perl::CoreHashes to map to the current hashes in 5.25.11.
> [benchmark numbers quoted above]
> ^ These are "time taken for 5 runs of some loop entirely in C", and the times are very stable (especially how they compare to each other: a seemingly constant ordering and difference).
> I believe these say that, of the hash algos in core, for the test case it provided, OAATH is the fastest.
I have no idea what those benchmarks say, beyond that they are such a gross simplification of the issues at hand that they are nearly worthless.
You can't reduce a complex discussion about performance to a single benchmark done in artificial conditions. That benchmark hashes a list of about 20-odd strings, none of which is longer than 6 bytes.
The performance of a hash function depends on the length of the keys being hashed. So effectively this benchmark says that when hashing short keys, OAATH still wins.
This is not a new observation. We know that. It's why we have the SIPHASH13_OAATHU thing.
On the other hand, it can be ameliorated somewhat. I have the following benchmark data (done, IMO, "properly" compared to the benchmark you are doing):
```
# JenkinsOAATH  0 byte keys  11.000 c/h
# JenkinsOAATH  1 byte keys  32.000 c/h  32.000 c/b  0.031 b/c
# JenkinsOAATH  2 byte keys  37.000 c/h  18.500 c/b  0.054 b/c
# JenkinsOAATH  3 byte keys  42.000 c/h  14.000 c/b  0.071 b/c
# JenkinsOAATH  4 byte keys  47.000 c/h  11.750 c/b  0.085 b/c
# JenkinsOAATH  5 byte keys  52.000 c/h  10.400 c/b  0.096 b/c
# JenkinsOAATH  6 byte keys  57.000 c/h   9.500 c/b  0.105 b/c
# JenkinsOAATH  7 byte keys  62.000 c/h   8.857 c/b  0.113 b/c
# JenkinsOAATH  8 byte keys  67.739 c/h   8.467 c/b  0.118 b/c
# JenkinsOAATH  9 byte keys  72.000 c/h   8.000 c/b  0.125 b/c
# JenkinsOAATH 10 byte keys  77.331 c/h   7.733 c/b  0.129 b/c
# JenkinsOAATH 11 byte keys  82.000 c/h   7.455 c/b  0.134 b/c
# JenkinsOAATH 12 byte keys  87.000 c/h   7.250 c/b  0.138 b/c
# JenkinsOAATH 13 byte keys  92.560 c/h   7.120 c/b  0.140 b/c
# JenkinsOAATH 14 byte keys  97.000 c/h   6.929 c/b  0.144 b/c
# JenkinsOAATH 15 byte keys 102.000 c/h   6.800 c/b  0.147 b/c
# JenkinsOAATH 16 byte keys 107.402 c/h   6.713 c/b  0.149 b/c
```
```
# SipHash13  0 byte keys  20.000 c/h
# SipHash13  1 byte keys  24.000 c/h  24.000 c/b  0.042 b/c
# SipHash13  2 byte keys  27.000 c/h  13.500 c/b  0.074 b/c
# SipHash13  3 byte keys  28.000 c/h   9.333 c/b  0.107 b/c
# SipHash13  4 byte keys  29.882 c/h   7.471 c/b  0.134 b/c
# SipHash13  5 byte keys  30.000 c/h   6.000 c/b  0.167 b/c
# SipHash13  6 byte keys  31.171 c/h   5.195 c/b  0.192 b/c
# SipHash13  7 byte keys  32.000 c/h   4.571 c/b  0.219 b/c
# SipHash13  8 byte keys  28.802 c/h   3.600 c/b  0.278 b/c
# SipHash13  9 byte keys  28.000 c/h   3.111 c/b  0.321 b/c
# SipHash13 10 byte keys  29.000 c/h   2.900 c/b  0.345 b/c
# SipHash13 11 byte keys  28.993 c/h   2.636 c/b  0.379 b/c
# SipHash13 12 byte keys  29.969 c/h   2.497 c/b  0.400 b/c
# SipHash13 13 byte keys  30.000 c/h   2.308 c/b  0.433 b/c
# SipHash13 14 byte keys  30.999 c/h   2.214 c/b  0.452 b/c
# SipHash13 15 byte keys  31.189 c/h   2.079 c/b  0.481 b/c
# SipHash13 16 byte keys  33.471 c/h   2.092 c/b  0.478 b/c
```
So SipHash13 *should* be beating JenkinsOAATH already; however, I suspect it is not being linked in quite as intelligently as it could be.

> Granted, this benchmark is _not_ factoring in collision cost; it's just measuring the raw op-time of the hashes, so a bad hash algorithm that just returned "4" every time would fare incredibly well in this test.
Yes. A benchmark like this is fraught with issues. That being one of them.
> But the good news is I did record the outcome of a previous run in an RT ticket somewhere :D
> [the earlier benchmark numbers quoted above]
> Maybe I'm just gambling on most hashes not having even as many as 20 keys, and so hoping that the odds of collisions are low enough to justify a weak hash.

In the limited testing I have done, hash keys are mostly in the 6-24 byte range.
> I mean... a hash algorithm that always returns 4 is fine if your hash table never needs to handle more than one key, right? :P (not entirely serious there)
> It's really weird how much the baseline speed has changed, though. Maybe that's a stack-protected build or something, or maybe my compiler got better. Can't think of many things that would radically change the hash performance of a single algorithm.

I really wouldn't put much stock in this benchmark. Our hash functions are not run in tight loops. Our hash functions don't need to be seeded every time we hash; we can do it once per process.
When we decided to accept the latest hash function choices, it was after running trace-level monitoring of how many instructions were actually executed, and we then chose the ones that executed the fewest ops, keeping in mind the bias towards short keys.
If you want to do some benchmarks, then do some real benchmarks: build some real software with different hash functions and see how it performs. If you can't see a performance difference, IMO pick the one with better security. If you can see a performance difference, then pick the one that suits your needs.
But relying on *that* benchmark alone to decide which hash function you use is not the right thing to do.
I have lots of performance data and graphs in my version of smhasher. You might want to review some of them:
https://github.com/demerphq/smhasher/tree/master/doc/graph
https://github.com/demerphq/smhasher/tree/master/doc

cheers, Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On 18 March 2017 at 21:52, demerphq <demerphq@gmail.com> wrote:

> But relying on *that* benchmark alone to decide which hash function you use is not the right thing to do.
> I have lots of performance data and graphs in my version of smhasher. You might want to review some of them:

Yeah, the point is I wasn't. It was just an interesting data point for comparison.
These benchmarks represent a limit, because they demonstrate the /maximum/ performance a given algorithm can have on the given platform.
All the other stuff on top of that will only slow the algorithm down further (collisions).
It's interesting in the sense of "for X iterations of a data set with average lengths at ~6 bytes, the throughput cannot exceed X".
Throughput in reality _WILL_ be lower than all of the stated rates.
But if you look at this:
```
# PERL_HASH_FUNC_SDBM is 1.7642650604248
# PERL_HASH_FUNC_ONE_AT_A_TIME_HARD is 7.88369512557983
```
^ That says to me that for the given sample, I can afford quite a few collisions before the cost of the collisions outweighs the inferior algorithm.
It's not something to base a final decision on, but it's still interesting enough for exploratory guidance.
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 18 March 2017 at 10:18, Kent Fredric <kentfredric@gmail.com> wrote:

> On 18 March 2017 at 21:52, demerphq <demerphq@gmail.com> wrote:
> > But relying on *that* benchmark alone to decide which hash function you use is not the right thing to do.
> > I have lots of performance data and graphs in my version of smhasher. You might want to review some of them:
> Yeah, the point is I wasn't. It was just an interesting data point for comparison.

No, it is not interesting; it is misleading to the point of being counterproductive.

> These benchmarks represent a limit, because they demonstrate the /maximum/ performance a given algorithm can have on the given platform.

You keep talking about "maximum performance". To the extent that there is such a thing, the maximum performance of most hash functions is reached at string lengths much longer than that benchmark exercises. For instance, a hash like CityHash can reach 5 bytes a cycle, where a hash like sdbm never goes above 0.350 bytes a cycle, and something like JenkinsOAAT never goes above 0.150 bytes a cycle.
You keep trying to reduce the comparison of two linear equations down to the comparison of their positions at one point on the line. That doesn't make sense.
Either you say, "I have sampled a large number of keys, and this is the relative frequency of the various string lengths occurring, and this benchmark simulates that distribution and gives me a summary view of the performance I can expect", or you go the simple route that smhasher does: you time each length for many lengths, and then you graph that.
The benchmark you keep quoting does neither. The keys it hashes are not representative of the lengths of keys as they occur in the wild. And it does not systematically time hashing strings of different lengths. It is also subject to parallelism optimisations that would easily make a small hash function like OAAT appear faster.
Add some long strings to the benchmark. Make the benchmark update the keys each time with the hash of the previous key, so it has to do them serially. And so on; you will see the numbers change wildly.
That benchmark is literally not worth the paper it is written on. Using it as a decision-making basis for anything is the height of folly.
Further, if you are going to quote benchmarks the way you are, you should make sure you are clear about what hardware and build options you are using for the test. Are you on 64-bit? Windows or Linux? Debugging or not? For instance, one of the timings you quoted looked to me like it could be the result of us using a macro to load 64 bits worth of data into a register, which under DEBUGGING would turn siphash into a one-at-a-time, but under -O3 would be optimised into a single instruction.

Cheers, Yves
perl -Mre=debug -e "/just|another|perl|hacker/"
On 19 March 2017 at 01:22, demerphq <demerphq@gmail.com> wrote:

> Further, if you are going to quote benchmarks the way you are, you should make sure you are clear about what hardware and build options you are using for the test. Are you on 64-bit? Windows or Linux? Debugging or not? For instance, one of the timings you quoted looked to me like it could be the result of us using a macro to load 64 bits worth of data into a register, which under DEBUGGING would turn siphash into a one-at-a-time, but under -O3 would be optimised into a single instruction.

So I built a few perls just to compare their test run times, as that should be interesting (even though not comprehensive).
```
5.25.10 Siphash13 self-test:        Files=2557, Tests=1231311, 752 wallclock secs (302.66 usr 45.41 sys + 1440.63 cusr 209.55 csys = 1998.25 CPU)
5.25.10 Siphash13-Hybrid self-test: Files=2557, Tests=1231383, 702 wallclock secs (307.18 usr 45.35 sys + 1435.14 cusr 206.05 csys = 1993.72 CPU)
5.25.10 OAAT self-test:             Files=2557, Tests=1231310, 745 wallclock secs (314.44 usr 47.48 sys + 1488.43 cusr 211.39 csys = 2061.74 CPU)
5.24.1  SDBM self-test:             Files=2410, Tests=851196,  399 wallclock secs (179.73 usr 29.91 sys + 1037.96 cusr 150.77 csys = 1398.37 CPU)
5.24.1  OAAT self-test:             Files=2410, Tests=851085,  434 wallclock secs (197.12 usr 32.36 sys + 1114.02 cusr 160.08 csys = 1503.58 CPU)
```
A difference of 105 CPU seconds is nothing to sneeze at :) and I believe this comparison may have been the one that sealed the deal for me.
CFLAGS: -fstack-protector-strong -fno-stack-protector -O3 -march=native -mtune=native
The first two of those are just the only way to actually disable the stack protector without Configure helpfully turning it back on again.
Non-debugging, non-threaded perl.
The only other things I do that are slightly interesting are turning on -DPERL_DISABLE_PMC and disabling man page generation... but I'd hate to think they're relevant. Configuration for all the above builds was identical except for the HASH implementation, including passing -Udefault_inc_excludes_dot to Configure despite none of the perls needing it (just part of my build script).
x86-64 Linux on Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz
gcc version 5.4.0 (Gentoo 5.4.0-r3 p1.3, pie-0.6.5)
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 19 March 2017 at 23:00, Kent Fredric <kentfredric@gmail.com> wrote:

> So I built a few perls just to compare their test run times, as that should be interesting (even though not comprehensive).

I've now done 5 runs of Moose's test suite on all of the target perls, and a few apparent trends appear.
Graph comparing times in user-seconds: http://i.imgur.com/jRs4EBZ.png
Data: https://gist.github.com/kentfredric/d52a9dc8f9a21587102eb73545815f9c
I can only really afford to do so many runs: 5 targets * 5 runs * 2 minutes per run -> there's my computer out of use for an hour.
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
On 20 March 2017 at 01:02, Kent Fredric <kentfredric@gmail.com> wrote:

> On 19 March 2017 at 23:00, Kent Fredric <kentfredric@gmail.com> wrote:
> > So I built a few perls just to compare their test run times, as that should be interesting (even though not comprehensive).
> I've now done 5 runs of Moose's test suite on all of the target perls, and a few apparent trends appear.
> Graph comparing times in user-seconds: http://i.imgur.com/jRs4EBZ.png
> Data: https://gist.github.com/kentfredric/d52a9dc8f9a21587102eb73545815f9c
> I can only really afford to do so many runs: 5 targets * 5 runs * 2 minutes per run -> there's my computer out of use for an hour.

Test times for Dzil are inconclusive, mostly because you've got 2 other factors to contend with:

- 5.25.10 has a newer Test::More (slower)
- 5.25.10 has the context stack reworks (faster)

Graph: http://i.imgur.com/Mow6oyt.png
Data: https://gist.github.com/kentfredric/cab5287dbecc4fd980b6624d08fcf474

However, dzil *build* times get the Test::More slowdown out of the mix and give some really interesting data, because you can explain the differences in OAAT speeds exclusively in terms of the 5.25 optimisations.
This reveals Siphash13 to be promising in this scenario, but hybrid_siphash to be underwhelming.
It also demonstrates the SDBM hash to be faster than OAATH for this use case.

Graph: http://i.imgur.com/ghMCFB5.png
Data: https://gist.github.com/kentfredric/fd9ee4ff55351f3e16122f62d663c26b

The only curiosity would be how SDBM compares to SIPHASH13 with all the other 5.25 optimizations in.
(Which is why I double-accounted OAATH everywhere: it was the only way I could get a relative metric.)
So if SDBM ain't returning, I'll be switching to SIPHASH13.
25% off `dzil build`? Yes please :)
-- Kent
KENTNL - https://metacpan.org/author/KENTNL
@demerphq is there any further action to take on this ticket?
On Fri, 31 Jan 2020 at 06:33, Todd Rinaldo notifications@github.com wrote:
> @demerphq https://github.com/demerphq is there any further action to take on this ticket?

Hrm... Let's leave the ticket open for a bit longer. I think we should make SipHash 1-3 the default these days, but I haven't had time to lobby for it.
Yves
Migrated from rt.perl.org#124064 (status was 'open')
Searchable as RT124064$