Using PERL_HASH_FUNC_DJB2 with miniperl, map() passes undef as $_ to the block for every element.
sample script
____________________________________
my @arr = ('num1','num2','num3');
map {print $_."\n";} @arr;
____________________________________
with DJB2 miniperl
____________________________________
C:\Documents and Settings\Owner\Desktop\cpan libs\p519\perl\win32>..\miniperl.exe -I..\lib -w ..\maptest.pl
Use of uninitialized value $_ in concatenation (.) or string at ..\maptest.pl line 3.
Use of uninitialized value $_ in concatenation (.) or string at ..\maptest.pl line 3.
Use of uninitialized value $_ in concatenation (.) or string at ..\maptest.pl line 3.

C:\Documents and Settings\Owner\Desktop\cpan libs\p519\perl\win32>
___________________________________
with regular/default hash perl 5.10
___________________________________________________
C:\Documents and Settings\Owner\Desktop\cpan libs\p519\perl\win32>perl -w ..\maptest.pl
num1
num2
num3

C:\Documents and Settings\Owner\Desktop\cpan libs\p519\perl\win32>
___________________________________________________
Patch of hv_func.h that shows how I switched hash funcs is attached.
This test script was cut down from write_buildcustomize.pl failing spectacularly when I tried to build a Win32 Perl with DJB2 hash. Later on basically no module can be found in @INC so strict.pm and warnings.pm can't be found.
C:\Documents and Settings\Owner\Desktop\cpan libs\p519\perl\win32>..\miniperl.exe -I..\lib -f ..\write_buildcustomize.pl ..
Use of uninitialized value in list assignment at dist/constant/lib/constant.pm line 12.
[previous warning repeated several times]
Use of uninitialized value in list assignment at dist/constant/lib/constant.pm line 15.
[previous warning repeated several times]
Use of uninitialized value $file in pattern match (m//) at dist/Cwd/lib/File/Spec/Win32.pm line 114.
Use of uninitialized value $file in pattern match (m//) at dist/Cwd/lib/File/Spec/Win32.pm line 120.
Use of uninitialized value $path in pattern match (m//) at dist/Cwd/lib/File/Spec/Win32.pm line 214.
[previous three warnings repeated many times]
Use of uninitialized value $_ in concatenation (.) or string at ..\write_buildcustomize.pl line 48.
[previous warning repeated many times]
C:\Documents and Settings\Owner\Desktop\cpan libs\p519\perl\win32>
Changing the define to "#define PERL_HASH_FUNC_ONE_AT_A_TIME_OLD" does not reproduce the problem described in this ticket, where map() passes undef to its block.
-- bulk88 ~ bulk88 at hotmail.com
[16:32] <@rurban> PERL_HASH_FUNC_SDBM and PERL_HASH_FUNC_DJB2: simple fix: U32 hash = *((U32*)seed);
[16:32] <@rurban> both are broken without this fix

untested by me, put here for safekeeping
-- bulk88 ~ bulk88 at hotmail.com
bulk88 via RT wrote:
> [16:32] <@rurban> PERL_HASH_FUNC_SDBM and PERL_HASH_FUNC_DJB2: simple fix: U32 hash = *((U32*)seed);

This apparently refers to the two lines

    U32 hash = *((U32*)seed + len);

in hv_func.h, in S_perl_hash_sdbm() and S_perl_hash_djb2(), which look like they should each be

    U32 hash = *((U32*)seed) + len;

as seen in S_perl_hash_superfast() and two others.
-zefram
The RT System itself - Status changed from 'new' to 'open'
On Sat Apr 12 03:43:15 2014, zefram@fysh.org wrote:
> bulk88 via RT wrote:
> > [16:32] <@rurban> PERL_HASH_FUNC_SDBM and PERL_HASH_FUNC_DJB2: simple fix: U32 hash = *((U32*)seed);
>
> This apparently refers to the two lines
>
>     U32 hash = *((U32*)seed + len);
>
> in hv_func.h, in S_perl_hash_sdbm() and S_perl_hash_djb2(), which look like they should each be
>
>     U32 hash = *((U32*)seed) + len;

sorry, but this is nonsense. adding the len to the random seed is wrong. do it as I said. yves had the idea to add the seed to the key, which is a different kind of nonsense, but not the problem here.

> as seen in S_perl_hash_superfast() and two others.
>
> -zefram
-- Reini Urban
Reini Urban via RT wrote:
> adding the len to the random seed is wrong. do it as I said.

Why is it wrong to add the length into the hash at that point?
-zefram
On Sat Apr 12 10:50:44 2014, zefram@fysh.org wrote:
> Reini Urban via RT wrote:
> > adding the len to the random seed is wrong. do it as I said.
>
> Why is it wrong to add the length into the hash at that point?

We use a global hash seed. It makes no sense to mix that seed with the key length per key. You can mix all kinds of random junk into a hash function to make it random, but don't call it a hash function then. It also makes no sense to add the seed to the key, but this is a different story.

This is my current version of the patch: https://github.com/rurban/perl-hash-stats/blob/master/sdbm%2Bdjb2.patch

-- Reini Urban
Reini Urban via RT wrote:
> We use a global hash seed. It makes no sense to mix that seed with the key length per key.

I don't see how the scope of the seed is relevant to how the key length is treated in the hash function. What seems relevant is whether the hash algorithm has some susceptibility to collisions from related keys, which perturbing the hash based on key length could avoid. For example, the old old hash algorithm in Perl 5.6 always yielded hash("\0".$s) == hash($s), a state of affairs that could have been avoided by using the key length as an input to the hash.

In the case of these two hash functions, DJB2 and SDBM, including the length doesn't look essential. There's no trivial related-key collision that's independent of seed. However, in both cases, if the seed happens to be zero and the length is not included then hash("\0".$s) == hash($s) will hold for all $s. If an attacker manages to discover the one hash in 2^32 that has a zero seed, an attack on that hash would be very easy. (The easiest way to discover it is to attempt the attack.) Adding the length to the seed ensures that hash("\0".$s) == hash($s) will hold at most for $s of one specific length per seed, defeating this simple way of generating a large multi-key collision. It looks rather as though the length is added to avoid zero thus being a weak seed.

I don't know the real reason why these functions include the length. The commit message doesn't say. Perhaps Yves can comment.
-zefram
On 12 April 2014 22:02, Zefram <zefram@fysh.org> wrote:
> Reini Urban via RT wrote:
> > We use a global hash seed. It makes no sense to mix that seed with the key length per key.
>
> I don't see how the scope of the seed is relevant to how the key length is treated in the hash function. What seems relevant is whether the hash algorithm has some susceptibility to collisions from related keys, which perturbing the hash based on key length could avoid. For example, the old old hash algorithm in Perl 5.6 always yielded hash("\0".$s) == hash($s), a state of affairs that could have been avoided by using the key length as an input to the hash.
>
> In the case of these two hash functions, DJB2 and SDBM, including the length doesn't look essential. There's no trivial related-key collision that's independent of seed. However, in both cases, if the seed happens to be zero and the length is not included then hash("\0".$s) == hash($s) will hold for all $s. If an attacker manages to discover the one hash in 2^32 that has a zero seed, an attack on that hash would be very easy. (The easiest way to discover it is to attempt the attack.) Adding the length to the seed ensures that hash("\0".$s) == hash($s) will hold at most for $s of one specific length per seed, defeating this simple way of generating a large multi-key collision. It looks rather as though the length is added to avoid zero thus being a weak seed.
>
> I don't know the real reason why these functions include the length. The commit message doesn't say. Perhaps Yves can comment.

Your analysis is correct. The change was made to harden these functions against multi-collision attacks and to decrease the chance that we end up with a zero seed; if we do, it will only be for one key length, which should be unknown to an attacker.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On 12 April 2014 19:47, Reini Urban via RT <perlbug-followup@perl.org> wrote:
> On Sat Apr 12 03:43:15 2014, zefram@fysh.org wrote:
> > bulk88 via RT wrote:
> > > [16:32] <@rurban> PERL_HASH_FUNC_SDBM and PERL_HASH_FUNC_DJB2: simple fix: U32 hash = *((U32*)seed);
> >
> > This apparently refers to the two lines
> >
> >     U32 hash = *((U32*)seed + len);
> >
> > in hv_func.h, in S_perl_hash_sdbm() and S_perl_hash_djb2(), which look like they should each be
> >
> >     U32 hash = *((U32*)seed) + len;
>
> sorry, but this is nonsense.

No it is not. It is the correct fix to my change.

> adding the len to the random seed is wrong.

Adding the length is not 100% faithful to the original algorithm. Nevertheless it hardens weak algorithms against simple attacks, and reduces the chance of a 0 seed, so we do so anyway.

See Zefram's mails for details.

> do it as I said.

No, don't. I will push a patch for this, and it will look like what Zefram posted.

> yves had the idea to add the seed to the key, which is a different kind of nonsense, but not the problem here.

I have no idea what you are talking about.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Apr 13, 2014, at 5:12 AM, demerphq <demerphq@gmail.com> wrote:
> On 12 April 2014 19:47, Reini Urban via RT <perlbug-followup@perl.org> wrote:
> > On Sat Apr 12 03:43:15 2014, zefram@fysh.org wrote:
> > > bulk88 via RT wrote:
> > > > [16:32] <@rurban> PERL_HASH_FUNC_SDBM and PERL_HASH_FUNC_DJB2: simple fix: U32 hash = *((U32*)seed);
> > >
> > > This apparently refers to the two lines
> > >
> > >     U32 hash = *((U32*)seed + len);
> > >
> > > in hv_func.h, in S_perl_hash_sdbm() and S_perl_hash_djb2(), which look like they should each be
> > >
> > >     U32 hash = *((U32*)seed) + len;
> >
> > sorry, but this is nonsense.
>
> No it is not. It is the correct fix to my change.

It is nonsense nevertheless, but it is the correct fix for the seed=0 case.

The real fix for collision attacks is to avoid O(n/2) collision lookup. You will never be able to avoid collisions entirely, and you can easily attack any hash function if the seed is known. It does not make much sense to perturb and slow down a hash function at all.

> > yves had the idea to add the seed to the key, which is a different kind of nonsense, but not the problem here.
>
> I have no idea what you are talking about.

Bad, because you wrote it and blogged about it. To refresh your memory: http://perl5.git.perl.org/perl.git/blob/0c5ea01913265b717b8615a704acd13ddde5b078:/hv_func.h#l508
On Sat Apr 12 03:43:15 2014, zefram@fysh.org wrote:
> bulk88 via RT wrote:
> > [16:32] <@rurban> PERL_HASH_FUNC_SDBM and PERL_HASH_FUNC_DJB2: simple fix: U32 hash = *((U32*)seed);
>
> This apparently refers to the two lines
>
>     U32 hash = *((U32*)seed + len);
>
> in hv_func.h, in S_perl_hash_sdbm() and S_perl_hash_djb2(), which look like they should each be
>
>     U32 hash = *((U32*)seed) + len;
>
> as seen in S_perl_hash_superfast() and two others.

This was fixed by Yves's 54e07e2b21cb1f58c04d67bca2a311715ba8815e.
Closing.
Tony
@tonycoz - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#120208 (status was 'resolved')
Searchable as RT120208