Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.95k stars 554 forks source link

[PATCH] 3db6cbfc - SIPHASH is very slow on 32 bit x86 #12638

Closed p5pRT closed 4 years ago

p5pRT commented 11 years ago

Migrated from rt.perl.org#116054 (status was 'open')

Searchable as RT116054$

p5pRT commented 11 years ago

From @bulk88

Created by @bulk88

summary​: SIPHASH on 32 bit processors without native CPU 64 bit integers (such as 32 bit x86) is atleast an order of magnitude slower than one at a time hash

The discussion in http​://www.nntp.perl.org/group/perl.perl5.porters/2012/12/msg196172.html about commit 3db6cbfca39da94d152d3e860e2aa79b9c6bb16 "Switch default hash to SIPHASH on 64 bit builds and ONE_AT_A_TIME on 32 bit builds" seems to have flickered out. I feel its time for the issue to be a bug report and not a ML discussion so its not forgotten. I don't think Configure has a way of figuring out what is the largest native CPU integral type\, and I am not sure if there are any GCC or partially portable macros that can be tested to learn the CPU type or learn if a particular compiler O level has native 64 bit ints or not. IDK if any other FOSS project has a framework for figuring these things out (I imagine a large hand edited header file would work if there are enough maintainers with enough CPUs and compilers/platforms\, for example take something out of GCC's code base where I'm sure (guess) all this info exists). As the next best thing\, instead of checking for U64 type/macro\, maybe check if sizeof(void *) == 8 and U64\, only then use SIPHASH\, but that is bias towards traditional x86 and doesn't cover clever tricks like http​://en.wikipedia.org/wiki/X32_ABI (where SIPHASH should be picked even though sizeof(void *) == 4) or the equivalent on other CPUs.

Perl Info ``` Flags: category=core severity=medium Site configuration information for perl 5.17.7: Configured by Owner at Thu Dec 6 20:21:41 2012. Summary of my perl5 (revision 5 version 17 subversion 7 patch blead 2012-12-06.16:42:20 93a641ae382638ffd1980378be4810244d04f4b0 v5.17.6-186-g93a641a) configuration: Snapshot of: 93a641ae382638ffd1980378be4810244d04f4b0 Platform: osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread uname='' config_args='undef' hint=recommended, useposix=true, d_sigaction=undef useithreads=define, usemultiplicity=define useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=undef, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cl', ccflags ='-nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -GL -G7 -DWIN32 -D_CONSOLE -DNO_STRICT -DPERL_TEXTMODE_SCRIPTS -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -D_USE_32BIT_TIME_T', optimize='-MD -Zi -DNDEBUG -O1 -GL -G7', cppflags='-DWIN32' ccversion='13.10.6030', gccversion='', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234 d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=8 ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"c:\perl517\lib\CORE" -machine:x86' libpth="C:\Program Files\Microsoft Visual Studio .NET 2003\VC7\lib" libs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib perllibs=oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib comctl32.lib msvcrt.lib libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl517.lib gnulibc_version='' Dynamic Linking: dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' ' cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -ltcg -libpath:"c:\perl517\lib\CORE" -machine:x86' Locally applied patches: @INC for perl 5.17.7: C:/perl517/site/lib C:/perl517/lib . Environment for perl 5.17.7: HOME (unset) LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=C:\perl517\bin;C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\IDE;C:\Program Files\Microsoft Visual Studio .NET 2003\VC7\BIN;C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Tools;C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Tools\bin\prerelease;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\system32\wbem; PERL_BADLANG (unset) SHELL (unset) ```
p5pRT commented 11 years ago

From @demerphq

My current branch checks if HAS_QUAD is defined.

Yves

p5pRT commented 11 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 11 years ago

From @bulk88

On Tue Dec 11 18​:19​:54 2012\, bulk88 wrote​:

This is a bug report for perl from bulk88@​hotmail.com\, generated with the help of perlbug 1.39 running under perl 5.17.7.

----------------------------------------------------------------- [Please describe your issue here]

summary​: SIPHASH on 32 bit processors without native CPU 64 bit integers (such as 32 bit x86) is atleast an order of magnitude slower than one at a time hash

See attached patch. I tried it on VC 2003 32 bit Win32 Perl\, no Quads\, and on VC 2003 32 bit Win32 Perl\, with Quads. Asm showed one_at_a_time being used on the with Quads Perl. I didn't do a make test on the Perls for this patch.

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

From @bulk88

0001-don-t-use-SIPHASH-on-x86-32-and-non-native-int-64-pl.patch ```diff From 22fb3e894dc7ba2f922861c955e0d5239fb4cf5a Mon Sep 17 00:00:00 2001 From: Daniel Dragan Date: Sun, 23 Dec 2012 03:03:12 -0500 Subject: [PATCH] don't use SIPHASH on x86-32 and non-native int 64 platforms SIPHASH performance 7x slower than ONE_AT_A_TIME on i386 in my testing. On AMD64 its 2x slower. This commit is related to commit 3db6cbfca3 and fixes Perl RT #116054. Details on speed claims are in http://www.nntp.perl.org/group/perl.perl5.porters/2012/12/msg196172.html . Any compiler can emulate any integer. Since the x86-32/i386 instruction set has no 64 bit instructions, all compilers will emulate 64 bit integers using multiple 32 bit instructions, and possibly calls to a static linked math library. This severely degrades performance of the hashing function. On a 32 bit Perl with Quads on i386, the slower SIPHASH was used previously. Now, 32 bit Perl with Quads on i386 will use the faster native ONE_AT_A_TIME instead of SIPHASH. Other CPUs and macros should be added in the future to the list, but it should not be done without looking at the assembly output of compilers and knowledge of the typical compiler flags for that platform, and knowledge of the instruction set for that platform. Obviously any CPU with 64 bit pointers has 64 bit operand instructions. If a CPU or platform has emulated 64 bit pointers that will be delt with when such a bug ticket comes. 32 bit CPUs that I dont know the details of were put in the comments. --- hv.h | 15 ++++++++++++++- 1 files changed, 14 insertions(+), 1 deletions(-) diff --git a/hv.h b/hv.h index 3ee2399..ac395f7 100644 --- a/hv.h +++ b/hv.h @@ -156,7 +156,20 @@ struct xpvhv { || defined(PERL_HASH_FUNC_ONE_AT_A_TIME_OLD) \ || defined(PERL_HASH_FUNC_BUZZHASH16) \ ) -#ifdef U64 +/* Porting note, add CPUs and macros which indicate native 64 bit ints, + or no native 64 bit ints. This is to not use SIPHASH on platforms with + compiler emulated 64 bit ints, ONE_AT_A_TIME works well with 32 bit CPUs. + See http://www.nntp.perl.org/group/perl.perl5.porters/2012/12/msg196172.html + for discussion on speed of these hash functions. + + sparc, powerpc, alpha, mips and arm with 32 bit pointers are missing from + this list, 32 bit x86 with AVX might one day have native 64 bit ints +*/ +#if defined(U64) && \ + (PTRSIZE == 8 \ + || defined(__ILP32__) \ + || ( !defined(i386) && !defined(__i386) && !defined(__i386__) \ + && !defined(_M_IX86) && !defined(__X86__) && !defined(_X86_))) #define PERL_HASH_FUNC_SIPHASH #else #define PERL_HASH_FUNC_ONE_AT_A_TIME -- 1.7.9.msysgit.0 ```
p5pRT commented 11 years ago

From @tonycoz

On Sun Dec 23 00​:09​:52 2012\, bulk88 wrote​:

On Tue Dec 11 18​:19​:54 2012\, bulk88 wrote​:

This is a bug report for perl from bulk88@​hotmail.com\, generated with the help of perlbug 1.39 running under perl 5.17.7.

----------------------------------------------------------------- [Please describe your issue here]

summary​: SIPHASH on 32 bit processors without native CPU 64 bit integers (such as 32 bit x86) is atleast an order of magnitude slower than one at a time hash

See attached patch. I tried it on VC 2003 32 bit Win32 Perl\, no Quads\, and on VC 2003 32 bit Win32 Perl\, with Quads. Asm showed one_at_a_time being used on the with Quads Perl. I didn't do a make test on the Perls for this patch.

Perl no longer uses SIPHASH by default\, even for 64-bit builds\, so this patch is obsolete.

I discussed this briefly in #p5p\, and bulk88 would like the ticket kept open for now.

Tony

p5pRT commented 11 years ago

From @bulk88

On Thu Jul 25 18​:39​:34 2013\, tonyc wrote​:

Perl no longer uses SIPHASH by default\, even for 64-bit builds\, so this patch is obsolete.

I discussed this briefly in #p5p\, and bulk88 would like the ticket kept open for now.

Tony

Since this ticket is about which is the best/fastest hash algo for perl\, I did some research on the issue. I published http​://www.cpantesters.org/distro/B/Benchmark-Perl-CoreHashes.html and testing DJB2 which was usually the fastest on my machine and most of the machines on cpantesters\, I discovered perl is severely broken with it. So this ticket is now blocked by #120208 now IMO.

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 9 years ago

From @tonycoz

On Sat Oct 19 13​:18​:45 2013\, bulk88 wrote​:

On Thu Jul 25 18​:39​:34 2013\, tonyc wrote​:

Perl no longer uses SIPHASH by default\, even for 64-bit builds\, so this patch is obsolete.

I discussed this briefly in #p5p\, and bulk88 would like the ticket kept open for now.

Tony

Since this ticket is about which is the best/fastest hash algo for perl\, I did some research on the issue. I published http​://www.cpantesters.org/distro/B/Benchmark-Perl-CoreHashes.html and testing DJB2 which was usually the fastest on my machine and most of the machines on cpantesters\, I discovered perl is severely broken with it. So this ticket is now blocked by #120208 now IMO.

#120208 was fixed in 54e07e2b21cb1f58c04d67bca2a311715ba8815e.

DJB2 probably isn't suitable as a default algorithm\, since the 32-bit space of seeds is too small to search for even an exhaustive search.

Tony

toddr commented 4 years ago

Given this ticket subject is not "What's the best hashing algorithm" and this discussion has died, I'm closing this case. If we want to discuss that in github issues, I suggest a new case.