Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

Win32 /regexp/i performance degradation in fork'd code #9652

Open p5pRT opened 15 years ago

p5pRT commented 15 years ago

Migrated from rt.perl.org#63284 (status was 'open')

Searchable as RT63284$

p5pRT commented 15 years ago

From alex.davies@talktalk.net

Created by alex@Amelie

This is on Win32...

I noticed a slowdown in the performance of C< /pat/i > case-insensitive regular expressions when used in fork'd code. Interestingly, if the regexp was compiled within the fork'd code, it ran at the expected rate. NB: the same results were obtained in either case; only the time taken was different.

The following testcase shows a difference in behaviour which I suspect is the cause of the difference in time taken:

# %<

use re qw(Debug All);

my $pid = fork;
defined $pid or die;

if ($pid == 0) {
    $_ = "abcNEEDLE123\n";

    print STDERR "\n## 1 ##\n\n";
    /needle/i;

    print STDERR "\n## 2 ##\n\n";
    eval q{ /needle/i };

    exit;
}

while (wait() != -1) {}

$_ = "abcNEEDLE123\n";

print STDERR "\n## 3 ##\n\n";
/needle/i;

# >%

And here is the output:

# %<

Compiling REx "needle"
Final program:
   1: EXACTF <needle> (4)
   4: END (0)
stclass EXACTF <needle> minlen 6
Compiling REx "needle"
Final program:
   1: EXACTF <needle> (4)
   4: END (0)
stclass EXACTF <needle> minlen 6

## 1 ##

Matching REx "needle" against "abcNEEDLE123%n"
   0 <> \ | 1:EXACTF <needle>(4)
       failed...
   1 \ \ | 1:EXACTF <needle>(4)
       failed...
   2 \ \ | 1:EXACTF <needle>(4)
       failed...
   3 \ \ | 1:EXACTF <needle>(4)
   9 \ <123%n> | 4:END(0)
Match successful!

## 2 ##

Compiling REx "needle"
Final program:
   1: EXACTF <needle> (4)
   4: END (0)
stclass EXACTF <needle> minlen 6
Matching REx "needle" against "abcNEEDLE123%n"
Matching stclass EXACTF <needle> against "abcNEEDLE123%n" (13 chars)
   3 \ \ | 1:EXACTF <needle>(4)
   9 \ <123%n> | 4:END(0)
Match successful!
Freeing REx: "needle"

## 3 ##

Matching REx "needle" against "abcNEEDLE123%n"
Matching stclass EXACTF <needle> against "abcNEEDLE123%n" (13 chars)
   3 \ \ | 1:EXACTF <needle>(4)
   9 \ <123%n> | 4:END(0)
Match successful!

# >%

So it appears the /regexp/i is 'run' differently within the child thread if it was compiled in the main thread.

Additionally, running the testcase on my Strawberry 5.10 perl also gave some assert warnings:

Assertion ((svtype)((_svi)->sv_flags & 0xff)) >= SVt_PV failed: file "re_exec.c", line 2561

Is this a bug, or simply a necessary change in the code path taken to get /regexp/i to work in a threaded environment?

Thanks for taking a look.

Cheers, alex.

Perl Info

```
Flags:
    category=core
    severity=low

This perlbug was built using Perl 5.10.0 - Mon Aug 11 04:41:10 2008
It is being executed now by Perl 5.10.0 - Mon Dec 1 16:53:12 2008.

Site configuration information for perl 5.10.0:

Configured by alex at Mon Dec 1 16:53:12 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=MSWin32, osvers=5.1, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -GF -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX',
    optimize='-MD -Zi -DNDEBUG -O1',
    cppflags='-DWIN32'
    ccversion='12.00.8804', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -libpath:"c:\perl\lib\CORE" -machine:x86'
    libpth=C:\PROGRA~1\MICROS~4\VC98\lib
    libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
    perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
    libc=msvcrt.lib, so=dll, useshrplib=true, libperl=perl510.lib
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', ddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:"c:\perl\lib\CORE" -machine:x86'

Locally applied patches:

@INC for perl 5.10.0:
    C:/alex/src/perl/perl-5.10.0.tar/perl-5.10.0/lib
    .

Environment for perl 5.10.0:
    HOME=C:\alex
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=C:\PROGRA~1\MICROS~4\Common\msdev98\BIN;C:\PROGRA~1\MICROS~4\VC98\BIN;C:\PROGRA~1\MICROS~4\Common\TOOLS\WINNT;C:\PROGRA~1\MICROS~4\Common\TOOLS;C:\Program Files\Ruby-185-21\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\alex\bin;C:\Program Files\Microsoft SDK\Bin\.;C:\Program Files\Microsoft SDK\Bin\WinNT\.;C:\Program Files\Tcl-8.5.0\bin;C:\PerlAPPv9\bin;C:\Program Files\QuickTime\QTSystem\;C:\strawberry\c\bin;C:\strawberry\perl\bin;C:\Program Files\Microsoft Visual Studio\VC98\Bin;C:\Program Files\Microsoft SDK\Bin\.;C:\Program Files\Microsoft SDK\Bin\WinNT;C:\cygwin\bin
    PERL_BADLANG (unset)
    SHELL (unset)
```

p5pRT commented 15 years ago

From @iabyn

On Tue, Feb 17, 2009 at 10:35:44AM -0800, alex.davies@talktalk.net (via RT) wrote:

> I noticed a slowdown in performance of C< /pat/i > case insensitive regular expressions when used in fork code.

I can confirm this is reproducible in blead using Linux and threads:

  use threads;
  sub child {
      $_ = "abcNEEDLE123\n";

      use re qw(Debug All);
      print STDERR "\n## 1 ##\n\n";
      /needle/i;

      print STDERR "\n## 2 ##\n\n";
      eval q{ /needle/i };
  }
  threads->new(\&child)->join;

which outputs:

  Compiling REx "needle"
  Final program:
     1: EXACTF <needle> (4)
     4: END (0)
  stclass EXACTF <needle> minlen 6

  ## 1 ##

  Matching REx "needle" against "abcNEEDLE123%n"
     0 <> \ | 1:EXACTF <needle>(4)
         failed...
     1 \ \ | 1:EXACTF <needle>(4)
         failed...
     2 \ \ | 1:EXACTF <needle>(4)
         failed...
     3 \ \ | 1:EXACTF <needle>(4)
     9 \ <123%n> | 4:END(0)
  Match successful!

  ## 2 ##

  Compiling REx "needle"
  Final program:
     1: EXACTF <needle> (4)
     4: END (0)
  stclass EXACTF <needle> minlen 6
  Matching REx "needle" against "abcNEEDLE123%n"
  Matching stclass EXACTF <needle> against "abcNEEDLE123%n" (13 chars)
     3 \ \ | 1:EXACTF <needle>(4)
     9 \ <123%n> | 4:END(0)
  Match successful!

I see the same behaviour under 5.8.8, so this isn't a 5.10.0 regression. (So I'll let someone else worry about fixing it!)
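To put numbers on the slowdown rather than reading it off the debug trace, something like the following could be used. It is only a rough sketch: the haystack size, the repetition count and the helper names (`time_precompiled`, `time_recompiled`, `report`) are made up for illustration, and the thread branch only runs on a perl built with ithreads. It times the same `/needle/i` match twice, once with a pattern compiled before any thread exists and once with a pattern compiled via `eval` by the caller, first in the parent and then inside a thread.

```perl
use strict;
use warnings;
use Config;
use Time::HiRes qw(time);

# Hypothetical test data: the needle sits at the very end, so a scan that
# attempts a match at every offset has many positions to try first.
my $haystack = ('abc' x 100_000) . "NEEDLE123\n";
my $reps     = 20;

# Match with a pattern that was compiled when this file was compiled,
# i.e. before any thread was created.
sub time_precompiled {
    my $t0   = time;
    my $hits = 0;
    for (1 .. $reps) { $hits++ if $haystack =~ /needle/i }
    return ($hits, time - $t0);
}

# Match with the same pattern, but compiled by whoever calls eval
# (so inside the thread when called from the thread).
sub time_recompiled {
    my $code = eval q{ sub { $_[0] =~ /needle/i } } or die $@;
    my $t0   = time;
    my $hits = 0;
    for (1 .. $reps) { $hits++ if $code->($haystack) }
    return ($hits, time - $t0);
}

sub report {
    my ($label) = @_;
    my ($h1, $t1) = time_precompiled();
    my ($h2, $t2) = time_recompiled();
    printf "%-7s precompiled: %d hits in %.4fs, recompiled: %d hits in %.4fs\n",
        $label, $h1, $t1, $h2, $t2;
}

report('parent');

# Assumption: ithreads are available; otherwise only the parent is timed.
if ($Config{useithreads}) {
    require threads;
    threads->create(sub { report('thread') })->join;
}
```

If the behaviour described above is present, the "precompiled" time in the thread line should stand out against the other three figures; the hit counts should be identical everywhere.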

-- "But Sidley Park is already a picture, and a most amiable picture too. The slopes are green and gentle. The trees are companionably grouped at intervals that show them to advantage. The rill is a serpentine ribbon unwound from the lake peaceably contained by meadows on which the right amount of sheep are tastefully arranged." -- Lady Croom, "Arcadia"

p5pRT commented 15 years ago

The RT System itself - Status changed from 'new' to 'open'

khwilliamson commented 2 years ago

This persists in 5.35.10. I suspect that one of the cases doesn't use Boyer-Moore.
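For anyone reading the traces earlier in this ticket, here is a toy model of the difference they show. It is not Boyer-Moore and it is not how the regex engine is implemented; the two subs (`naive_scan`, `prefiltered_scan`) are hypothetical and only count how many full comparisons each strategy makes. The first attempts a comparison at every offset, as in the slow trace; the second first skips to offsets whose first character could start the needle, loosely like the traces that report "Matching stclass".

```perl
use strict;
use warnings;

# Attempt a full case-insensitive comparison at every offset.
sub naive_scan {
    my ($hay, $needle) = @_;
    my $attempts = 0;
    for my $pos (0 .. length($hay) - length($needle)) {
        $attempts++;
        return ($pos, $attempts)
            if lc substr($hay, $pos, length $needle) eq lc $needle;
    }
    return (-1, $attempts);
}

# Skip offsets whose first character cannot start the needle, and only
# attempt the full comparison at the remaining candidates.
sub prefiltered_scan {
    my ($hay, $needle) = @_;
    my $first    = lc substr($needle, 0, 1);
    my $attempts = 0;
    for my $pos (0 .. length($hay) - length($needle)) {
        next unless lc substr($hay, $pos, 1) eq $first;
        $attempts++;
        return ($pos, $attempts)
            if lc substr($hay, $pos, length $needle) eq lc $needle;
    }
    return (-1, $attempts);
}

my $hay = "abcNEEDLE123\n";
printf "naive:       found at %d after %d attempts\n", naive_scan($hay, 'needle');
printf "prefiltered: found at %d after %d attempts\n", prefiltered_scan($hay, 'needle');
```

On the string from the testcase the naive scan needs four attempts (offsets 0 to 3) and the prefiltered scan needs one (offset 3), which matches the number of EXACTF steps in the slow and fast traces respectively.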