Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

[regex] backref problem with quantified groups #8267

Open p5pRT opened 18 years ago

p5pRT commented 18 years ago

Migrated from rt.perl.org#38133 (status was 'open')

Searchable as RT38133$

p5pRT commented 21 years ago

From edi@agharta.de

Created by edi@agharta.de

  edi@bird:~ > perl -e 'use Data::Dumper; "a" =~ /((a)*)*/; print Dumper $1, $2'
  $VAR1 = '';
  $VAR2 = undef;
  edi@bird:~ > perl -e 'use Data::Dumper; "a" =~ /(((a))*)*/; print Dumper $1, $2'
  $VAR1 = '';
  $VAR2 = 'a';

Obviously, $2 should either be undef or 'a' in _both_ cases. I think we see this due to wrong optimizations and have posted a more detailed analysis to comp.lang.perl.misc:

  <http://groups.google.com/groups?selm=87zns15gal.fsf%40bird.agharta.de&rnum=7>

Perl Info
```
Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.0:

Configured by edi at Wed Nov 13 01:41:22 CET 2002.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.4.19-gentoo-r5, archname=i686-linux-thread-multi
    uname='linux bird.agharta.de 2.4.19-gentoo-r5 #5 wed aug 7 13:06:53 cest 2002 i686 genuineintel '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='2.95.3 20010315 (release)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt -lutil
    perllibs=-lnsl -ldl -lm -lpthread -lc -lcrypt -lutil
    libc=/lib/libc-2.2.5.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.2.5'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:

@INC for perl v5.8.0:
    /usr/lib/site_perl/5.6.1
    /opt/perl-5.8/lib/5.8.0/i686-linux-thread-multi
    /opt/perl-5.8/lib/5.8.0
    /opt/perl-5.8/lib/site_perl/5.8.0/i686-linux-thread-multi
    /opt/perl-5.8/lib/site_perl/5.8.0
    /opt/perl-5.8/lib/site_perl
    .

Environment for perl v5.8.0:
    HOME=/home/edi
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/usr/local/lib:
    LOGDIR (unset)
    PATH=/usr/local/bin:/home/edi/.bin:/opt/opera/bin:/opt/scl/bin:/usr/kde/3/bin:/bin:/usr/bin:/usr/local/bin:/opt/Acrobat5:/opt/opera/bin:/opt/RealPlayer8:/usr/X11R6/bin:/opt/sun-jdk-1.4.0/bin:/opt/sun-jdk-1.4.0/jre/bin:/usr/qt/3/bin:/usr/kde/3/bin
    PERL5LIB=/usr/lib/site_perl/5.6.1
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```
p5pRT commented 21 years ago

From @andk

On 27 Nov 2002 09:18:54 -0000, "edi@agharta.de (via RT)" <perlbug@perl.org> said:

  > # New Ticket Created by edi@agharta.de
  > # Please include the string: [perl #18708]
  > # in the subject line of all future correspondence about this issue.
  > # <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=18708 >

  > This is a bug report for perl from edi@agharta.de,
  > generated with the help of perlbug 1.34 running under perl v5.8.0.

  > -----------------------------------------------------------------
  > [Please enter your report here]

  > edi@bird:~ > perl -e 'use Data::Dumper; "a" =~ /((a)*)*/; print Dumper $1, $2'
  > $VAR1 = '';
  > $VAR2 = undef;

Archaeological findings about this bug...

It was introduced to the trunk with patch 6373.

The bug was also integrated into 5.6.1 with patch 7772. (Note​: 7772 only compiles if 7799 is also integrated.)

Simply undoing the regexec.c part of that patch fixes the bug but also breaks test 860 in the test suite​:

  not ok 860 () ^(a(b)?)+$:aba:y:-$1-$2-:-a-- => `-a-b-', match=1

The patch I tried was​:

#### DO NOT APPLY ####

Inline Patch
```diff
--- perl-5.8.0@18217/regexec.c Fri Nov 29 21:38:04 2002
+++ perl-5.8.0@18217-ak/regexec.c Sun Dec 1 18:31:08 2002
@@ -293,8 +293,6 @@
             PL_regstartp[paren] = HOPc(input, -1) - PL_bostr; \
             PL_regendp[paren] = input - PL_bostr; \
         } \
-        else \
-            PL_regendp[paren] = -1; \
     } \
     if (regmatch(next)) \
         sayYES; \
```

Hope that helps somebody else to find a solution,
--
andreas

p5pRT commented 21 years ago

From @hvds

andreas.koenig@anima.de (Andreas J. Koenig) wrote:
:>>>>> On 27 Nov 2002 09:18:54 -0000, "edi@agharta.de (via RT)" <perlbug@perl.org> said:
:
:  > edi@bird:~ > perl -e 'use Data::Dumper; "a" =~ /((a)*)*/; print Dumper $1, $2'
:  > $VAR1 = '';
:  > $VAR2 = undef;
:
:Archaeological findings about this bug...
:
:It was introduced to the trunk with patch 6373.
:
:The bug was also integrated into 5.6.1 with patch 7772. (Note: 7772
:only compiles if 7799 is also integrated.)

Digging a bit further, the actual patch was submitted (by me) in the discussion on bug #20000701.002:

  http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-07/msg00514.html

The difference between the two test cases is that for /((a)*)*/, the inner paren gets optimised to CURLYN; for /(((a))*)*/ it stays as CURLYX. I suspect that there is something lacking from that patch for the CURLYN branch, but I haven't yet got a fix.

Hugo
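
For anyone wanting to check a given perl against the two test cases discussed above, the comparison can be sketched roughly as follows. This is only a minimal sketch: `captures` and `show` are hypothetical helper names introduced here, not anything from the original report, and what gets printed depends on the perl version in use, so no expected output is given.

```perl
use strict;
use warnings;

# Hypothetical helper: match $string against $pattern and return the
# numbered captures ($1, $2, ...) as a list. Returns an empty list when
# the match fails. Intended for patterns that contain capture groups.
sub captures {
    my ($pattern, $string) = @_;
    my @groups = $string =~ /$pattern/;
    return @groups;
}

# Hypothetical helper: render a capture so that undef and '' are
# distinguishable in the output.
sub show {
    my ($value) = @_;
    return defined $value ? "'$value'" : 'undef';
}

# The two patterns from the original report, matched against "a".
for my $pattern ('((a)*)*', '(((a))*)*') {
    my @groups = captures($pattern, 'a');
    print "$pattern => \$1 = ", show($groups[0]),
          ", \$2 = ", show($groups[1]), "\n";
}
```

If the two lines disagree on `$2`, the perl being tested still shows the inconsistency described in this ticket.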

p5pRT commented 18 years ago

From eric.niebler@gmail.com

Created by eric.niebler@gmail.com

This is a bug report for perl from eric.niebler@gmail.com, generated with the help of perlbug 1.35 running under perl v5.8.7.

-----------------------------------------------------------------

Consider the following program:

  $str = 'aaA';
  $str =~ /(((?:a))?)+/i;
  if(defined($2)) { print "$2"; }
  else { print "not defined"; }

This prints "not defined," and I think that's right. But if I change the regex to /(((a))?)+/i (that is, if I change the third group from non-capturing to capturing), the program prints "A".

I can't think of a reason why changing group 3 from non-capturing to capturing should have any effect on whether group 2 captures anything. Seems like a regex bug to me.

Perl Info
```
Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.7:

Configured by builder at Wed Nov 2 08:44:18 2005.

Summary of my perl5 (revision 5 version 8 subversion 7) configuration:
  Platform:
    osname=MSWin32, osvers=5.0, archname=MSWin32-x86-multi-thread
    uname=''
    config_args='undef'
    hint=recommended, useposix=true, d_sigaction=undef
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cl', ccflags ='-nologo -Gf -W3 -MD -Zi -DNDEBUG -O1 -DWIN32 -D_CONSOLE -DNO_STRICT -DHAVE_DES_FCRYPT -DNO_HASH_SEED -DUSE_SITECUSTOMIZE -DPERL_IMPLICIT_CONTEXT -DPERL_IMPLICIT_SYS -DUSE_PERLIO -DPERL_MSVCRT_READFIX',
    optimize='-MD -Zi -DNDEBUG -O1',
    cppflags='-DWIN32'
    ccversion='12.00.8804', gccversion='', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=undef, longlongsize=8, d_longdbl=define, longdblsize=10
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='__int64', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='link', ldflags ='-nologo -nodefaultlib -debug -opt:ref,icf -libpath:"C:\Perl\lib\CORE" -machine:x86'
    libpth=\lib
    libs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
    perllibs= oldnames.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib netapi32.lib uuid.lib ws2_32.lib mpr.lib winmm.lib version.lib odbc32.lib odbccp32.lib msvcrt.lib
    libc=msvcrt.lib, so=dll, useshrplib=yes, libperl=perl58.lib
    gnulibc_version='undef'
  Dynamic Linking:
    dlsrc=dl_win32.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags='-dll -nologo -nodefaultlib -debug -opt:ref,icf -libpath:"C:\Perl\lib\CORE" -machine:x86'

Locally applied patches:
    ACTIVEPERL_LOCAL_PATCHES_ENTRY
    Iin_load_module moved for compatibility with build 806
    Avoid signal flag SA_RESTART for older versions of HP-UX
    PerlEx support in CGI::Carp
    Less verbose ExtUtils::Install and Pod::Find
    instmodsh upgraded from ExtUtils-MakeMaker-6.25
    Patch for CAN-2005-0448 from Debian with modifications
    Upgrade to Time-HiRes-1.76
    25774 Keys of %INC always use forward slashes
    25747 Accidental interpolation of $@ in Pod::Html
    25362 File::Path::mkpath resets errno
    25181 Incorrect (X)HTML generated by Pod::Html
    24999 Avoid redefinition warning for MinGW
    24699 ICMP_UNREACHABLE handling in Net::Ping
    21540 Fix backward-compatibility issues in if.pm

@INC for perl v5.8.7:
    C:/Perl/lib
    C:/Perl/site/lib
    .

Environment for perl v5.8.7:
    HOME=C:\DOCUME~1\\ericne
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\IDE;C:\Program Files\Microsoft Visual Studio .NET 2003\VC7\BIN;C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Tools;C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Tools\bin\prerelease;C:\Program Files\Microsoft Visual Studio .NET 2003\Common7\Tools\bin;C:\Program Files\Microsoft Visual Studio .NET 2003\SDK\v1.1\bin;C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322;C:\Program Files\libxml;C:\Perl\bin\;C:\Program Files\Windows Resource Kits\Tools\;C:\Python23\.;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\Program Files\Perforce;C:\Program Files\doxygen\bin;C:\Program Files\Debugging Tools for Windows;C:\WINDOWS\idw;C:\cygwin\home\ericne\boost\cvs\boost\tools\build\jam_src\bin.ntx86
    PERL_BADLANG (unset)
    SHELL (unset)
```
p5pRT commented 18 years ago

From @hvds

Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
:Consider the following program:
:
:  $str = 'aaA';
:  $str =~ /(((?:a))?)+/i;
:  if(defined($2)) { print "$2"; }
:  else { print "not defined"; }
:
:This prints "not defined," and I think that's right.
:But if I change the regex to /(((a))?)+/i (that is, if
:I change the third group from non-capturing to capturing),
:the program prints "A".
:
:I can't think of a reason why changing group 3 from
:non-capturing to capturing should have any effect on
:whether group 2 captures anything. Seems like a regex
:bug to me.

I agree the inconsistency smells like a bug, though it isn't clear to me which variant exhibits it - both results seem reasonable in the absence of the other.

-Dr output shows that the two regexps are optimised differently: with /(((?:a))?)+/, the $2 loop is optimised to CURLYN, but with /(((a))?)+/ the interior is too complex for the optimisation to occur (which may itself be, if not a bug, an optimisation wart) so it remains as CURLYX. Presumably it is in the differing implementation of CURLYN and CURLYX that the difference arises, but this isn't something I have time to look into right now.

The results may be reasonable nonetheless - we could in principle stick with "the $<digit> variables will contain the last thing successfully matched", while adding that "optional zero-length submatches (that don't affect success or failure of the match as a whole) may be elided by the optimiser". Which I suspect is what we're getting, even though the evidence is that the less optimised variant is the one doing the eliding.

Hugo
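
The -Dr switch mentioned above needs a perl built with -DDEBUGGING. On other builds, a similar dump of the compiled regex program can usually be obtained with the `re` pragma, roughly as sketched below. The variable names are purely illustrative, and which opcodes appear (CURLYN, CURLYX or something else) depends on the perl version, so nothing is claimed here about the output.

```perl
use strict;
use warnings;

# Both patterns are taken from the report above; the variable names are
# only illustrative.
my ($with_noncapturing, $with_capturing);

{
    # The pragma is lexically scoped, so only the patterns compiled in
    # this block are traced. The dump is written to STDERR.
    use re 'debug';

    $with_noncapturing = qr/(((?:a))?)+/i;
    $with_capturing    = qr/(((a))?)+/i;
}
```

Comparing the two dumps side by side shows whether the inner quantified group was compiled to the same kind of node in both patterns.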

p5pRT commented 18 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 18 years ago

From @abigail

On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
> Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
> :Consider the following program:
> :
> :  $str = 'aaA';
> :  $str =~ /(((?:a))?)+/i;
> :  if(defined($2)) { print "$2"; }
> :  else { print "not defined"; }
> :
> :This prints "not defined," and I think that's right.
> :But if I change the regex to /(((a))?)+/i (that is, if
> :I change the third group from non-capturing to capturing),
> :the program prints "A".
> :
> :I can't think of a reason why changing group 3 from
> :non-capturing to capturing should have any effect on
> :whether group 2 captures anything. Seems like a regex
> :bug to me.
>
> I agree the inconsistency smells like a bug, though it isn't clear to me which variant exhibits it - both results seem reasonable in the absence of the other.
>
> -Dr output shows that the two regexps are optimised differently: with /(((?:a))?)+/, the $2 loop is optimised to CURLYN, but with /(((a))?)+/ the interior is too complex for the optimisation to occur (which may itself be, if not a bug, an optimisation wart) so it remains as CURLYX. Presumably it is in the differing implementation of CURLYN and CURLYX that the difference arises, but this isn't something I have time to look into right now.
>
> The results may be reasonable nonetheless - we could in principle stick with "the $<digit> variables will contain the last thing successfully matched", while adding that "optional zero-length submatches (that don't affect success or failure of the match as a whole) may be elided by the optimiser". Which I suspect is what we're getting, even though the evidence is that the less optimised variant is the one doing the eliding.

The following program suggests that in both regexes, the outer set of parentheses is matched four times:

  #!/usr/bin/perl

  use strict;
  use warnings;
  no warnings 'syntax';

  $_ = 'aaA';
  my ($i, $j);
  /(((?:a))?(?{ $i ++; print "$i: $2\n" }))+/i;
  /(((a))?(?{ $j ++; print "$j: $2\n" }))+/i;

  __END__

  1: a
  2: a
  3: A
  Use of uninitialized value in concatenation (.) or string at (re_eval 1) line 1.
  4:
  1: a
  2: a
  3: A
  4: A

Abigail

p5pRT commented 18 years ago

From @ysth

On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
> Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
> :Consider the following program:
> :
> :  $str = 'aaA';
> :  $str =~ /(((?:a))?)+/i;
> :  if(defined($2)) { print "$2"; }
> :  else { print "not defined"; }
> :
> :This prints "not defined," and I think that's right.
> :But if I change the regex to /(((a))?)+/i (that is, if
> :I change the third group from non-capturing to capturing),
> :the program prints "A".
> :
> :I can't think of a reason why changing group 3 from
> :non-capturing to capturing should have any effect on
> :whether group 2 captures anything. Seems like a regex
> :bug to me.
>
> I agree the inconsistency smells like a bug, though it isn't clear to me which variant exhibits it - both results seem reasonable in the absence of the other.

To me, it seems clear that on the last iteration of the +, the ? should match zero times, so $2 would be "" with the ?: and undefined without the ?:.

p5pRT commented 18 years ago

From @abigail

On Tue, Jan 03, 2006 at 03:59:19PM -0800, Yitzchak Scott-Thoennes wrote:
> On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
> > Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
> > :Consider the following program:
> > :
> > :  $str = 'aaA';
> > :  $str =~ /(((?:a))?)+/i;
> > :  if(defined($2)) { print "$2"; }
> > :  else { print "not defined"; }
> > :
> > :This prints "not defined," and I think that's right.
> > :But if I change the regex to /(((a))?)+/i (that is, if
> > :I change the third group from non-capturing to capturing),
> > :the program prints "A".
> > :
> > :I can't think of a reason why changing group 3 from
> > :non-capturing to capturing should have any effect on
> > :whether group 2 captures anything. Seems like a regex
> > :bug to me.
> >
> > I agree the inconsistency smells like a bug, though it isn't clear to me which variant exhibits it - both results seem reasonable in the absence of the other.
>
> To me, it seems clear that on the last iteration of the +, the ? should match zero times, so $2 would be "" with the ?: and undefined without the ?:.

That I don't understand. Since the ?: controls whether or not there's a $3, why should the value of $2 be different?

Abigail

p5pRT commented 18 years ago

From @ysth

On Wed, Jan 04, 2006 at 09:48:14AM +0100, Abigail wrote:
> On Tue, Jan 03, 2006 at 03:59:19PM -0800, Yitzchak Scott-Thoennes wrote:
> > On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
> > > Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
> > > :Consider the following program:
> > > :
> > > :  $str = 'aaA';
> > > :  $str =~ /(((?:a))?)+/i;
> > > :  if(defined($2)) { print "$2"; }
> > > :  else { print "not defined"; }
> > > :
> > > :This prints "not defined," and I think that's right.
> > > :But if I change the regex to /(((a))?)+/i (that is, if
> > > :I change the third group from non-capturing to capturing),
> > > :the program prints "A".
> > > :
> > > :I can't think of a reason why changing group 3 from
> > > :non-capturing to capturing should have any effect on
> > > :whether group 2 captures anything. Seems like a regex
> > > :bug to me.
> > >
> > > I agree the inconsistency smells like a bug, though it isn't clear to me which variant exhibits it - both results seem reasonable in the absence of the other.
> >
> > To me, it seems clear that on the last iteration of the +, the ? should match zero times, so $2 would be "" with the ?: and undefined without the ?:.
>
> That I don't understand. Since the ?: controls whether or not there's a $3, why should the value of $2 be different?

Sorry, I was somehow assigning numbers from the inside out instead of left to right. It should be undef in either case.

p5pRT commented 18 years ago

From eric.niebler@gmail.com

Yitzchak Scott-Thoennes wrote:
> On Wed, Jan 04, 2006 at 09:48:14AM +0100, Abigail wrote:
> > On Tue, Jan 03, 2006 at 03:59:19PM -0800, Yitzchak Scott-Thoennes wrote:
> > > On Tue, Jan 03, 2006 at 04:27:40AM +0000, hv@crypt.org wrote:
> > > > Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
> > > > :Consider the following program:
> > > > :
> > > > :  $str = 'aaA';
> > > > :  $str =~ /(((?:a))?)+/i;
> > > > :  if(defined($2)) { print "$2"; }
> > > > :  else { print "not defined"; }
> > > > :
> > > > :This prints "not defined," and I think that's right.
> > > > :But if I change the regex to /(((a))?)+/i (that is, if
> > > > :I change the third group from non-capturing to capturing),
> > > > :the program prints "A".
> > > > :
> > > > :I can't think of a reason why changing group 3 from
> > > > :non-capturing to capturing should have any effect on
> > > > :whether group 2 captures anything. Seems like a regex
> > > > :bug to me.
> > > >
> > > > I agree the inconsistency smells like a bug, though it isn't clear to me which variant exhibits it - both results seem reasonable in the absence of the other.
> > >
> > > To me, it seems clear that on the last iteration of the +, the ? should match zero times, so $2 would be "" with the ?: and undefined without the ?:.
> >
> > That I don't understand. Since the ?: controls whether or not there's a $3, why should the value of $2 be different?
>
> Sorry, I was somehow assigning numbers from the inside out instead of left to right. It should be undef in either case.

There appears to be general agreement that this is a bug. But will it get fixed? What happens next? (Sorry, I'm not familiar with this process.)

Eric

p5pRT commented 18 years ago

From @hvds

Eric Niebler <eric.niebler@gmail.com> wrote:
[...]
:>>>>Eric Niebler (via RT) <perlbug-followup@perl.org> wrote:
:>>>>:Consider the following program:
:>>>>:
:>>>>:  $str = 'aaA';
:>>>>:  $str =~ /(((?:a))?)+/i;
:>>>>:  if(defined($2)) { print "$2"; }
:>>>>:  else { print "not defined"; }
:>>>>:
:>>>>:This prints "not defined," and I think that's right.
:>>>>:But if I change the regex to /(((a))?)+/i (that is, if
:>>>>:I change the third group from non-capturing to capturing),
:>>>>:the program prints "A".
[...]
:There appears to be general agreement that this is a bug. But will it
:get fixed? What happens next? (Sorry, I'm not familiar with this process.)

Now it waits until someone simultaneously acquires the time, ability and desire to locate the bug; once located, it may be found to be anything from easy to impossible to develop a fix that doesn't break anything else.

If a fix is developed it will go into the "bleeding edge" codebase first, which is the one working towards v5.10 of perl; if it is stable there and does not appear to have a wider impact it will likely also be incorporated into the maintenance track used to deliver v5.8.x releases.

But there are few people with the knowledge to debug problems in the regexp engine, and they tend to have limited time available, so the first step may take a while.

Hugo

p5pRT commented 17 years ago

From @cpansprout

  perl -MData::Dumper -le '"aba" =~ /^(a(b)?)+$/; print Dumper $1, $2;'
  $VAR1 = 'a';
  $VAR2 = undef;

This is the case because the outer + makes the subexpression
containing the second pair of capturing parentheses match twice. The
second time through, (b) does not participate in the match, so $2 is
undef (this coincides with ECMAScript's behaviour).

But if I change (b) to (b+) or ((b)), the behaviour changes:

  perl -MData::Dumper -le '"aba" =~ /^(a(b+)?)+$/; print Dumper $1, $2;'
  $VAR1 = 'a';
  $VAR2 = 'b';

  perl -MData::Dumper -le '"aba" =~ /^(a((b))?)+$/; print Dumper $1, $2;'
  $VAR1 = 'a';
  $VAR2 = 'b';

(Though this probably makes no difference, if this is to be made
consistent, I think I prefer the former behaviour [!defined $2]).

This is the case both with 5.8.8 and 5.9.5 #31441.
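
Whether a group "participated" in the final match can also be checked without printing the captures, by looking at the `@-` offsets array, whose entry for a group is undef when that group did not take part. The following is only a sketch under that assumption; `participation` is a hypothetical helper name, and the flags it reports for the three patterns above depend on the perl version, so no expected output is shown.

```perl
use strict;
use warnings;

# Hypothetical helper: for each capture group of $pattern, return 1 if
# the group is set after matching $string and 0 if it is not.
# Returns an empty list when the match fails.
sub participation {
    my ($pattern, $string) = @_;
    return unless $string =~ /$pattern/;
    # $#+ is the number of capture groups in the last successful match;
    # $-[N] is the start offset of group N, or undef if it is unset.
    return map { defined $-[$_] ? 1 : 0 } 1 .. $#+;
}

# The three variants from this comment, matched against "aba".
for my $pattern ('^(a(b)?)+$', '^(a(b+)?)+$', '^(a((b))?)+$') {
    my @flags = participation($pattern, 'aba');
    print "$pattern => groups set: @flags\n";
}
```

A perl that treats the three variants consistently would report the same flag for group 2 on every line.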

  $s = "Juusstt aannootthheerr Peerrll hhaacckkeerr,\n";
  $s =~ s/(?:((?<!$_)$_)?){2}(?:((?<!$_$_)$_+)?){2}/$1$2/g for 'a' .. 'z';
  print $s;


Flags:
    category=
    severity=


Site configuration information for perl v5.8.8:

Configured by neo at Tue Jan 9 16:06:53 PST 2007.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=darwin, osvers=8.8.0, archname=darwin-thread-multi-2level
    uname='darwin treebeard.local 8.8.0 darwin kernel version 8.8.0: fri sep 8 17:18:57 pdt 2006; root:xnu-792.12.6.obj~1release_ppc power macintosh powerpc '
    config_args=''
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include',
    optimize='-O3',
    cppflags='-no-cpp-precomp -g -pipe -fno-common -DPERL_DARWIN -no-cpp-precomp -fno-strict-aliasing -I/usr/local/include'
    ccversion='', gccversion='4.0.0 20041026 (Apple Computer, Inc. build 4061)', gccosandvers='darwin8'
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='env MACOSX_DEPLOYMENT_TARGET=10.3 cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib
    libs=-ldbm -ldl -lm -lc
    perllibs=-ldl -lm -lc
    libc=, so=dylib, useshrplib=false, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=bundle, d_dlsymun=undef, ccdlflags=' '
    cccdlflags=' ', lddlflags=' -bundle -undefined dynamic_lookup -L/usr/local/lib'

Locally applied patches:


@INC for perl v5.8.8:
    /usr/local/lib/perl5/5.8.8/darwin-thread-multi-2level
    /usr/local/lib/perl5/5.8.8
    /usr/local/lib/perl5/site_perl/5.8.8/darwin-thread-multi-2level
    /usr/local/lib/perl5/site_perl/5.8.8
    /usr/local/lib/perl5/site_perl
    /System/Library/Perl/5.8.6/darwin-thread-multi-2level
    /System/Library/Perl/5.8.6/darwin-thread-multi-2level
    /System/Library/Perl/5.8.6
    /Library/Perl/5.8.6/darwin-thread-multi-2level
    /Library/Perl/5.8.6/darwin-thread-multi-2level
    /Library/Perl/5.8.6
    /Library/Perl
    /Network/Library/Perl/5.8.6/darwin-thread-multi-2level
    /Network/Library/Perl/5.8.6
    /Network/Library/Perl
    /System/Library/Perl/Extras/5.8.6/darwin-thread-multi-2level
    /System/Library/Perl/Extras/5.8.6/darwin-thread-multi-2level
    /System/Library/Perl/Extras/5.8.6
    /Library/Perl/5.8.1
    .


Environment for perl v5.8.8:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/neo
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/TeX/bin/powerpc-darwin6.8:/usr/local/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash

p5pRT commented 16 years ago

From p5p@spam.wizbit.be

Attached is a patch with a todo test for this bug report.

Summary of the report​:

  #!/usr/bin/perl -l

  if ("A" =~ /(((?:A))?)+/) {
    print "\$1 = $1, \$2 = $2, \$3 = $3"
  }

  if ("A" =~ /(((A))?)+/) {
    print "\$1 = $1, \$2 = $2, \$3 = $3";
  }
  __END__

Output:

  $1 = , $2 = , $3 =
  $1 = , $2 = A, $3 = A

The value of the second capture group depends on whether or not there is a third capturing group.

The value should be the same in both cases.

(For more info look at RT)

p5pRT commented 16 years ago

From p5p@spam.wizbit.be

Inline Patch
```diff
--- old/t/op/pat.t 2008-05-24 23:15:39.000000000 +0200
+++ new/t/op/pat.t 2008-05-24 23:16:15.000000000 +0200
@@ -4642,6 +4642,17 @@
     iseq( join('', @isPunctLatin1), '',
         'IsPunct agrees with [:punct:] with explicit Latin1');
 }
+{
+    local $TODO = "[perl #38133]";
+
+    "A" =~ /(((?:A))?)+/;
+    my $first = $2;
+
+    "A" =~ /(((A))?)+/;
+    my $second = $2;
+
+    iseq($first, $second);
+}
 # Test counter is at bottom of file. Put new tests above here.
@@ -4705,7 +4716,7 @@
 # Don't forget to update this!
 BEGIN {
-    $::TestCount = 4035;
+    $::TestCount = 4036;
     print "1..$::TestCount\n";
 }
```
p5pRT commented 13 years ago

From @khwilliamson

Commit 72aa120d9a32a14196c9e39aa26993909423f096 adds the attached todo .t patch to re/pat.t

--Karl Williamson

demerphq commented 1 year ago

see also https://github.com/Perl/perl5/issues/19615

demerphq commented 1 year ago

This is fixed in #20677