Perl / perl5

Quantifiers in (?(DEFINE)...) #16106

Open p5pRT opened 7 years ago

p5pRT commented 7 years ago

Migrated from rt.perl.org#131868 (status was 'open')

Searchable as RT131868$

p5pRT commented 7 years ago

From @abigail

Created by @abigail

Consider this pattern:

    my $pat = qr {
        (?(DEFINE)
          (?<digit>   [0-9])
          (?<digits>  (?&digit)+)
        )
        ^(?&digits)$
    }x;

This matches a string of digits, and Perl doesn't complain.

Now, let's make a small change; instead of trying to match any number of digits, let's match 4:

    my $pat = qr {
        (?(DEFINE)
          (?<digit>   [0-9])
          (?<digits>  (?&digit){4})
        )
        ^(?&digits)$
    }x;

This pattern works, but a warning is issued:

    Quantifier unexpected on zero-length expression in regex m/
        (?(DEFINE)
          (?<digit>   [0-9])
          (?<digits>  (?&digit){4})
        )
        ^(?&digits)$
    / at /tmp/bar line 15.

The same happens if we replace {4} with {4,4} or {2,4}, and the warning disappears when it's replaced with {4,}. That is, the warning appears only when the number of repeats has an upper bound. Replacing "(?&digit)" with "[0-9]" makes the warning disappear as well.

This problem is still present in blead.
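The quantifier variants described above can be compared with a short script. This is only a sketch: `compile_with` is a hypothetical helper name that does not appear in the report, and whether a given variant warns depends on the perl version being run.

```perl
use strict;
use warnings;

# Hypothetical helper: compile the report's pattern with the given
# quantifier applied to (?&digit), and return the compiled regex
# followed by any warnings raised while compiling it.
sub compile_with {
    my ($quant) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # Interpolating $quant forces the pattern to be compiled at run time,
    # so the local warning handler above is in effect during compilation.
    my $pat = qr{
        (?(DEFINE)
          (?<digit>   [0-9])
          (?<digits>  (?&digit)$quant)
        )
        ^(?&digits)$
    }x;

    return ($pat, @warnings);
}

# Report, for each variant from the bug report, whether compilation warned.
for my $quant ('+', '{4}', '{4,4}', '{2,4}', '{4,}') {
    my ($pat, @warnings) = compile_with($quant);
    printf "%-6s %s\n", $quant, @warnings ? 'warning issued' : 'no warning';
}
```

On an affected perl, the bounded forms ({4}, {4,4}, {2,4}) would be expected to report a warning while + and {4,} would not; the match results themselves should be the same either way.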

Perl Info

```
Flags:
    category=core
    severity=medium

Site configuration information for perl 5.26.0:

Configured by abigail at Wed Jun 7 23:04:17 CEST 2017.

Summary of my perl5 (revision 5 version 26 subversion 0) configuration:

  Platform:
    osname=darwin
    osvers=15.6.0
    archname=darwin-ld-2level
    uname='darwin athena 15.6.0 darwin kernel version 15.6.0: thu jun 23 18:25:34 pdt 2016; root:xnu-3248.60.10~1release_x86_64 x86_64 '
    config_args='-des -Uversiononly -Dperladmin=abigail@abigail.be -Dcf_email=abigail@abigail.be -Dmydomain=abigail.be -Dcc=gcc -Dprefix=/opt/perl/5.26.0 -Dusedevel -Dusemorebits'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=define
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='gcc'
    ccflags ='-fno-common -DPERL_DARWIN -no-cpp-precomp -mmacosx-version-min=10.11 -fno-strict-aliasing -pipe -fstack-protector-strong -I/opt/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3'
    cppflags='-no-cpp-precomp -fno-common -DPERL_DARWIN -no-cpp-precomp -mmacosx-version-min=10.11 -fno-strict-aliasing -pipe -fstack-protector-strong -I/opt/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='long double'
    nvsize=16
    Off_t='off_t'
    lseeksize=8
    alignbytes=16
    prototype=define
  Linker and Libraries:
    ld='gcc'
    ldflags =' -mmacosx-version-min=10.11 -fstack-protector-strong -L/opt/local/lib'
    libpth=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.12.sdk/usr/lib /opt/local/lib /usr/lib
    libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.11 -bundle -undefined dynamic_lookup -L/opt/local/lib -fstack-protector-strong'

@INC for perl 5.26.0:
    /Users/abigail/Perl/CPAN/Regexp-Common2/lib
    /Users/abigail/Perl/CPAN/Test-Regexp/lib
    /opt/perl/5.26.0/lib/site_perl/5.26.0/darwin-ld-2level
    /opt/perl/5.26.0/lib/site_perl/5.26.0
    /opt/perl/5.26.0/lib/5.26.0/darwin-ld-2level
    /opt/perl/5.26.0/lib/5.26.0

Environment for perl 5.26.0:
    DYLD_LIBRARY_PATH (unset)
    HOME=/Users/abigail
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/Users/abigail/Lib:/usr/local/lib:/usr/lib:/lib:/usr/X11R6/lib
    LOGDIR (unset)
    PATH=/Users/abigail/Bin:/opt/perl/bin:/opt/local/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/games:/opt/git/bin:/Users/abigail/Perl/Photos:/Users/abigail/Perl/Bin:/opt/mysql/bin:/opt/local/bin:/Users/abigail/bin
    PERL5LIB=/Users/abigail/Perl/CPAN/Regexp-Common2/lib:/Users/abigail/Perl/CPAN/Test-Regexp/lib
    PERLDIR=/opt/perl
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```

p5pRT commented 7 years ago

From @demerphq

It would be helpful to know if this is new behaviour or not. Something makes me think it's newish at least.

Yves

p5pRT commented 7 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 7 years ago

From @abigail

On Sat, Aug 19, 2017 at 02:58:23PM -0700, yves orton via RT wrote:

> It would be helpful to know if this is new behaviour or not. Something makes me think it's newish at least.

It warns for me in 5.22.0, but not in 5.20.0.

Abigail

p5pRT commented 7 years ago

From @iabyn

On Sun, Aug 20, 2017 at 10:16:16PM +0200, Abigail wrote:

> It warns for me in 5.22.0, but not in 5.20.0

Bisects to:

    a51d618a82a7057c3aabb600a7a8691d27f44a34 is the first bad commit
    commit a51d618a82a7057c3aabb600a7a8691d27f44a34
    Author: Yves Orton <demerphq@gmail.com>
    Date:   Fri Sep 19 19:57:34 2014 +0200

        rt 122283 - do not recurse into GOSUB/GOSTART when not SCF_DO_SUBSTR

        See also comments in patch. A complex regex "grammar" like that in
        RT 122283 causes perl to take literally forever, and exhaust all
        memory during the pattern optimization phase.

        Unfortunately I could not track down exactly why this occurred, but
        it was very clear that the excessive recursion was unnecessary and
        excessive. By simply eliminating the unnecessary recursion performance
        goes back to being acceptable.

        I have not thought of a good way to test this change, so this patch
        does not include any tests. Perhaps we can test it using alarm, but
        I will follow up on that later.

-- The crew of the Enterprise encounter an alien life form which is surprisingly neither humanoid nor made from pure energy.   -- Things That Never Happen in "Star Trek" #22

p5pRT commented 7 years ago

From @demerphq

On 11 September 2017 at 09:17, Dave Mitchell <davem@iabyn.com> wrote:

> Bisects to:
>
> a51d618a82a7057c3aabb600a7a8691d27f44a34 is the first bad commit
> commit a51d618a82a7057c3aabb600a7a8691d27f44a34
> Author: Yves Orton <demerphq@gmail.com>
> Date: Fri Sep 19 19:57:34 2014 +0200

Fixed in 0e3f4440d849cf8fca676f87e574164e33cf2e13

Thanks for the report. Feel like writing a test? ;-)

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 7 years ago

From @abigail

On Wed, Sep 13, 2017 at 06:05:43PM +0200, demerphq wrote:

> Fixed in 0e3f4440d849cf8fca676f87e574164e33cf2e13
>
> Thanks for the report. Feel like writing a test? ;-)

Thanks for the patch. Test added in commit c2b4244a1cd5bb89e0df552475efbb59ea37e706.

Abigail
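
The test added in that commit is not reproduced in this thread. As a rough idea of what a regression test for this warning could look like, here is a minimal sketch using Test::More. The structure and test descriptions are assumptions for illustration, not the contents of c2b4244a1cd5bb89e0df552475efbb59ea37e706.

```perl
use strict;
use warnings;
use Test::More;

my @warnings;
my $pat;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # A constant qr// would be compiled when the script itself is compiled,
    # before the handler above is installed. The string eval defers the
    # regex compilation to run time so any warning is captured.
    $pat = eval q{
        qr{
            (?(DEFINE)
              (?<digit>   [0-9])
              (?<digits>  (?&digit){4})
            )
            ^(?&digits)$
        }x
    };
}

ok(defined $pat, 'pattern with bounded quantifier compiles');
is(scalar @warnings, 0, 'no warning for {4} on a (?&NAME) call')
    or diag(@warnings);

like('1234',    $pat, 'four digits match');
unlike('123',   $pat, 'three digits do not match');
unlike('12345', $pat, 'five digits do not match');

done_testing();
```

On a perl that still has the bug, the second assertion would be expected to fail while the matching assertions still pass, which mirrors the report's observation that the pattern works despite the warning.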

p5pRT commented 7 years ago

From @demerphq

Thank you! Cheers, yves
