Perl / perl5

đŸȘ The Perl programming language
https://dev.perl.org/perl5/
Other
1.94k stars 554 forks source link

Missing warning for /[[:digit]]/ #5305

Closed p5pRT closed 8 years ago

p5pRT commented 22 years ago

Migrated from rt.perl.org#8904 (status was 'resolved')

Searchable as RT8904$

p5pRT commented 22 years ago

From @abigail

Created by @abigail

  $ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/'   $

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/. Since this is highly unlikely the programmer really wants this\, I think this should generate at least a warning.

Abigail

Perl Info ``` Flags: category=core severity=low Site configuration information for perl v5.7.3: Configured by abigail at Tue Apr 2 20:17:38 CEST 2002. Summary of my perl5 (revision 5.0 version 7 subversion 3 patch 15679) configuration: Platform: osname=linux, osvers=2.4.5, archname=i686-linux-64int-ld uname='linux hermione 2.4.5 #6 fri jun 22 01:38:20 pdt 2001 i686 unknown ' config_args='-des -Dusedevel -Dprefix=/opt/bleadperl -Dusemorebits -Doptimize=-g -Duse64bitall -Uversiononly -Dmydomain=.foad.org -Dcf_email=abigail@foad.org -Dperladmin=abigail@foad.org' hint=recommended, useposix=true, d_sigaction=define usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=undef uselongdouble=define usemymalloc=n, bincompat5005=define Compiler: cc='cc', ccflags ='-DDEBUGGING -fno-strict-aliasing -I/usr/local/include -I/opt/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-g', cppflags='-DDEBUGGING -fno-strict-aliasing -I/usr/local/include -I/opt/local/include' ccversion='', gccversion='2.95.3 20010315 (release)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib -L/opt/local/lib' libpth=/usr/local/lib /opt/local/lib /lib /usr/lib libs=-lnsl -lndbm -lgdbm -ldl -lm -lc -lcrypt -lutil perllibs=-lnsl -ldl -lm -lc -lcrypt -lutil libc=/lib/libc-2.2.3.so, so=so, useshrplib=false, libperl=libperl.a Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic' cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib -L/opt/local/lib' Locally applied patches: DEVEL15661 @INC for perl v5.7.3: /home/abigail/Perl /home/abigail/Sybase /opt/bleadperl/lib/5.7.3/i686-linux-64int-ld /opt/bleadperl/lib/5.7.3 /opt/bleadperl/lib/site_perl/5.7.3/i686-linux-64int-ld /opt/bleadperl/lib/site_perl/5.7.3 /opt/bleadperl/lib/site_perl/5.7.2/i686-linux-64int-ld /opt/bleadperl/lib/site_perl/5.7.2 /opt/bleadperl/lib/site_perl . Environment for perl v5.7.3: HOME=/home/abigail LANG (unset) LANGUAGE (unset) LC_ALL=POSIX LD_LIBRARY_PATH=/home/abigail/Lib:/usr/local/lib:/usr/lib:/lib:/usr/X11R6/lib:/opt/gnome/lib LOGDIR (unset) PATH=/home/abigail/Bin:/opt/perl/bin:/usr/local/bin:/usr/local/X11/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/games:/opt/povray/bin:/opt/gnome/bin:/opt/opera/bin:/usr/share/texmf/bin:/opt/Acrobat4/bin:/opt/java/blackdown/j2sdk1.3.1/bin:/usr/local/games/bin:/opt/gnuplot/bin:/opt/mysql/bin PERL5LIB=/home/abigail/Perl:/home/abigail/Sybase PERLDIR=/opt/perl PERL_BADLANG (unset) SHELL=/usr/bin/bash ```
p5pRT commented 21 years ago

From @jhi

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/' $

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/. Since this is highly unlikely the programmer really wants this\, I think this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit code (and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'"\, and figuring out here what the user "intended" to write seems to me be akin to AI\, or lacking that\, fuzzy matching against the known POSIX character class names. Note that there maybe more stuff after the first character class\, e.g. /[[​:]foobar[]]/\, so parsing this syntax is definitely unfun.

I think someone needs to sit down\, and go through all the possible combinations and permutations\, and decide which ones mean what\, and which ones are syntax errors\, and only then we can wrote robust parsing code for it.

-- Jarkko Hietaniemi \jhi@​iki\.fi http​://www.iki.fi/jhi/ "There is this special biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen

p5pRT commented 12 years ago

From @jkeenan

On Thu Nov 28 14​:03​:37 2002\, jhi wrote​:

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/' $

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/. Since this is highly unlikely the programmer really wants this\, I think this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit code (and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'"\, and figuring out here what the user "intended" to write seems to me be akin to AI\, or lacking that\, fuzzy matching against the known POSIX character class names. Note that there maybe more stuff after the first character class\, e.g. /[[​:]foobar[]]/\, so parsing this syntax is definitely unfun.

I think someone needs to sit down\, and go through all the possible combinations and permutations\, and decide which ones mean what\, and which ones are syntax errors\, and only then we can wrote robust parsing code for it.

Abigail\, Jarkko​:

This ticket was last commented on nearly ten years ago. Is there still an issue here that warrants keeping the RT open?

Thank you very much. Jim Keenan

p5pRT commented 11 years ago

From @jkeenan

On Sun Oct 21 16​:17​:30 2012\, jkeenan wrote​:

On Thu Nov 28 14​:03​:37 2002\, jhi wrote​:

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/' $

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/. Since this is highly unlikely the programmer really wants this\, I think this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit code (and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'"\, and figuring out here what the user "intended" to write seems to me be akin to AI\, or lacking that\, fuzzy matching against the known POSIX character class names. Note that there maybe more stuff after the first character class\, e.g. /[[​:]foobar[]]/\, so parsing this syntax is definitely unfun.

I think someone needs to sit down\, and go through all the possible combinations and permutations\, and decide which ones mean what\, and which ones are syntax errors\, and only then we can wrote robust parsing code for it.

Abigail\, Jarkko​:

This ticket was last commented on nearly ten years ago. Is there still an issue here that warrants keeping the RT open?

Thank you very much. Jim Keenan

We've heard nothing about this ticket since my last post in October.

I propose we close this ticket in 14 days unless someone objects *and* is willing to take on an investigation of the issues discussed. (I will take the ticket to remind myself to close it.)

Thank you very much. Jim Keenan

p5pRT commented 11 years ago

From @jhi

On Monday-201301-14 19​:39\, James E Keenan via RT wrote​:

On Sun Oct 21 16​:17​:30 2012\, jkeenan wrote​:

On Thu Nov 28 14​:03​:37 2002\, jhi wrote​:

To satisfy Robert's jugu^Wregular expression :-)

8904​: Missing warning for /[[​:digit]]/.

$ /opt/bleadperl/bin/perl -wle 'print "Found" if "9" =~ /[[​:digit]]/' $

The missing '​:' causes Perl to think you want the equivalent of /[[​:dgit]/. Since this is highly unlikely the programmer really wants this\, I think this should generate at least a warning.

I'm uncertain of how to proceed with this since there's explicit code (and tests) to "grandfather" /[[​:]/ to mean "'[' or '​:'"\, and figuring out here what the user "intended" to write seems to me be akin to AI\, or lacking that\, fuzzy matching against the known POSIX character class names. Note that there maybe more stuff after the first character class\, e.g. /[[​:]foobar[]]/\, so parsing this syntax is definitely unfun.

I think someone needs to sit down\, and go through all the possible combinations and permutations\, and decide which ones mean what\, and which ones are syntax errors\, and only then we can wrote robust parsing code for it.

Abigail\, Jarkko​:

This ticket was last commented on nearly ten years ago. Is there still an issue here that warrants keeping the RT open?

Thank you very much. Jim Keenan

We've heard nothing about this ticket since my last post in October.

I propose we close this ticket in 14 days unless someone objects *and* is willing to take on an investigation of the issues discussed. (I will take the ticket to remind myself to close it.)

I CCed Karl since he's pretty much the Regex Imperator now.

Thank you very much. Jim Keenan

--- via perlbug​: queue​: perl5 status​: open https://rt-archive.perl.org/perl5/Ticket/Display.html?id=8904

p5pRT commented 11 years ago

From @bulk88

On Mon Jan 14 16​:39​:24 2013\, jkeenan wrote​:

We've heard nothing about this ticket since my last post in October.

I propose we close this ticket in 14 days unless someone objects *and* is willing to take on an investigation of the issues discussed. (I will take the ticket to remind myself to close it.)

Thank you very much. Jim Keenan

The warning sounds useful from a superficial read. I am not going to investigate it. I have no opinion on closing it.

-- bulk88 ~ bulk88 at hotmail.com

p5pRT commented 11 years ago

From @khwilliamson

I thought I had commented on this. I think it would have been fixed if porters had thought there wasn't a back compat issue.

But the new (?[ ]) experimental syntax already tests for this particular issue. My take is to tell people that this is coming\, and is available experimentally now. I don't think it's going to go in plain []\, but they can start writing their classes using the new syntax and get this check.

p5pRT commented 11 years ago

From @jkeenan

On Mon Jan 14 20​:47​:08 2013\, khw wrote​:

I thought I had commented on this. I think it would have been fixed if porters had thought there wasn't a back compat issue.

But the new (?[ ]) experimental syntax already tests for this particular issue. My take is to tell people that this is coming\, and is available experimentally now. I don't think it's going to go in plain []\, but they can start writing their classes using the new syntax and get this check.

Well\, since there are still a variety of opinions being expressed\, I don't feel qualified to be the closer of the ticket. Un-taking it.

p5pRT commented 10 years ago

From @khwilliamson

As mentioned in the messages above\, we are unlikely to add this message that might break long-standing behavior for regular bracketed character classes. However\, the extended bracketed classes (?[ ]) already do the requested check. This is an experimental feature\, which may become non-experimental as soon as 5.22. Hence\, I am marking this as stalled and adding to the blockers for 5.22. If the feature remains experimental\, this ticket can be moved to a later release. Otherwise\, we close the ticket when 5.22 ships\, as the core will then have the requested behavior available -- Karl Williamson

p5pRT commented 10 years ago

@khwilliamson - Status changed from 'open' to 'stalled'

p5pRT commented 9 years ago

From @rjbs

I have un-blocker-ed this.

It is true that a facility that improves on the state of things is in 5.22\, even if it is now still experimental. On the other hand\, I *do* think this is a place to consider adding a new warning. I know several people who were bitten by this mistake in the last year alone. Obviously\, this is pure anecdote\, but we should consider whether there is a pattern on which we'd warn\, to improve the "unknown posix class" warning condition.

I don't know how we go about parsing this\, but if [[​:askii​:]] warns\, then why not [[​:ascii​::]] or [[​:ascii]]? They seem like obvious mistakes.

Obviously\, nothing here should change for 5.22!

-- rjbs

p5pRT commented 9 years ago

The RT System itself - Status changed from 'stalled' to 'open'

p5pRT commented 9 years ago

From @khwilliamson

I plan to address this and similar in 5.23

-- Karl Williamson

p5pRT commented 9 years ago

From @khwilliamson

On 09/14/2015 01​:02 PM\, Karl Williamson wrote​:

Can you tell what the attached pattern would match without trying it? The answer and commentary will be posted tomorrow or so.

Several people were on the right track\, but no one knew that the comment itself affected what gets compiled.

I discovered this while working on https://rt-archive.perl.org/perl5/Ticket/Display.html?id=8904 "Missing warning for /[[​:digit]]/"

The problem is that [[​:posix​:]] classes have a very strict syntax\, any deviation from which is silently interpreted as something else. Abigail asked for a warning in this ticket from 13 years ago. Abigail usually does not like new warnings.

I thought we'd agreed to add the warning to be in effect only if use re 'strict' is in effect\, but the discovery of the case presented in the puzzle has made me think this needs more discussion.

The reason the hint was somewhat misleading is that the hint itself (in a comment no less) changes the behavior of the perl interpreter. If you change the '[=]' in the hint to '(=)'\, you get very different behavior.

There are actually three kinds of posix classes\, denoted​: [.posix.] [=posix=] and [​:posix​:]. perl only implements the last\, but looks for the first two in order to warn that they aren't supported.

The '[=' in the line

[abc[=\,+*;def]

looks to perl like it could contain the start of a [=posix=] class\, and it starts looking for the termination. It looks ahead in the pattern as far as it takes to find one\, and in the puzzle example\, it finds what appears to be that termination an arbitrary number of lines down\, and in this case\, in a comment. It ignores the '[' in the comment\, but sees the '=]' and thinks that is the termination of the '[=' many lines before. It then looks for the termination of the first '[' in

[abc[=\,+*;def]

but there isn't one\, so it croaks with

Unmatched [ in regex; marked by \<-- HERE in m/ # This is a pattern wth a bracketed character class   [ \<-- HERE abc[=\,+*;def]

On the other hand if there is no '=]' before the end of the pattern\, it thinks that no [=posix=] class was intended\, and the first ']' in the pattern terminates the character class\, and the whole thing compiles (without a warning) into​:

Final program​:   1​: ANYOFD[*+\,;=[a-f] (12)   12​: EXACT \ (14)   14​: END (0)

Comments should not affect the behavior of code. So at a minimum this should be fixed. But what else qualifies for cleaning up this mess without someone having to 'use re "strict"' ?

p5pRT commented 9 years ago

From @ikegami

On Tue\, Sep 15\, 2015 at 9​:52 AM\, Karl Williamson \public@&#8203;khwilliamson\.com wrote​:

Comments should not affect the behavior of code.

You cannot place comments in a character class. Therefore\, there are no comments in that code.

The problem is that the meaning of `[=` changes based on whether `=]` is found or not. We've recently made changes to the re engine that enforces the escaping of characters such as `{` where it was treated literally before\, and it seems that's that the appropriate fix here too. That or drop support for [= =] and [. .] completely.

p5pRT commented 9 years ago

From perl5-porters@perl.org

Karl Williamson

Comments should not affect the behavior of code. So at a minimum this should be fixed. But what else qualifies for cleaning up this mess without someone having to 'use re "strict"' ?

The search for =] should be restricted to one line. We don't actually support line breaks in POSIX char classes in any meaningful way.

$ perl5.20.2 -Mre=debug -e '/[[​:alpha​:]]/' Compiling REx "[[​:alpha​:]]" Final program​:   1​: POSIXD[​:alpha​:] (2)   2​: END (0) stclass POSIXD[​:alpha​:] minlen 1 Freeing REx​: "[[​:alpha​:]]" $ perl5.20.2 -Mre=debug -e '/[[​:alpha' -e '​:]]/' Compiling REx "[[​:alpha%n​:]]" POSIX class [​:alpha :] unknown in regex; marked by \<-- HERE in m/[[​:alpha :] \<-- HERE ]/ at -e line 2.

p5pRT commented 9 years ago

From @ikegami

On Tue\, Sep 15\, 2015 at 12​:19 PM\, Father Chrysostomos \sprout@&#8203;cpan\.org wrote​:

Karl Williamson

Comments should not affect the behavior of code. So at a minimum this should be fixed. But what else qualifies for cleaning up this mess without someone having to 'use re "strict"' ?

The search for =] should be restricted to one line. We don't actually support line breaks in POSIX char classes in any meaningful way.

The problem has nothing to do with newlines or /x.

$ perl -e'qr/[abc[=]/'

$ perl -e'qr/[abc[=] =]/' Unmatched [ in regex; marked by \<-- HERE in m/[ \<-- HERE abc[=] =]/ at -e line 1.

p5pRT commented 9 years ago

From @ikegami

On Wed\, Sep 16\, 2015 at 12​:26 PM\, Eric Brine \ikegami@&#8203;adaelis\.com wrote​:

On Tue\, Sep 15\, 2015 at 12​:19 PM\, Father Chrysostomos \sprout@&#8203;cpan\.org wrote​:

Karl Williamson

Comments should not affect the behavior of code. So at a minimum this should be fixed. But what else qualifies for cleaning up this mess without someone having to 'use re "strict"' ?

The search for =] should be restricted to one line. We don't actually support line breaks in POSIX char classes in any meaningful way.

The problem has nothing to do with newlines or /x.

$ perl -e'qr/[abc[=]/'

$ perl -e'qr/[abc[=] =]/' Unmatched [ in regex; marked by \<-- HERE in m/[ \<-- HERE abc[=] =]/ at -e line 1.

Better yet\, the following demonstrates how concatenation two valid regex patterns producing an invalid one​:

$ perl -e'   $re = qr/[abc[=]/ . qr/ =]/;   qr/$re/; ' Unmatched [ in regex; marked by \<-- HERE in m/(?^u​:[ \<-- HERE abc[=])(?^u​: =])/ at -e line 3.

p5pRT commented 9 years ago

From zefram@fysh.org

Eric Brine wrote​:

Better yet\, the following demonstrates how concatenation two valid regex patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/' Unmatched [ in regex; marked by \<-- HERE in m/(?^​:[ \<-- HERE abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a regexp object into qr// normally interpolates the meaning of the regexp as a syntactically atomic element\, and the stringification of a regexp object has parens matching that behaviour. But here it turns out it's doing a dirty old string concatenation underneath.

-zefram

p5pRT commented 9 years ago

From @abigail

On Tue\, Sep 15\, 2015 at 07​:52​:41AM -0600\, Karl Williamson wrote​:

On 09/14/2015 01​:02 PM\, Karl Williamson wrote​:

Can you tell what the attached pattern would match without trying it? The answer and commentary will be posted tomorrow or so.

Several people were on the right track\, but no one knew that the comment
itself affected what gets compiled.

I discovered this while working on https://rt-archive.perl.org/perl5/Ticket/Display.html?id=8904 "Missing warning for /[[​:digit]]/"

The problem is that [[​:posix​:]] classes have a very strict syntax\, any
deviation from which is silently interpreted as something else. Abigail
asked for a warning in this ticket from 13 years ago. Abigail usually
does not like new warnings.

I don't\, but I may have thought differently 13 years ago. I had completely forgotten about this ticket.

I thought we'd agreed to add the warning to be in effect only if use re
'strict' is in effect\, but the discovery of the case presented in the
puzzle has made me think this needs more discussion.

[ Snip ]

Comments should not affect the behavior of code. So at a minimum this
should be fixed. But what else qualifies for cleaning up this mess
without someone having to 'use re "strict"' ?

Perhaps we should not special case [= =] and [. .]. Perl doesn't support it\, and warning that we don't support it sounds a bit silly. There are millions of constructs Perl doesn't support\, and we don't warn about that either.

Has there every been a serious request that Perl should support the [= =] and [. .] classes?

Abigail

p5pRT commented 9 years ago

From @khwilliamson

On 09/16/2015 10​:40 AM\, Zefram wrote​:

Eric Brine wrote​:

Better yet\, the following demonstrates how concatenation two valid regex patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/' Unmatched [ in regex; marked by \<-- HERE in m/(?^​:[ \<-- HERE abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a regexp object into qr// normally interpolates the meaning of the regexp as a syntactically atomic element\, and the stringification of a regexp object has parens matching that behaviour. But here it turns out it's doing a dirty old string concatenation underneath.

-zefram

Hmm. Actually it's not doing a string concatenation\, if you look at the message it shows that $b was stringified into (?^​: The problem is that the parsing code is using a defective pattern to find its expected closing delimiter\, a pattern that unfortunately is used in various places in perl's parsing. You can't generally just use strchr or memchr\, or in this case a while loop to look ahead without bothering to see what's in the middle. strchr suffers from the additional problem that NULs are legitimate in Perl strings.

p5pRT commented 9 years ago

From @ap

* Abigail \abigail@&#8203;abigail\.be [2015-09-16 20​:40]​:

Perl doesn't support it\, and warning that we don't support it sounds a bit silly. There are millions of constructs Perl doesn't support\, and we don't warn about that either.

It makes sense when you assume that someone will port code from another language (which has POSIX regexps) to Perl somewhat mechanically. Say\, with a2p.

The same notion must have been what led to the ~150 stubs in POSIX.pm which just die saying “this function is not supported” (and “maybe you are looking for this instead”).

This mindset makes sense for an ascendant language looking to establish itself by stealing existing codebases from languages in the same niche.

As a point of comparison\, consider the tons of “this looks like you were writing Perl 5 here; did you want this other construct maybe?” warning/ error messages in Perl 6.

So it made sense for Perl 5 to do this back when it was very young.

But it’s long past being in that position.

So it makes sense for it to stop doing that now.

Regards\, -- Aristotle Pagaltzis // \<http​://plasmasturm.org/>

p5pRT commented 8 years ago

From @jkeenan

On Wed Sep 16 11​:40​:33 2015\, public@​khwilliamson.com wrote​:

On 09/16/2015 10​:40 AM\, Zefram wrote​:

Eric Brine wrote​:

Better yet\, the following demonstrates how concatenation two valid regex patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/' Unmatched [ in regex; marked by \<-- HERE in m/(?^​:[ \<-- HERE abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a regexp object into qr// normally interpolates the meaning of the regexp as a syntactically atomic element\, and the stringification of a regexp object has parens matching that behaviour. But here it turns out it's doing a dirty old string concatenation underneath.

-zefram

Hmm. Actually it's not doing a string concatenation\, if you look at the message it shows that $b was stringified into (?^​: The problem is that the parsing code is using a defective pattern to find its expected closing delimiter\, a pattern that unfortunately is used in various places in perl's parsing. You can't generally just use strchr or memchr\, or in this case a while loop to look ahead without bothering to see what's in the middle. strchr suffers from the additional problem that NULs are legitimate in Perl strings.

khw​: Do you have any thoughts as to how we might resolve this ticket?

Thank you very much.

-- James E Keenan (jkeenan@​cpan.org)

p5pRT commented 8 years ago

From @khwilliamson

On 1/22/2016 8​:42 PM\, James E Keenan via RT wrote​:

On Wed Sep 16 11​:40​:33 2015\, public@​khwilliamson.com wrote​:

On 09/16/2015 10​:40 AM\, Zefram wrote​:

Eric Brine wrote​:

Better yet\, the following demonstrates how concatenation two valid regex patterns producing an invalid one​:

It's even worse than that​:

$ perl -e '$a=qr/[abc[=]/; $b=qr/=]/; qr/$a$b/' Unmatched [ in regex; marked by \<-- HERE in m/(?^​:[ \<-- HERE abc[=])(?^​:=])/ at -e line 1.

Fails without any explicit string concatenation. Interpolation of a regexp object into qr// normally interpolates the meaning of the regexp as a syntactically atomic element\, and the stringification of a regexp object has parens matching that behaviour. But here it turns out it's doing a dirty old string concatenation underneath.

-zefram

Hmm. Actually it's not doing a string concatenation\, if you look at the message it shows that $b was stringified into (?^​: The problem is that the parsing code is using a defective pattern to find its expected closing delimiter\, a pattern that unfortunately is used in various places in perl's parsing. You can't generally just use strchr or memchr\, or in this case a while loop to look ahead without bothering to see what's in the middle. strchr suffers from the additional problem that NULs are legitimate in Perl strings.

khw​: Do you have any thoughts as to how we might resolve this ticket?

Thank you very much.

I am well along on a fix to this.

p5pRT commented 8 years ago

From @khwilliamson

Fixed by 46d34d0e1e7de87f74f8b2df4b32f291baf21dbb -- Karl Williamson

p5pRT commented 8 years ago

@khwilliamson - Status changed from 'open' to 'pending release'

p5pRT commented 8 years ago

From @rjbs

* Karl Williamson via RT \perlbug\-followup@&#8203;perl\.org [2016-02-10T01​:48​:11]

Fixed by 46d34d0e1e7de87f74f8b2df4b32f291baf21dbb

Fantastic\, thanks Karl! This has bitten me on more than one occasion.

Also\, just nice to see an actual fix for a bug that has lingered for 14 years.

-- rjbs

p5pRT commented 8 years ago

From @khwilliamson

Thank you for submitting this report. You have helped make Perl better.  
With the release of Perl 5.24.0 on May 9\, 2016\, this and 149 other issues have been resolved.

Perl 5.24.0 may be downloaded via https://metacpan.org/release/RJBS/perl-5.24.0

p5pRT commented 8 years ago

@khwilliamson - Status changed from 'pending release' to 'resolved'