Perl / perl5

šŸŖ The Perl programming language
https://dev.perl.org/perl5/
Other
1.91k stars 542 forks source link

silence warnings in build and test #12272

Closed p5pRT closed 12 years ago

p5pRT commented 12 years ago

Migrated from rt.perl.org#114128 (status was 'resolved')

Searchable as RT114128$

p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

Created by rmbarker.cpan@btinternet.com

I see the following warnings. The attached patches silence them.

pp_ctl.c​: In function ā€˜Perl_find_runcv_whereā€™​: pp_ctl.c​:3282​:22​: warning​: cast from pointer to integer of different size [-Wpointer-to-int-cast]

cpan/ExtUtils-MakeMaker/t/MM_OS2 .............................. Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/"DL_FUNCS" => { \<-- HERE }/ at t/MM_OS2.t line 45.

ext/Pod-Html/t/feature ........................................ --libpods is no longer supported at ../../lib/Pod/Html.pm line 517.

lib/Class/Struct .............................................. function 'count' already defined\, overrides struct accessor method at ../lib/Class/Struct.t line 41.

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t line 129.

Perl Info ``` Flags: category=install severity=low Site configuration information for perl 5.17.2: Configured by robin at Thu Jul 12 16:38:20 BST 2012. Summary of my perl5 (revision 5 version 17 subversion 2) configuration: Commit id: 20da2e5b86ba0d7b84f7fde98446cf2287b33741 Platform: osname=linux, osvers=2.6.38-15-generic, archname=i686-linux-64int uname='linux spade-ubuntu 2.6.38-15-generic #61-ubuntu smp tue jun 12 19:15:11 utc 2012 i686 i686 i386 gnulinux ' config_args='-ders -Dcc=gcc -Dusedevel -Doptimize=-O2 -DDEBUGGING -Duse64bitint -Dman3dir=none -Dcf_email=rmbarker.cpan@btinternet.com' hint=recommended, useposix=true, d_sigaction=define useithreads=undef, usemultiplicity=undef useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef use64bitint=define, use64bitall=undef, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='gcc', ccflags ='-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2 -g', cppflags='-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.7.0', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='gcc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib /usr/lib/i386-linux-gnu /usr/lib64 libs=-lnsl -ldl -lm -lcrypt -lutil -lc perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc libc=, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.13' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E' cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector' Locally applied patches: @INC for perl 5.17.2: lib /usr/local/lib/perl5/site_perl/5.17.2/i686-linux-64int /usr/local/lib/perl5/site_perl/5.17.2 /usr/local/lib/perl5/5.17.2/i686-linux-64int /usr/local/lib/perl5/5.17.2 /usr/local/lib/perl5/site_perl . Environment for perl 5.17.2: HOME=/home/robin LANG=en_GB.UTF-8 LANGUAGE=en_GB:en LD_LIBRARY_PATH=/usr/lib:/usr/local/lib LOGDIR (unset) PATH=/home/robin/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games PERL_BADLANG (unset) SHELL=/bin/tcsh ```
p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

0003-override-warning.patch ```diff From 15b6054992a958bb4fa0a813bf9b959d0d038c5d Mon Sep 17 00:00:00 2001 From: Robin Barker Date: Wed, 11 Jul 2012 00:32:36 +0100 Subject: [PATCH 3/5] silence override warning The test is trying to test the behaviour being warned about, so the code is right but the warning can be suppressed. --- lib/Class/Struct.t | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/lib/Class/Struct.t b/lib/Class/Struct.t index 111c090..3f0dbde 100644 --- a/lib/Class/Struct.t +++ b/lib/Class/Struct.t @@ -38,7 +38,10 @@ use Class::Struct s => '$', a => '@', h => '%', c => 'aClass'; package OverrideAccessor; use Class::Struct; -struct( 'OverrideAccessor', { count => '$' } ); +{ + no warnings qw(Class::Struct); + struct( 'OverrideAccessor', { count => '$' } ); +} sub count { my ($self,$count) = @_; -- 1.7.4.1 ```
p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

0005-escape-in-regexps.patch ```diff From f6cd558f2a275989c1b630a722f61bc25b486e5d Mon Sep 17 00:00:00 2001 From: Robin Barker Date: Wed, 11 Jul 2012 00:34:11 +0100 Subject: [PATCH 5/5] silence warning about escape in regexps Add escapes in regexps as suggested by the warnings and escape . as explicit '.' to match expected string --- cpan/ExtUtils-MakeMaker/t/MM_OS2.t | 2 +- lib/DB.t | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/cpan/ExtUtils-MakeMaker/t/MM_OS2.t b/cpan/ExtUtils-MakeMaker/t/MM_OS2.t index 4d88e85..08d0e1a 100644 --- a/cpan/ExtUtils-MakeMaker/t/MM_OS2.t +++ b/cpan/ExtUtils-MakeMaker/t/MM_OS2.t @@ -42,7 +42,7 @@ delete $mm->{SKIPHASH}; my $res = $mm->dlsyms(); like( $res, qr/baseext\.def: Makefile/, '... without flag, should return make targets' ); -like( $res, qr/"DL_FUNCS" => { }/, +like( $res, qr/"DL_FUNCS" => \{ \}/, '... should provide empty hash refs where necessary' ); like( $res, qr/"DL_VARS" => \[]/, '... and empty array refs too' ); diff --git a/lib/DB.t b/lib/DB.t index a1fadf3..d50c215 100644 --- a/lib/DB.t +++ b/lib/DB.t @@ -126,7 +126,7 @@ is( DB::_clientname('bar'), undef, my @ret = eval { DB->backtrace() }; like( $ret[0], qr/file.+\Q$0\E/, 'DB::backtrace() should report current file'); like( $ret[0], qr/line $line/, '... should report calling line number' ); - like( $ret[0], qr/eval {...}/, '... should catch eval BLOCK' ); + like( $ret[0], qr/eval \{\.\.\.\}/, '... should catch eval BLOCK' ); @ret = eval "one(2)"; is( scalar @ret, 1, '... should report from provided stack frame number' ); -- 1.7.4.1 ```
p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

0002-no-libpods.patch ```diff From c269cd5aa9ce7cfdf4cc2eb9be23c718856b4c76 Mon Sep 17 00:00:00 2001 From: Robin Barker Date: Wed, 11 Jul 2012 00:31:05 +0100 Subject: [PATCH 2/5] silence warning about use of --libpods --lipods warns, and I don't think it is needed in this test, so I have removed it --- ext/Pod-Html/t/feature.t | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/ext/Pod-Html/t/feature.t b/ext/Pod-Html/t/feature.t index 9096523..94354c7 100644 --- a/ext/Pod-Html/t/feature.t +++ b/ext/Pod-Html/t/feature.t @@ -21,7 +21,6 @@ convert_n_test("feature", "misc pod-html features", "--podroot=$cwd", "--title=a title", "--quiet", - "--libpods=perlguts:perlootut", ); __DATA__ -- 1.7.4.1 ```
p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

0001-compiler-warning.patch ```diff From b9f880012ad319424a138550da091c3aea801808 Mon Sep 17 00:00:00 2001 From: Robin Barker Date: Wed, 11 Jul 2012 00:29:56 +0100 Subject: [PATCH 1/5] silence compiler warning - casting void* to IV This warns when void* and IV have different sizes There is already a macro to do this conversion cleanly: PTR2IV --- pp_ctl.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/pp_ctl.c b/pp_ctl.c index 54f17ae..826a772 100644 --- a/pp_ctl.c +++ b/pp_ctl.c @@ -3279,7 +3279,7 @@ Perl_find_runcv_where(pTHX_ U8 cond, void *arg, U32 *db_seqp) if (CvROOT(cv) != (OP *)arg) continue; return cv; case FIND_RUNCV_level_eq: - if (level++ != (IV)arg) continue; + if (level++ != PTR2IV(arg)) continue; /* GERONIMO! */ default: return cv; -- 1.7.4.1 ```
p5pRT commented 12 years ago

From @cpansprout

On Fri Jul 13 07​:55​:01 2012\, rmbarker.cpan@​btinternet.com wrote​:

This is a bug report for perl from rmbarker.cpan@​btinternet.com\, generated with the help of perlbug 1.39 running under perl 5.17.2.

----------------------------------------------------------------- [Please describe your issue here]

I see the following warnings. The attached patches silence them.

pp_ctl.c​: In function ā€˜Perl_find_runcv_whereā€™​: pp_ctl.c​:3282​:22​: warning​: cast from pointer to integer of different size [-Wpointer-to-int-cast]

cpan/ExtUtils-MakeMaker/t/MM_OS2 .............................. Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/"DL_FUNCS" => { \<-- HERE }/ at t/MM_OS2.t line 45.

ext/Pod-Html/t/feature ........................................ --libpods is no longer supported at ../../lib/Pod/Html.pm line 517.

lib/Class/Struct .............................................. function 'count' already defined\, overrides struct accessor method at ../lib/Class/Struct.t line 41.

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t line 129.

Thank you. I have applied three of these as 80544570\, 69b9dcc and 188f5f276.

Iā€™m not so sure about the { patch. One of the files it touches has CPAN upstream. And there are still at least two people (Yves Orton and I) who think the warning should be reverted.

--

Father Chrysostomos

p5pRT commented 12 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 12 years ago

From @iabyn

On Fri\, Jul 13\, 2012 at 09​:58​:49AM -0700\, Father Chrysostomos via RT wrote​:

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t line 129.

Iā€™m not so sure about the { patch. One of the files it touches has CPAN upstream. And there are still at least two people (Yves Orton and I) who think the warning should be reverted.

Three\, including me.

-- No man treats a motor car as foolishly as he treats another human being. When the car will not go\, he does not attribute its annoying behaviour to sin\, he does not say\, You are a wicked motorcar\, and I shall not give you any more petrol until you go. He attempts to find out what is wrong and set it right.   -- Bertrand Russell\,   Has Religion Made Useful Contributions to Civilization?

p5pRT commented 12 years ago

From @chipdude

On Fri\, Jul 13\, 2012 at 08​:43​:13PM +0100\, Dave Mitchell wrote​:

On Fri\, Jul 13\, 2012 at 09​:58​:49AM -0700\, Father Chrysostomos via RT wrote​:

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t line 129.

Iā€™m not so sure about the { patch. One of the files it touches has CPAN upstream. And there are still at least two people (Yves Orton and I) who think the warning should be reverted.

Three\, including me.

-1 from me\, since we're voting. I like it. Mildly. -- Chip Salzenberg

p5pRT commented 12 years ago

From @cpansprout

On Fri Jul 13 12​:50​:54 2012\, rev.chip@​gmail.com wrote​:

On Fri\, Jul 13\, 2012 at 08​:43​:13PM +0100\, Dave Mitchell wrote​:

On Fri\, Jul 13\, 2012 at 09​:58​:49AM -0700\, Father Chrysostomos via RT wrote​:

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t line 129.

Iā€™m not so sure about the { patch. One of the files it touches has CPAN upstream. And there are still at least two people (Yves Orton and I) who think the warning should be reverted.

Three\, including me.

-1 from me\, since we're voting. I like it. Mildly.

It wouldnā€™t be so bad if it affected only /\\[a-zA-Z]\{/.

--

Father Chrysostomos

p5pRT commented 12 years ago

From @khwilliamson

On 07/13/2012 02​:12 PM\, Father Chrysostomos via RT wrote​:

On Fri Jul 13 12​:50​:54 2012\, rev.chip@​gmail.com wrote​:

On Fri\, Jul 13\, 2012 at 08​:43​:13PM +0100\, Dave Mitchell wrote​:

On Fri\, Jul 13\, 2012 at 09​:58​:49AM -0700\, Father Chrysostomos via RT wrote​:

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t line 129.

Iā€™m not so sure about the { patch. One of the files it touches has CPAN upstream. And there are still at least two people (Yves Orton and I) who think the warning should be reverted.

Three\, including me.

-1 from me\, since we're voting. I like it. Mildly.

It wouldnā€™t be so bad if it affected only /\\[a-zA-Z]\{/.

I just tried to re-read all the related emails for this. I can't find the place where Yves apparently changed his mind from thinking this might be ok to do\, to not thinking so.

His original point was that there are a number of issues involved\, and maybe we should look at the overall picture. Failing that\, I don't see what was wrong with the idea\, I believe originally Zefram's\, to not strip the escapes from pattern delimiter characters which occur internally in the pattern. If a program specifies there is a backslash\, it gets stripped before being parsed by the regex engine. That's got to be a bug\, even if it is documented in one of the several places where it should be.

Making the warning only apply if the brace was preceded by an alpha would give us the ability to make changes at some point so that \w{foo} could exist\, but we lose the ability to fix the quantifier syntax {m\,n} so that we can catch typos involving it\, and we wouldn't be able to allow enhancements\, such as {\,n}

p5pRT commented 12 years ago

From @demerphq

On 15 July 2012 06​:45\, Karl Williamson \public@&#8203;khwilliamson\.com wrote​:

On 07/13/2012 02​:12 PM\, Father Chrysostomos via RT wrote​:

On Fri Jul 13 12​:50​:54 2012\, rev.chip@​gmail.com wrote​:

On Fri\, Jul 13\, 2012 at 08​:43​:13PM +0100\, Dave Mitchell wrote​:

On Fri\, Jul 13\, 2012 at 09​:58​:49AM -0700\, Father Chrysostomos via RT

wrote​:

lib/DB ........................................................ Unescaped left brace in regex is deprecated\, passed through in

regex;

marked by \<-- HERE in m/eval { \<-- HERE ...}/ at ../lib/DB.t

line 129.

Iā€™m not so sure about the { patch. One of the files it touches

has CPAN

upstream. And there are still at least two people (Yves Orton and

I)

who think the warning should be reverted.

Three\, including me.

-1 from me\, since we're voting. I like it. Mildly.

It wouldnā€™t be so bad if it affected only /\\[a-zA-Z]\{/.

I just tried to re-read all the related emails for this. I can't find the place where Yves apparently changed his mind from thinking this might be ok to do\, to not thinking so.

His original point was that there are a number of issues involved\, and maybe we should look at the overall picture. Failing that\, I don't see what was wrong with the idea\, I believe originally Zefram's\, to not strip the escapes from pattern delimiter characters which occur internally in the pattern. If a program specifies there is a backslash\, it gets stripped before being parsed by the regex engine. That's got to be a bug\, even if it is documented in one of the several places where it should be.

Making the warning only apply if the brace was preceded by an alpha would give us the ability to make changes at some point so that \w{foo} could exist\, but we lose the ability to fix the quantifier syntax {m\,n} so that we can catch typos involving it\, and we wouldn't be able to allow enhancements\, such as {\,n}

I changed my mind once I realized there was no work around inside of m{} style matches which are very very common.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 12 years ago

From @nwc10

On Mon\, Jul 16\, 2012 at 08​:10​:59AM +0200\, demerphq wrote​:

On 15 July 2012 06​:45\, Karl Williamson \public@&#8203;khwilliamson\.com wrote​:

I just tried to re-read all the related emails for this. I can't find the place where Yves apparently changed his mind from thinking this might be ok to do\, to not thinking so.

His original point was that there are a number of issues involved\, and maybe we should look at the overall picture. Failing that\, I don't see what was wrong with the idea\, I believe originally Zefram's\, to not strip the escapes from pattern delimiter characters which occur internally in the pattern. If a program specifies there is a backslash\, it gets stripped before being parsed by the regex engine. That's got to be a bug\, even if it is documented in one of the several places where it should be.

Making the warning only apply if the brace was preceded by an alpha would give us the ability to make changes at some point so that \w{foo} could exist\, but we lose the ability to fix the quantifier syntax {m\,n} so that we can catch typos involving it\, and we wouldn't be able to allow enhancements\, such as {\,n}

I changed my mind once I realized there was no work around inside of m{} style matches which are very very common.

For those following along at home and wondering "yes\, but for m{} the *close* brace is the terminator\, so why is *open* special?" (which was what I did)​:

$ ./perl -le 'print qr{ { }' Search pattern not terminated at -e line 1. $ ./perl -le 'print qr{ { } }' Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/ { \<-- HERE } / at -e line 1. (?^​: { } )

and\, on a perl that doesn't warn\, to be clearer​:

$ perl -lwe 'print qr{ { }' Search pattern not terminated at -e line 1. $ perl -lwe 'print qr{ { } }' (?-xism​: { } ) $ perl -lwe 'print qr{ \{ }' (?-xism​: { ) $ perl -lwe 'print qr{ \{ } }' Unmatched right curly bracket at -e line 1\, at end of line syntax error at -e line 1\, near "qr{ \{ } }" Execution of -e aborted due to compilation errors. $ perl -lwe 'print qr{ \{ \} }' (?-xism​: { } )

[Once again\, advice from Klortho proves valuable :-) #11949 What happens when you try? I'd forgotten most of the behaviour above]

The problem is that we *also* have a feature whereby matched delimiters *nest*\, and backslash escaping of an opening delimiter overrides the nesting. Which I'd forgotten the details of.

And\, again\, as this is escaping at the delimiter processing level\, because the same character (backslash) is being used to escape delimiters\, itself\, and make metacharacters within the pattern\, I don't think that there is a solution that offers everything.

Nicholas Clark

p5pRT commented 12 years ago

From @khwilliamson

On 07/16/2012 02​:45 AM\, Nicholas Clark wrote​:

On Mon\, Jul 16\, 2012 at 08​:10​:59AM +0200\, demerphq wrote​:

On 15 July 2012 06​:45\, Karl Williamson \public@&#8203;khwilliamson\.com wrote​:

I just tried to re-read all the related emails for this. I can't find the place where Yves apparently changed his mind from thinking this might be ok to do\, to not thinking so.

His original point was that there are a number of issues involved\, and maybe we should look at the overall picture. Failing that\, I don't see what was wrong with the idea\, I believe originally Zefram's\, to not strip the escapes from pattern delimiter characters which occur internally in the pattern. If a program specifies there is a backslash\, it gets stripped before being parsed by the regex engine. That's got to be a bug\, even if it is documented in one of the several places where it should be.

Making the warning only apply if the brace was preceded by an alpha would give us the ability to make changes at some point so that \w{foo} could exist\, but we lose the ability to fix the quantifier syntax {m\,n} so that we can catch typos involving it\, and we wouldn't be able to allow enhancements\, such as {\,n}

I changed my mind once I realized there was no work around inside of m{} style matches which are very very common.

For those following along at home and wondering "yes\, but for m{} the *close* brace is the terminator\, so why is *open* special?" (which was what I did)​:

$ ./perl -le 'print qr{ { }' Search pattern not terminated at -e line 1. $ ./perl -le 'print qr{ { } }' Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/ { \<-- HERE } / at -e line 1. (?^​: { } )

and\, on a perl that doesn't warn\, to be clearer​:

$ perl -lwe 'print qr{ { }' Search pattern not terminated at -e line 1. $ perl -lwe 'print qr{ { } }' (?-xism​: { } ) $ perl -lwe 'print qr{ \{ }' (?-xism​: { ) $ perl -lwe 'print qr{ \{ } }' Unmatched right curly bracket at -e line 1\, at end of line syntax error at -e line 1\, near "qr{ \{ } }" Execution of -e aborted due to compilation errors. $ perl -lwe 'print qr{ \{ \} }' (?-xism​: { } )

[Once again\, advice from Klortho proves valuable :-) #11949 What happens when you try? I'd forgotten most of the behaviour above]

The problem is that we *also* have a feature whereby matched delimiters *nest*\, and backslash escaping of an opening delimiter overrides the nesting. Which I'd forgotten the details of.

And\, again\, as this is escaping at the delimiter processing level\, because the same character (backslash) is being used to escape delimiters\, itself\, and make metacharacters within the pattern\, I don't think that there is a solution that offers everything.

Nicholas Clark

Thanks for clarifying things.

I don't see\, however\, any conflict with this\, and requiring a literal backslash in a regex to be escaped. If the lexer were changed so that it passed the backslashes through to the regex engine\, I think it all would work correctly\, except that the right bracket must be backslashed if the left one is.

I tried to find where the stripping is happening\, so I could change things and experiment\, but I don't understand the lexer very well\, and when I single step through it\, it seems like a maze of twisty little passages\, all different and the same. And setting breakpoints at what I thought were the obvious suspects didn't help.

p5pRT commented 12 years ago

From @khwilliamson

On 07/16/2012 11​:10 AM\, Karl Williamson wrote​:

On 07/16/2012 02​:45 AM\, Nicholas Clark wrote​:

On Mon\, Jul 16\, 2012 at 08​:10​:59AM +0200\, demerphq wrote​:

On 15 July 2012 06​:45\, Karl Williamson \public@&#8203;khwilliamson\.com wrote​:

I just tried to re-read all the related emails for this. I can't find the place where Yves apparently changed his mind from thinking this might be ok to do\, to not thinking so.

His original point was that there are a number of issues involved\, and maybe we should look at the overall picture. Failing that\, I don't see what was wrong with the idea\, I believe originally Zefram's\, to not strip the escapes from pattern delimiter characters which occur internally in the pattern. If a program specifies there is a backslash\, it gets stripped before being parsed by the regex engine. That's got to be a bug\, even if it is documented in one of the several places where it should be.

Making the warning only apply if the brace was preceded by an alpha would give us the ability to make changes at some point so that \w{foo} could exist\, but we lose the ability to fix the quantifier syntax {m\,n} so that we can catch typos involving it\, and we wouldn't be able to allow enhancements\, such as {\,n}

I changed my mind once I realized there was no work around inside of m{} style matches which are very very common.

For those following along at home and wondering "yes\, but for m{} the *close* brace is the terminator\, so why is *open* special?" (which was what I did)​:

$ ./perl -le 'print qr{ { }' Search pattern not terminated at -e line 1. $ ./perl -le 'print qr{ { } }' Unescaped left brace in regex is deprecated\, passed through in regex; marked by \<-- HERE in m/ { \<-- HERE } / at -e line 1. (?^​: { } )

and\, on a perl that doesn't warn\, to be clearer​:

$ perl -lwe 'print qr{ { }' Search pattern not terminated at -e line 1. $ perl -lwe 'print qr{ { } }' (?-xism​: { } ) $ perl -lwe 'print qr{ \{ }' (?-xism​: { ) $ perl -lwe 'print qr{ \{ } }' Unmatched right curly bracket at -e line 1\, at end of line syntax error at -e line 1\, near "qr{ \{ } }" Execution of -e aborted due to compilation errors. $ perl -lwe 'print qr{ \{ \} }' (?-xism​: { } )

[Once again\, advice from Klortho proves valuable :-) #11949 What happens when you try? I'd forgotten most of the behaviour above]

The problem is that we *also* have a feature whereby matched delimiters *nest*\, and backslash escaping of an opening delimiter overrides the nesting. Which I'd forgotten the details of.

And\, again\, as this is escaping at the delimiter processing level\, because the same character (backslash) is being used to escape delimiters\, itself\, and make metacharacters within the pattern\, I don't think that there is a solution that offers everything.

Nicholas Clark

Thanks for clarifying things.

I don't see\, however\, any conflict with this\, and requiring a literal backslash in a regex to be escaped. If the lexer were changed so that

Just reread this\, and I misspoke. It should be​:

I don't see\, however\, any conflict with this\, and requiring a literal left brace in a regex to be escaped. If the lexer were changed so that

it passed the backslashes through to the regex engine\, I think it all would work correctly\, except that the right bracket must be backslashed if the left one is.

I tried to find where the stripping is happening\, so I could change things and experiment\, but I don't understand the lexer very well\, and when I single step through it\, it seems like a maze of twisty little passages\, all different and the same. And setting breakpoints at what I thought were the obvious suspects didn't help.

p5pRT commented 12 years ago

From @iabyn

On Mon\, Jul 16\, 2012 at 11​:10​:53AM -0600\, Karl Williamson wrote​:

I tried to find where the stripping is happening\, so I could change things and experiment\, but I don't understand the lexer very well\, and when I single step through it\, it seems like a maze of twisty little passages\, all different and the same. And setting breakpoints at what I thought were the obvious suspects didn't help.

Parsing literal regexps is a three phase process; the third is the regex engine itself; the first two are shared with literal strings.

The first phase is handling single-quoted strings​: basically just handle delimiters; this is done in S_scan_str. The net result is to find the end of the string\, and store the string in PL_lex_stuff. \\ is passed through unchanged\, but \{ and \} are converted to { and } (assuming the delimiters are {}).

Then the lexer goes in a 'sublex' phase\, where it treats the body of the string (i.e. PL_lex_stuff) effectively as the source code to be parsed. Here\, it's the job of S_scan_const() to extract out the next constant chunk of the string and return it as an OP_CONST. It will stop when it sees something that looks like a variable\, or \Q for example. For a simple literal regex\, the whole pattern will be returned as a single OP_CONST; for something more complex like /a$b[3]c/\, this will be returned as OP_CONST[a]\, a sequence of ops to extract $b[3]\, and OP_CONST[c]; with the first call to scan_str() stopping at the '$b'\, and the second call stopping at the end of the string.

If the string/pattern is single-quoted\, then scan_str() does very little processing; if its a pattern\, it does a bit more processing\, but leaves all the \ processing to the regex engine; literal strings get all the \ escapes processed.

However\, consider the following​:

  print "matched\n" if "aa" =~ m{^a\{1\,2\}$};

Currently\, this matches​: the first pass extracts the string between the {}\, and strips the \\, leaving the string '^a{1\,2}$'; the second pass doesn't change anything\, then in the 3rd pass the regex compiler sees the {1\,2} as a quantifier.

If you changed it so that the backslashes weren't stripped in the first pass\, then the regex compiler would see it as '^a\{1\,2\}$'\, and instead the following would match​:

  print "matched\n" if "a{1\,2}" =~ m{^a\{1\,2\}$};

So if I understand your proposal correctly it would alter the matching behaviour of existing legal patterns.

Anyway\, if you want to experiment\, a good starting point would be​:

Inline Patch ```diff diff --git a/toke.c b/toke.c index 13d7ac2..95b2a04 100644 --- a/toke.c +++ b/toke.c @@ -9143,7 +9143,7 @@ S_scan_pat(pTHX_ char *start, I32 type) { dVAR; PMOP *pm; - char *s = scan_str(start,!!PL_madskills,FALSE, PL_reg_state.re_reparsing); + char *s = scan_str(start,1,FALSE, PL_reg_state.re_reparsing); const char * const valid_flags = (const char *)((type == OP_QR) ? QR_PAT_MODS : M_PAT_MODS); char charset = '\0'; /* character set modifier */ ```

which for literal patterns, would stop the first pass from stripping the \ from escaped delimiters .

-- I've often wanted to drown my troubles\, but I can't get my wife to go swimming.

p5pRT commented 12 years ago

From @khwilliamson

On 07/18/2012 09​:27 AM\, Dave Mitchell wrote​:

On Mon\, Jul 16\, 2012 at 11​:10​:53AM -0600\, Karl Williamson wrote​:

I tried to find where the stripping is happening\, so I could change things and experiment\, but I don't understand the lexer very well\, and when I single step through it\, it seems like a maze of twisty little passages\, all different and the same. And setting breakpoints at what I thought were the obvious suspects didn't help.

Parsing literal regexps is a three phase process; the third is the regex engine itself; the first two are shared with literal strings.

The first phase is handling single-quoted strings​: basically just handle delimiters; this is done in S_scan_str. The net result is to find the end of the string\, and store the string in PL_lex_stuff. \\ is passed through unchanged\, but \{ and \} are converted to { and } (assuming the delimiters are {}).

Then the lexer goes in a 'sublex' phase\, where it treats the body of the string (i.e. PL_lex_stuff) effectively as the source code to be parsed. Here\, it's the job of S_scan_const() to extract out the next constant chunk of the string and return it as an OP_CONST. It will stop when it sees something that looks like a variable\, or \Q for example. For a simple literal regex\, the whole pattern will be returned as a single OP_CONST; for something more complex like /a$b[3]c/\, this will be returned as OP_CONST[a]\, a sequence of ops to extract $b[3]\, and OP_CONST[c]; with the first call to scan_str() stopping at the '$b'\, and the second call stopping at the end of the string.

If the string/pattern is single-quoted\, then scan_str() does very little processing; if its a pattern\, it does a bit more processing\, but leaves all the \ processing to the regex engine; literal strings get all the \ escapes processed.

However\, consider the following​:

 print "matched\\n" if "aa" =~ m\{^a\\\{1\,2\\\}$\};

Currently\, this matches​: the first pass extracts the string between the {}\, and strips the \\, leaving the string '^a{1\,2}$'; the second pass doesn't change anything\, then in the 3rd pass the regex compiler sees the {1\,2} as a quantifier.

If you changed it so that the backslashes weren't stripped in the first pass\, then the regex compiler would see it as '^a\{1\,2\}$'\, and instead the following would match​:

 print "matched\\n" if "a\{1\,2\}" =~ m\{^a\\\{1\,2\\\}$\};

So if I understand your proposal correctly it would alter the matching behaviour of existing legal patterns.

Yes\, you are right about this\, and thank you for thinking of it. I see three possibilities​:

1) Consider this to be a show-stopper\, revert the warning\, and be stuck with the current behavior until and if we decide to do one of the other two possibilities. To wit​:

2) Revert the warning\, but deprecate this unnecessary escaping of braces for a cycle or two\, after which we institute the warning again for another cycle or two\, after which we prohibit unescaped left brackets.

3) Consider this as acceptable collateral breakage\, document it\, and keep the warning for a cycle or two\, after which we prohibit unescaped literal left brackets.

The final possibility is only feasible if there is very little current breakage. I used grep.cpan.me to try to find out. Below are my steps so that other people can check my work. I used the following regular expression to look for such quantifiers​:

\b(?​:m|s|qr)\s*{.*\\{\d+(\,\d*)?\\} No matches found.

I then looked for such occurrences such as \p\{property\}

\b(?​:m|s|qr)\s*{.*[^\\]\\[[​:alpha​:]]\\{.*\\} No matches found.

In the above\, [^\\]\\[[​:alpha​:]] looks for a backslash-letter\, but not a backslash-backslash-letter. This prevents the 5 false positives like this​: my @​times = $match_times =~ m{\\d\{(\d+)\}}g;

Finally\, I went looking for (?\{ and (??\{

\b(?​:m|s|qr)\s*{.*\(\?{1\,2}\\{ No matches found.

Am I missing anything?

Anyway\, if you want to experiment\, a good starting point would be​:

diff --git a/toke.c b/toke.c index 13d7ac2..95b2a04 100644 --- a/toke.c +++ b/toke.c @​@​ -9143\,7 +9143\,7 @​@​ S_scan_pat(pTHX_ char *start\, I32 type) { dVAR; PMOP *pm; - char *s = scan_str(start\,!!PL_madskills\,FALSE\, PL_reg_state.re_reparsing); + char *s = scan_str(start\,1\,FALSE\, PL_reg_state.re_reparsing); const char * const valid_flags = (const char *)((type == OP_QR) ? QR_PAT_MODS : M_PAT_MODS); char charset = '\0'; /* character set modifier */

which for literal patterns\, would stop the first pass from stripping the \ from escaped delimiters .

Thank you. I had set break points on scan_str\, but I guess I didn't know what I was looking for. If I apply just this patch\, and run it on the test suite\, the only failures we get are in (??{...}) constructs\, in re/pat_re_eval.t where it is expecting the backslashes to have been stripped. As a quick\, temporary work around\, I changed the code to strip except when what is being escaped is the same character as the left delimiter. Then all tests pass in my machine\, and it is currently in smoke-me.

With this patch\, all of Nicholas' examples work properly.

p5pRT commented 12 years ago

From @iabyn

On Thu\, Jul 19\, 2012 at 11​:15​:55AM -0600\, Karl Williamson wrote​:

Am I missing anything?

I think so. I'm presuming that changing the behaviour of stripping the backslash in escaped delimiters will apply to all delimiters\, not just {}\, so it will apply to things like

  m[...\[...\]...]   /...\/.../

I haven't stopped to think through the implications\, but I'm just flagging it as something for you to think about.

-- Please note that ash-trays are provided for the use of smokers\, whereas the floor is provided for the use of all patrons.   -- Bill Royston

p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

The original patch to lib/DB.t also escaped the '.' to match 'eval {...}'. To avoid the debate about escaping '{' using '\'\, I offer a different patch that escapes '{' and '.' using \Q.

p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

0001-rewrite-using-Q.patch ```diff From 8b8e74765d86d69e596466a7c4c4976afc5c8d46 Mon Sep 17 00:00:00 2001 From: Robin Barker Date: Thu, 19 Jul 2012 16:25:59 +0100 Subject: [PATCH] rewrite using \Q The regexp for the 'eval BLOCK' test should match explicit '...'. So avoid the \{ issue and put \Q before all the punctuation. --- lib/DB.t | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/lib/DB.t b/lib/DB.t index ed7a0ca..b4b6ecb 100644 --- a/lib/DB.t +++ b/lib/DB.t @@ -126,7 +126,7 @@ is( DB::_clientname('bar'), undef, my @ret = eval { DB->backtrace() }; like( $ret[0], qr/file.+\Q$0\E/, 'DB::backtrace() should report current file'); like( $ret[0], qr/line $line/, '... should report calling line number' ); - like( $ret[0], qr/eval {...}/, '... should catch eval BLOCK' ); + like( $ret[0], qr/eval\Q {...}/, '... should catch eval BLOCK' ); @ret = eval "one(2)"; is( scalar @ret, 1, '... should report from provided stack frame number' ); -- 1.7.4.1 ```
p5pRT commented 12 years ago

From @nwc10

On Fri\, Jul 20\, 2012 at 01​:41​:56PM +0100\, Robin Barker wrote​:

The original patch to lib/DB.t also escaped the '.' to match 'eval {...}'. To avoid the debate about escaping '{' using '\'\, I offer a different patch that escapes '{' and '.' using \Q.

Thanks applied to blead\, although I slightly reworded the commit message to be a bit more verbose\, and reflowed the body​:

commit 7150f9197f27c7cc16a06b3e01391c49c78398ce Author​: Robin Barker \rmbarker@&#8203;cpan\.org Date​: Thu Jul 19 16​:25​:59 2012 +0100

  rewrite a regex in lib/DB.t using \Q  
  The regexp for the 'eval BLOCK' test should match explicit '...'.   So avoid the \{ issue and put \Q before all the punctuation.

Bother. I've just spotted the style inconsistency between regex and regexp. Too late now. Sorry world.

Nicholas Clark

p5pRT commented 12 years ago

From @chipdude

On 7/19/2012 10​:15 AM\, Karl Williamson wrote​:

3) Consider this as acceptable collateral breakage\, document it\, and keep the warning for a cycle or two\, after which we prohibit unescaped literal left brackets.

The final possibility is only feasible if there is very little current breakage.

My 2c​: I think this is ideal\, since stripping \ on {} in regexes seems like something we should never have done in the first place.

p5pRT commented 12 years ago

From @iabyn

On Fri\, Jul 20\, 2012 at 08​:35​:27PM -0700\, Reverend Chip wrote​:

On 7/19/2012 10​:15 AM\, Karl Williamson wrote​:

3) Consider this as acceptable collateral breakage\, document it\, and keep the warning for a cycle or two\, after which we prohibit unescaped literal left brackets.

The final possibility is only feasible if there is very little current breakage.

My 2c​: I think this is ideal\, since stripping \ on {} in regexes seems like something we should never have done in the first place.

Note that if we changed it so that escaped delimiters are no longer stripped\, *all* the following regexes would change their meaning; the last three would become compile errors\, while the first four would just silently start matching different things​:

  qr[^\[a-z\]$]

  qr# a   \#xxx   b   #x;

  qr(^\(x\)$);

  m?^xy\?$?

  qr!a(?\!b)!;

  qr\<a(?\\<foo\>b)>;

  qr|a(?\|foo)|;

-- Diplomacy is telling someone to go to hell in such a way that they'll look forward to the trip

p5pRT commented 12 years ago

From rmbarker.cpan@btinternet.com

I suggest this bug is marked as resolved. The only warnings now come from cpan/ modules. [perl#113094] is tracking the \{ issue for CPAN modules.

p5pRT commented 12 years ago

From @khwilliamson

resolved as requested

-- Karl Williamson

p5pRT commented 12 years ago

From [Unknown Contact. See original ticket]

resolved as requested

-- Karl Williamson

p5pRT commented 12 years ago

@khwilliamson - Status changed from 'open' to 'resolved'

p5pRT commented 12 years ago

From @chipdude

On 7/21/2012 6​:51 AM\, Dave Mitchell wrote​:

On Fri\, Jul 20\, 2012 at 08​:35​:27PM -0700\, Reverend Chip wrote​:

On 7/19/2012 10​:15 AM\, Karl Williamson wrote​:

3) Consider this as acceptable collateral breakage\, document it\, and keep the warning for a cycle or two\, after which we prohibit unescaped literal left brackets.

The final possibility is only feasible if there is very little current breakage. My 2c​: I think this is ideal\, since stripping \ on {} in regexes seems like something we should never have done in the first place. Note that if we changed it so that escaped delimiters are no longer stripped\, *all* the following regexes would change their meaning; the last three would become compile errors\, while the first four would just silently start matching different things​:

qr\[^\\\[a\-z\\\]$\]

qr\#  a
 \\\#xxx
 b
\#x;

qr\(^\\\(x\\\)$\);

m?^xy\\?$?

qr\!a\(?\\\!b\)\!;

qr\<a\(?\\\<foo\\>b\)>;

qr|a\(?\\|foo\)|;

No\, that's too much potential for user pain. I was mistaken.