Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

\U ... \Q ... \E ... \E #8846

Open p5pRT opened 17 years ago

p5pRT commented 17 years ago

Migrated from rt.perl.org#42043 (status was 'open')

Searchable as RT42043$

p5pRT commented 17 years ago

From @abigail

Created by @abigail

As far as I know, the effect of a \U, \L or \Q is cancelled by the first \E encountered, and no nesting happens.

And as long as you use \U and \L, it works this way. However, \Q seems to behave differently. \Q seems to match with a matching \E, and work from the inside out.

I don't know what "\Q-\Q-\E-\E-" is supposed to be; both '\-\---' and '\-\\Q\---' have some logic (although the former would be my favourite). However, currently "\Q-\Q-\E-\E-" equals '\-\\\-\--', which seems plain wrong to me.

The following tests fail 5 times (tests 2, 5, 7, 8, and 9), while I think they should all pass.

  #!/usr/bin/perl

  use strict;
  use warnings;
  no warnings 'syntax';

  use Test::More tests => 9;

  is "Aa-\UBb-\LCc-\EDd-\EEe-", 'Aa-BB-cc-Dd-Ee-', '\L after \U';
  is "Aa-\UBb-\QCc-\EDd-\EEe-", 'Aa-BB-CC\-Dd-Ee-', '\Q after \U';
  is "Aa-\UBb-\UCc-\EDd-\EEe-", 'Aa-BB-CC-Dd-Ee-', '\U after \U';

  is "Aa-\LBb-\UCc-\EDd-\EEe-", 'Aa-bb-CC-Dd-Ee-', '\U after \L';
  is "Aa-\LBb-\QCc-\EDd-\EEe-", 'Aa-bb-cc\-Dd-Ee-', '\Q after \L';
  is "Aa-\LBb-\LCc-\EDd-\EEe-", 'Aa-bb-cc-Dd-Ee-', '\L after \L';

  is "Aa-\QBb-\UCc-\EDd-\EEe-", 'Aa-Bb\-CC\-Dd-Ee-', '\U after \Q';
  is "Aa-\QBb-\LCc-\EDd-\EEe-", 'Aa-Bb\-cc\-Dd-Ee-', '\L after \Q';
  is "Aa-\QBb-\QCc-\EDd-\EEe-", 'Aa-Bb\-Cc\-Dd-Ee-', '\Q after \Q';

  __END__

Perl Info

```
Flags:
    category=core
    severity=low

Site configuration information for perl v5.8.8:

Configured by abigail at Fri Feb 3 23:37:31 CET 2006.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.4.18-bf2.4, archname=i686-linux-64int-ld
    uname='linux alexandra 2.4.18-bf2.4 #1 son apr 14 09:53:28 cest 2002 i686 unknown '
    config_args='-des -Dusemorebits -Uversiononly -Dmydomain=.abigail.be -Dcf_email=abigail@abigail.be -Dperladmin=abigail@abigail.be -Doptimize=-g -Dcc=gcc -Dprefix=/opt/perl'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=define use64bitall=undef uselongdouble=define
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-DDEBUGGING -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-g',
    cppflags='-DDEBUGGING -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include'
    ccversion='', gccversion='3.0.4', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lnsl -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.2.5.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.2.5'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    defined-or

@INC for perl v5.8.8:
    /home/abigail/Perl
    /opt/perl/lib/5.8.8/i686-linux-64int-ld
    /opt/perl/lib/5.8.8
    /opt/perl/lib/site_perl/5.8.8/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.8
    /opt/perl/lib/site_perl/5.8.7/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.7
    /opt/perl/lib/site_perl/5.8.6/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.6
    /opt/perl/lib/site_perl/5.8.5/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.5
    /opt/perl/lib/site_perl/5.8.4/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.4
    /opt/perl/lib/site_perl/5.8.3/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.3
    /opt/perl/lib/site_perl/5.8.2/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.2
    /opt/perl/lib/site_perl/5.8.1/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.1
    /opt/perl/lib/site_perl/5.8.0/i686-linux-64int-ld
    /opt/perl/lib/site_perl/5.8.0
    /opt/perl/lib/site_perl
    .

Environment for perl v5.8.8:
    HOME=/home/abigail
    LANG=C
    LANGUAGE (unset)
    LD_LIBRARY_PATH=/home/abigail/Lib:/usr/local/lib:/usr/lib:/lib:/usr/X11R6/lib
    LOGDIR (unset)
    PATH=/home/abigail/Bin:/opt/perl/bin:/usr/local/bin:/usr/local/X11/bin:/usr/bin:/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/X11R6/bin:/usr/games:/usr/share/texmf/bin:/opt/Acrobat/bin:/opt/java/blackdown/j2sdk1.3.1/bin:/usr/local/games/bin
    PERL5LIB=/home/abigail/Perl
    PERLDIR=/opt/perl
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```

p5pRT commented 17 years ago

From rick@bort.ca

On Mar 23 2007, Abigail wrote:

> As far as I know, the effect of a \U, \L or \Q is cancelled by the first \E encountered, and no nesting happens.
>
> And as long as you use \U and \L, it works this way. However, \Q seems to behave differently. \Q seems to match with a matching \E, and work from the inside out.
>
> I don't know what "\Q-\Q-\E-\E-" is supposed to be; both '\-\---' and '\-\\Q\---' have some logic (although the former would be my favourite). However, currently "\Q-\Q-\E-\E-" equals '\-\\\-\--', which seems plain wrong to me.

I think this is all covered in L<perlop/"Gory details of parsing quoted constructs">. Basically, everything after \Q is double-quote interpolated before escaping, which includes interpolating \Q sequences. So these are all equivalent:

  print "\Q-\Q-\E-\E";
  print quotemeta("-\Q-\E-\E");
  print quotemeta("-" . "\Q-\E-\E");
  print quotemeta("-" . quotemeta("-\E") . "-\E");
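A minimal sketch of the same point, applied to the string from the original report (with its trailing "-"). The hand-expanded form is an assumption drawn from the explanation above, not from the perl source: each \Q is taken to open a quotemeta() call that its matching \E closes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The interpolated form from the original report.
my $nested = "\Q-\Q-\E-\E-";

# Hand-expanded form (assumption): the inner \Q...\E is quoted first,
# then the outer \Q...\E quotes the result again. The final "-" comes
# after the last \E and is left alone.
my $by_hand = quotemeta("-" . quotemeta("-") . "-") . "-";

print "nested:  $nested\n";
print "by hand: $by_hand\n";
print $nested eq $by_hand ? "same\n" : "different\n";
```

If the script prints "different" on a given perl, the expansion assumed here does not describe that version's parser.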

-- Rick Delaney rick@bort.ca

p5pRT commented 17 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 17 years ago

From david@landgren.net

Rick Delaney wrote:

> On Mar 23 2007, Abigail wrote:
>
> > As far as I know, the effect of a \U, \L or \Q is cancelled by the first \E encountered, and no nesting happens.
> >
> > And as long as you use \U and \L, it works this way. However, \Q seems to behave differently. \Q seems to match with a matching \E, and work from the inside out.
> >
> > I don't know what "\Q-\Q-\E-\E-" is supposed to be; both '\-\---' and '\-\\Q\---' have some logic (although the former would be my favourite). However, currently "\Q-\Q-\E-\E-" equals '\-\\\-\--', which seems plain wrong to me.
>
> I think this is all covered in L<perlop/"Gory details of parsing quoted constructs">. Basically, everything after \Q is double-quote interpolated before escaping, which includes interpolating \Q sequences. So these are all equivalent:
>
>   print "\Q-\Q-\E-\E";
>   print quotemeta("-\Q-\E-\E");
>   print quotemeta("-" . "\Q-\E-\E");
>   print quotemeta("-" . quotemeta("-\E") . "-\E");

I think the only real bug in all this is

  % perl -le "print qq{\Un\lext}"
  NEXT

According to the docs, that should print "NeXT". I can see where and why in the parser it comes out like this, but at this late stage of the game I think it would be better to amend the documentation to say that \l and \u are ignored if a \U or \L is in force.
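A minimal sketch of why the result comes out fully upper-cased. The hand-expanded form is an assumption, not taken from the perl source: \U is taken to wrap the rest of the string in uc(), and \l to wrap what follows it in lcfirst(), so the outer uc() undoes the inner lcfirst().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The interpolated string from the one-liner above.
my $interpolated = "\Un\lext";

# Hand-expanded form (assumption): lcfirst() runs on "ext" first,
# then uc() is applied to the whole concatenation.
my $by_hand = uc("n" . lcfirst("ext"));

print "interpolated: $interpolated\n";
print "by hand:      $by_hand\n";
```

If the two lines differ on a given perl, the expansion assumed here does not match that version's parser.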

There's a bug (#9360) open on this issue.

David

bram-perl commented 2 years ago

> There's a bug (#9360) open on this issue.

For reference: in GitHub this is actually issue #5467