Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

A bug on constant overloading #9184

Open p5pRT opened 16 years ago

p5pRT commented 16 years ago

Migrated from rt.perl.org#49594 (status was 'open')

Searchable as RT49594$

p5pRT commented 16 years ago

From g.psy.va@gmail.com

This is a bug report for perl from g.psy.va@gmail.com, generated with the help of perlbug 1.36 running under perl 5.10.0.

I've found a bug on overload::constant() with escaped characters (e.g. "\n", "\t")

Please try to do the attached test file, "tsh.t".

I did it on perl 5.8.8 (for Cygwin), and 5.10.0 (for Windows and Cygwin), but the result is 'FAIL'. (however, I can't find what is broken)

Thanks.

Goro Fuji g.psy.va@gmail.com


Flags:
    category=core
    severity=low

Site configuration information for perl 5.10.0:

Configured by garo at Sun Jan 6 08:39:07 JST 2008.

Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
  Platform:
    osname=cygwin, osvers=1.5.25(0.15642), archname=cygwin-thread-multi
    uname='cygwin_nt-5.1 goro 1.5.25(0.15642) 2007-12-14 19:21 i686 cygwin '
    config_args='-de -Dmksymlnks -Doptimize=-O3 -Dusethreads -Acccdlflags=-s -Alddlflags=-s -Accdlflags=-s -Dman3ext=3pm'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=define, usemultiplicity=define
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=undef, use64bitall=undef, uselongdouble=undef
    usemymalloc=y, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -fno-strict-aliasing -pipe',
    optimize='-O3',
    cppflags='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -fno-strict-aliasing -pipe'
    ccversion='', gccversion='3.4.4 (cygming special, gdc 0.12, using dmd 0.125)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='g++', ldflags =' -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--stack,8388608 -Wl,--enable-auto-image-base -Wl,--enable-auto-import -L/usr/local/lib'
    libpth=/usr/local/lib /usr/lib /lib
    libs=-ldl -lcrypt
    perllibs=-ldl -lcrypt
    libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' -s -s'
    cccdlflags=' -s', lddlflags=' --shared -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--stack,8388608 -Wl,--enable-auto-image-base -Wl,--enable-auto-import -s -L/usr/local/lib'

Locally applied patches:

@INC for perl 5.10.0:
    /usr/local/lib/perl5/5.10.0/cygwin-thread-multi
    /usr/local/lib/perl5/5.10.0
    /usr/local/lib/perl5/site_perl/5.10.0/cygwin-thread-multi
    /usr/local/lib/perl5/site_perl/5.10.0
    .

Environment for perl 5.10.0:
    HOME=/home/garo
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/usr/bin:/cygdrive/c/Perl5.10/bin:/cygdrive/c/WINDOWS/system32:/cygdrive/c/WINDOWS:/cygdrive/c/WINDOWS/System32/Wbem:/cygdrive/c/Program Files/ATI Technologies/ATI Control Panel:/cygdrive/c/Program Files/Microsoft SQL Server/90/Tools/binn/
    PERL_BADLANG (unset)
    SHELL (unset)

p5pRT commented 16 years ago

From g.psy.va@gmail.com

I'm sorry, I forgot to attach the test file, which is the most important part.

p5pRT commented 16 years ago

From g.psy.va@gmail.com

#!/usr/bin/perl -w
use strict;
BEGIN{
    package TSH; # Test for String Handler
    use overload q{""} => 'stringify' ;

    sub import{
        overload::constant( 'q' => \&string_handler );
    }

    sub string_handler
    {
        my($src, $str, $ctx) = @_;
        return bless \$str;
    }
    sub stringify{ "stringify(${$_[0]})"; }
}

use Test::More tests => 5;

my($v1, $v2, $v3, $v4, $v5);
{
    BEGIN{ TSH->import() }
    $v1 = "foo";
    $v2 = "foo\tbar";
    $v3 = "\x41";
    $v4 = "\"foo\"";
    $v5 = '\'foo\'';
}

isa_ok $v1, 'TSH', '(success)';
isa_ok $v2, 'TSH', '(failure)';
isa_ok $v3, 'TSH', '(failure)';
isa_ok $v4, 'TSH', '(success)';
isa_ok $v5, 'TSH', '(success)';
use Data::Dumper;

diag( Data::Dumper->Dump([$v1, $v2, $v3, $v4, $v5], [qw(*v1 *v2 *v3 *v4 *v5)]) );
diag( '$v2 is expected to be an object reference like $v1.' );

p5pRT commented 16 years ago

From @schwern

Goro Fuji (via RT) wrote​:

I've found a bug on overload::constant() with escaped characters (e.g. "\n", "\t")

Please try to do the attached test file, "tsh.t".

I did it on perl 5.8.8 (for Cygwin), and 5.10.0 (for Windows and Cygwin), but the result is 'FAIL'. (however, I can't find what is broken)

It's interesting that the stringification overloading is working, in the sense that the strings got their assignments, the scalars are just not identifying themselves as objects.

    # This all passes.
    is $v1, qq[stringify(foo)];
    is $v2, qq[stringify(foo\tbar)];
    is $v3, qq[stringify(\x41)];
    is $v4, qq[stringify("foo")];
    is $v5, qq[stringify('foo')];

Devel::Peek shows the internal difference between $v1 and $v2.

      DB<5> x Dump $v1
    SV = PVMG(0x196a884) at 0x18d0c50
      REFCNT = 2
      FLAGS = (PADMY,ROK)
      IV = 0
      NV = 0
      RV = 0x172140
      SV = PVMG(0x18d00b0) at 0x172140
        REFCNT = 2
        FLAGS = (PADMY,OBJECT,POK,OVERLOAD,pPOK)
        IV = 0
        NV = 0
        PV = 0x6dd0b0 "foo"\0
        CUR = 3
        LEN = 4
        STASH = 0x181cfa0 "TSH"
      PV = 0x172140 ""
      CUR = 0
      LEN = 0
      empty array

      DB<4> x Dump $v2
    SV = PV(0x1801bf8) at 0x1739f0
      REFCNT = 2
      FLAGS = (PADMY,POK,pPOK)
      PV = 0x410450 "stringify(foo\tbar)"\0
      CUR = 18
      LEN = 20
      empty array

For some reason the overloadedness of the constant string is not being transferred to the scalar when there's any sort of escape sequence in the string. Even \060 (the ASCII character "0" written as an octal escape) causes it.

The escape sequence is probably just sending the code down a different path that misses the overloading.
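To see which literals keep their object identity on a given perl, the observation above can be turned into a small standalone check. This is only a sketch modelled on the attached test: the variable names and the report format are made up for illustration, and no particular result for the escaped literals is assumed, since that is exactly what varies.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

BEGIN {
    package TSH;    # same shape as the handler package in the report
    use overload q{""} => sub { "stringify(${ $_[0] })" };

    sub import {
        # Turn every string constant in the caller's scope into a TSH object.
        overload::constant( q => sub {
            my ($src, $str, $ctx) = @_;
            return bless \$str, __PACKAGE__;
        } );
    }
}

my ($plain, $tab, $hex, $octal, $quoted);
{
    BEGIN { TSH->import }    # constant overloading is active in this block only
    $plain  = "foo";
    $tab    = "foo\tbar";
    $hex    = "\x41";
    $octal  = "\060";
    $quoted = "\"foo\"";
}

# Report, for each literal, whether the scalar still holds an object.
for my $case ( [ plain => $plain ], [ tab => $tab ], [ hex => $hex ],
               [ octal => $octal ], [ quoted => $quoted ] ) {
    my ($name, $value) = @$case;
    my $class = blessed($value);
    printf "%-7s %s\n", $name,
        defined $class ? "object of class $class" : 'plain string';
}
```

On an affected perl the escaped cases should report a plain string while the others report an object; the stringified values are the same either way, which is why the `is` checks above pass.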

--
Insulting our readers is part of our business model.
http://somethingpositive.net/sp07122005.shtml

p5pRT commented 16 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 12 years ago

From @doy

So I tracked this through the lexer, and this is what I've come up with:

When perl is reading a string, the first pass doesn't handle interpolation. It first just reads the string verbatim (except that it unescapes all escaped delimiters), and then checks to see if it has any characters in that buffer which could require interpolation (either variable or escape sequence interpolation). If it does, it goes through to try to break down the string and do whatever interpolation it can, and then sticks a stringify op on top of that.

In the case of "$foo", this just builds an optree of stringify(padsv), and nothing else is done.

In the case of "foo $bar baz", this is broken down into "foo " . $bar . " baz", and the "foo " and " baz" bits are fed to overload::constant. Then, when that optree goes through the optimizer, it sees stringify(concat(concat(const, padsv), const)), and realizes that stringify(concat(...)) is redundant, and optimizes the stringify away (so the const ops are never modified from what the overload::constant sub returned).

In the case of "foo\tbar", it reads through the string, replaces \t with a literal tab, and passes the resulting string to overload::constant as before. Then, when that optree goes through the optimizer, it sees just stringify(const), which can be constant folded, so the optimizer folds it, leaving a const op containing the stringified form of whatever the overload::constant sub returned.
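The three cases above can be watched from Perl by logging what the handler is given at compile time. This is a rough sketch, not part of the report: the `LogStrings` package and the `@SEEN` array are hypothetical names, and the handler returns the string unchanged so that only the lexer's behaviour is on display.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use overload ();    # load the module only; overload::constant is called below

our @SEEN;          # filled at compile time: [ source text, value, context ]

BEGIN {
    package LogStrings;    # hypothetical helper package

    sub import {
        overload::constant( q => sub {
            my ($src, $str, $ctx) = @_;
            push @main::SEEN, [ $src, $str, $ctx ];
            return $str;    # hand the constant back unchanged
        } );
    }
}

my $bar = 'BAR';
my @results;
{
    BEGIN { LogStrings->import }    # logging is active in this block only
    @results = ( "$bar", "foo $bar baz", "foo\tbar" );
}

# Print every chunk the handler received while the block was compiled.
for my $entry (@SEEN) {
    my ($src, $str, $ctx) = @$entry;
    (my $shown = $str) =~ s/\t/<TAB>/g;    # make a literal tab visible
    print "context=$ctx source=[$src] value=[$shown]\n";
}
```

The log should show the constant pieces of the interpolated string arriving separately and the `"foo\tbar"` literal arriving with the tab already expanded, which matches the description above.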

I really have no idea where to go from here. I've gotten to the point where I can read the lexer, but not to the point where I'm capable of modifying it yet (without breaking things all over the place).

p5pRT commented 12 years ago

From @doy

Note that #101640 is a pretty close variant of this bug. That one is caused because, for some reason that isn't quite clear to me, the optimizer doesn't optimize away the stringify in stringify(concat(...)) in the very specific case of "my $x; $x = ..." where ... consists of more than one concatenation.
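One way to check whether a stringify op survives in a case like that is to dump the compiled optree with B::Concise. The sketch below is an assumption-laden illustration: the sub is a made-up stand-in for the "my $x; $x = ..." shape, it does not install any overload::constant handler, and the ops printed differ between perl versions, so no particular output is claimed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical stand-in for the shape described above: a declared
# variable, then an assignment from a string with two interpolations.
my $code = sub {
    my ($p, $q) = @_;
    my $x;
    $x = "a $p b $q c";
    return $x;
};

# Render the sub's ops, in execution order, into a string buffer.
my $walker = B::Concise::compile('-exec', $code);
B::Concise::walk_output(\my $optree);
$walker->();

print $optree;
print $optree =~ /\bstringify\b/
    ? "a stringify op is present\n"
    : "no stringify op found\n";
```

Running the same sub body inside a scope with a constant handler installed would show what the optimizer leaves behind in the overloaded case.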