Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.98k stars 560 forks source link

tr/// against $1 causes persistent data #1964

Closed p5pRT closed 18 years ago

p5pRT commented 24 years ago

Migrated from rt.perl.org#3237 (status was 'resolved')

Searchable as RT3237$

p5pRT commented 24 years ago

From rootbeer@redcat.com

This is a bug report for perl from rootbeer@​redcat.com\, generated with the help of perlbug 1.28 running under perl v5.6.0.


The substitution near the end of this chunk of code should change $_ from 'fred' to 'Fred'\, but a previous value from $1 leaks in as well. This seems to happen only when $1 has been used in a tr///-for-counting operation.

  $_ = "fred";   /([a-z]{2})/;   $1 =~ tr/A-Z//;   print; # prints "fred"   s/^(\s*)f/$1F/;   print; # prints "frFred"

It's entertaining to run this to completion in the debugger (not stepping through it\, but just letting it run) then to read the "Debugged program terminated" message. Okay\, so it doesn't take much to amuse me. :-D



Flags​:   category=core   severity=medium


Site configuration information for perl v5.6.0​:

Configured by rootbeer at Sun Apr 23 14​:09​:13 PDT 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 0) configuration​:   Platform​:   osname=linux\, osvers=2.0.35\, archname=i586-linux   uname='linux localhost.localdomain 2.0.35 #1 tue jul 14 23​:56​:39 edt 1998 i586 unknown '   config_args=''   hint=recommended\, useposix=true\, d_sigaction=define   usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef   useperlio=undef d_sfio=undef uselargefiles=define   use64bitint=undef use64bitall=undef uselongdouble=undef usesocks=undef   Compiler​:   cc='cc'\, optimize='-g -DDEBUGGING -O2'\, gccversion=2.7.2.3   cppflags=''   ccflags =' -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'   stdchar='char'\, d_stdstdio=define\, usevfork=false   intsize=4\, longsize=4\, ptrsize=4\, doublesize=8   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=12   ivtype='long'\, ivsize=4\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=4   alignbytes=4\, usemymalloc=n\, prototype=define   Linker and Libraries​:   ld='cc'\, ldflags =' -L/usr/local/lib'   libpth=/usr/local/lib /lib /usr/lib   libs=-lnsl -lndbm -lgdbm -ldb -ldl -lm -lc -lposix -lcrypt   libc=/lib/libc-2.0.7.so\, so=so\, useshrplib=false\, libperl=libperl.a   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=so\, d_dlsymun=undef\, ccdlflags='-rdynamic'   cccdlflags='-fpic'\, lddlflags='-shared -L/usr/local/lib'

Locally applied patches​:  


@​INC for perl v5.6.0​:   /usr/lib/perl5/5.6.0/i586-linux   /usr/lib/perl5/5.6.0   /usr/lib/perl5/site_perl/5.6.0/i586-linux   /usr/lib/perl5/site_perl/5.6.0   /usr/lib/perl5/site_perl   .


Environment for perl v5.6.0​:   HOME=/home/rootbeer   LANG (unset)   LANGUAGE (unset)   LD_LIBRARY_PATH (unset)   LOGDIR (unset)   PATH=/usr/local/bin​:/bin​:/usr/bin​:/usr/X11R6/bin​:/usr/games​:/home/rootbeer/bin   PERL_BADLANG (unset)   SHELL=/usr/local/bin/bash

p5pRT commented 24 years ago

From @vanstyn

In \Pine\.GSO\.4\.10\.10005111759110\.16364\-100000@​user2\.teleport\.com\, Tom Phoenix writes​: :This is a bug report for perl from rootbeer@​redcat.com\, :generated with the help of perlbug 1.28 running under perl v5.6.0. : :----------------------------------------------------------------- : :The substitution near the end of this chunk of code should change $_ from :'fred' to 'Fred'\, but a previous value from $1 leaks in as well. This :seems to happen only when $1 has been used in a tr///-for-counting :operation. : : $_ = "fred"; : /([a-z]{2})/; : $1 =~ tr/A-Z//; : print; # prints "fred" : s/^(\s*)f/$1F/; : print; # prints "frFred"

This appears to occur because the tr/// sets POK on $1\, which bypasses the magic get that should happen in the substitution. The culprit is this line from doop.c​:595 in Perl_do_trans()​:   (void)SvPOK_only(sv);

I'm not sure quite why this line exists\, since removing it does not trigger any test failures\, but I suspect that it should occur only if the SV does not have a magic get (as in the attached patch). If someone can confirm or correct this belief\, I'll aim to add appropriate test cases.

Hugo

Inline Patch ```diff --- doop.c.old Sun Mar 12 03:36:32 2000 +++ doop.c Fri May 12 10:41:04 2000 @@ -592,7 +592,8 @@ return 0; if (!SvPOKp(sv)) (void)SvPV_force(sv, len); - (void)SvPOK_only(sv); + if (!SvGMAGICAL(sv)) + (void)SvPOK_only(sv); DEBUG_t( Perl_deb(aTHX_ "2.TBL\n")); ```
p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

Hugo \hv@​crypt\.compulink\.co\.uk wrote

This appears to occur because the tr/// sets POK on $1\, which bypasses the magic get that should happen in the substitution. The culprit is this line from doop.c​:595 in Perl_do_trans()​: (void)SvPOK_only(sv);

I'm not sure quite why this line exists\, since removing it does not trigger any test failures\, but I suspect that it should occur only if the SV does not have a magic get (as in the attached patch).

It exists to ensure that the attached tests pass. :-)

And I think the "correct" fix is to avoid the SvPOK_only in the non-updating case. Compare with the read-only test just above.

Patch for 5.6.0 and new tests attached.

And I've added comments to the new tests - perhaps that'll start a trend. :-)

Mike Guy

Inline Patch ```diff --- ./t/op/tr.t.orig Fri May 12 17:28:34 2000 +++ ./t/op/tr.t Fri May 12 17:34:18 2000 @@ -5,7 +5,7 @@ unshift @INC, "../lib"; } -print "1..4\n"; +print "1..6\n"; $_ = "abcdefghijklmnopqrstuvwxyz"; @@ -37,3 +37,19 @@ print "ok 4\n"; } # + +# make sure that tr cancels IOK and NOK +($x = 12) =~ tr/1/3/; +(my $y = 12) =~ tr/1/3/; +($f = 1.5) =~ tr/1/3/; +(my $g = 1.5) =~ tr/1/3/; +print "not " unless $x + $y + $f + $g == 71; +print "ok 5\n"; + +# make sure tr is harmless if not updating - see [ID 20000511.005] +$_ = 'fred'; +/([a-z]{2})/; +$1 =~ tr/A-Z//; +s/^(\s*)f/$1F/; +print "not " if $_ ne 'Fred'; +print "ok 6\n"; --- ./doop.c.orig Fri May 12 15:41:40 2000 +++ ./doop.c Fri May 12 17:40:21 2000 @@ -592,7 +592,8 @@ return 0; if (!SvPOKp(sv)) (void)SvPV_force(sv, len); - (void)SvPOK_only(sv); + if (!(PL_op->op_private & OPpTRANS_IDENTICAL)) + (void)SvPOK_only(sv); DEBUG_t( Perl_deb(aTHX_ "2.TBL\n")); End of patch ```
p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

I wrote

And I think the "correct" fix is to avoid the SvPOK_only in the non-updating case.

For some values of "correct" only. This second try ensures that the UTF8 flag doesn't get lost. (But tr/// on UTF8 values is broken in other ways. And the new tests are somewhat curiously written to circumvent UTF8 bugs in eq and other operators. But that's all the subject of another thread.)

I note that Robin Barker\, in his recent patch to vec()\, uses SvNIOK_off rather than SvPOK_only or SvPOK_only_UTF8. I'm unclear which is more correct. I can't find any case which fails with my test [*]; in partticular tied values are treated correctly although I don't understand why the magic isn't getting lost.

[*] Though I did find some strange examples of "correct" behaviour.   Can you predict what

  ($! = 28) =~ tr/2/3/

  will do?

Mike Guy

Inline Patch ```diff --- ./t/op/tr.t.orig Fri May 12 19:06:02 2000 +++ ./t/op/tr.t Sat May 13 09:48:39 2000 @@ -5,7 +5,7 @@ unshift @INC, "../lib"; } -print "1..4\n"; +print "1..8\n"; $_ = "abcdefghijklmnopqrstuvwxyz"; @@ -37,3 +37,27 @@ print "ok 4\n"; } # + +# make sure that tr cancels IOK and NOK +($x = 12) =~ tr/1/3/; +(my $y = 12) =~ tr/1/3/; +($f = 1.5) =~ tr/1/3/; +(my $g = 1.5) =~ tr/1/3/; +print "not " unless $x + $y + $f + $g == 71; +print "ok 5\n"; + +# make sure tr is harmless if not updating - see [ID 20000511.005] +$_ = 'fred'; +/([a-z]{2})/; +$1 =~ tr/A-Z//; +s/^(\s*)f/$1F/; +print "not " if $_ ne 'Fred'; +print "ok 6\n"; + +# check tr handles UTF8 correctly +($x = 256.65.258) =~ tr/a/b/; +print "not " if $x ne 256.65.258 or length $x != 3; +print "ok 7\n"; +$x =~ tr/A/B/; +print "not " if $x ne 256.66.258 or length $x != 3; +print "ok 8\n"; --- ./doop.c.orig Fri May 12 19:06:03 2000 +++ ./doop.c Sat May 13 09:54:44 2000 @@ -592,7 +592,8 @@ return 0; if (!SvPOKp(sv)) (void)SvPV_force(sv, len); - (void)SvPOK_only(sv); + if (!(PL_op->op_private & OPpTRANS_IDENTICAL)) + (void)SvPOK_only_UTF8(sv); DEBUG_t( Perl_deb(aTHX_ "2.TBL\n")); End of patch ```
p5pRT commented 24 years ago

From [Unknown Contact. See original ticket]

M.J.T. Guy writes​:

 Can you predict what
        \($\! = 28\) =~ tr/2/3/
 will do?

Lemme try to predict what it *should* do. $! is assignable as a number only\, but accessible any way. So tr/// is performed on the message of errno==28. But then the converted message is either assigned back\, or (as an optimization) the assignment is skipped (since most probably there is no '2' in the message).

In the first case numerical value of the message (thus 0) is put in $!. In the second case tr/// is a NOP.

Checking... I get

  Argument "Disk full" isn't numeric in trans at -e line 1.

So there is no skip-assignment optimization...

Ilya

p5pRT commented 18 years ago

From @smpeters

[RT_System - Thu May 11 20​:40​:52 2000]​:

In \Pine\.GSO\.4\.10\.10005111759110\.16364\-100000@​user2\.teleport\.com\, Tom Phoenix writes​: :This is a bug report for perl from rootbeer@​redcat.com\, :generated with the help of perlbug 1.28 running under perl v5.6.0. : :----------------------------------------------------------------- : :The substitution near the end of this chunk of code should change $_ from :'fred' to 'Fred'\, but a previous value from $1 leaks in as well. This :seems to happen only when $1 has been used in a tr///-for-counting :operation. : : $_ = "fred"; : /([a-z]{2})/; : $1 =~ tr/A-Z//; : print; # prints "fred" : s/^(\s*)f/$1F/; : print; # prints "frFred"

This appears to occur because the tr/// sets POK on $1\, which bypasses the magic get that should happen in the substitution. The culprit is this line from doop.c​:595 in Perl_do_trans()​: (void)SvPOK_only(sv);

I'm not sure quite why this line exists\, since removing it does not trigger any test failures\, but I suspect that it should occur only if the SV does not have a magic get (as in the attached patch). If someone can confirm or correct this belief\, I'll aim to add appropriate test cases.

Hugo --- doop.c.old Sun Mar 12 03​:36​:32 2000 +++ doop.c Fri May 12 10​:41​:04 2000 @​@​ -592\,7 +592\,8 @​@​ return 0; if (!SvPOKp(sv)) (void)SvPV_force(sv\, len); - (void)SvPOK_only(sv); + if (!SvGMAGICAL(sv)) + (void)SvPOK_only(sv);

 DEBUG\_t\( Perl\_deb\(aTHX\_ "2\.TBL\\n"\)\);

It looks like this patch or the other patch in this ticket thread were never applied. Instead\, the following change seemed to take care of things.

Change 17984 by rgs@​rgs-home on 2002/10/09 19​:17​:08

  Fix bug #17823 : non-modifying tr/// stringifies references

Affected files ...

... //depot/perl/doop.c#129 edit ... //depot/perl/t/op/tr.t#33 edit

Differences ...

==== //depot/perl/doop.c#129 (text) ====

@​@​ -608\,10 +608\,11 @​@​   (void)SvPV(sv\, len);   if (!len)   return 0; - if (!SvPOKp(sv)) - (void)SvPV_force(sv\, len); - if (!(PL_op->op_private & OPpTRANS_IDENTICAL)) + if (!(PL_op->op_private & OPpTRANS_IDENTICAL)) { + if (!SvPOKp(sv)) + (void)SvPV_force(sv\, len);   (void)SvPOK_only_UTF8(sv); + }

  DEBUG_t( Perl_deb(aTHX_ "2.TBL\n"));

==== //depot/perl/t/op/tr.t#33 (xtext) ====

@​@​ -6\,7 +6\,7 @​@​   require './test.pl'; }

-plan tests => 97; +plan tests => 99;

my $Is_EBCDIC = (ord('i') == 0x89 & ord('J') == 0xd1);

@​@​ -379\,3 +379\,7 @​@​ eval '$foo{bar} =~ tr/N/N/'; is( $@​\, ''\, 'implicit count outside hash bounds' ); is( scalar keys %foo\, 0\, " doesn't extend the hash"); + +$x = \"foo"; +is( $x =~ tr/A/A/\, 2\, 'non-modifying tr/// on a scalar ref' ); +is( ref $x\, 'SCALAR'\, " doesn't stringify its argument" );

p5pRT commented 18 years ago

@smpeters - Status changed from 'open' to 'resolved'