Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.94k stars 554 forks source link

Constant strings representing a number can BECOME numbers #6266

Closed p5pRT closed 14 years ago

p5pRT commented 21 years ago

Migrated from rt.perl.org#20661 (status was 'resolved')

Searchable as RT20661$

p5pRT commented 21 years ago

From perl-5.8.0@ton.iguana.be

Created by perl-5.8.0@ton.iguana.be

perl -wle 'sub fun { print+(shift() & "+0") eq "0" ? "yes" : "no"} fun("waf"); fun(0); fun("waf")' perl -wle 'sub fun { my $str = shift; print "$str​: "\, ($str & "+0") eq "0" ? "numeric" : "string"} fun("waf"); fun(0); fun("waf")' waf​: string 0​: numeric Argument "waf" isn't numeric in bitwise and (&) at -e line 1. waf​: numeric

This was code attempting to determine if a passed argument was "really" numeric at the perl level or not.

However\, it seems that once "+0" has interacted in the & with a number\, it starts to behave like the number 0\, so from then on on strings you get the warning and and everything looks like a number.

While this is "normal" behaviour for variables\, I don't think that constant strings like "+0" should do that.

Perl Info ``` Flags: category=core severity=low Site configuration information for perl v5.8.0: Configured by ton at Tue Nov 12 01:56:18 CET 2002. Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration: Platform: osname=linux, osvers=2.4.19, archname=i686-linux-thread-multi-64int-ld uname='linux quasar 2.4.19 #5 wed oct 2 02:34:25 cest 2002 i686 unknown ' config_args='' hint=recommended, useposix=true, d_sigaction=define usethreads=define use5005threads=undef useithreads=define usemultiplicity=define useperlio=define d_sfio=undef uselargefiles=define usesocks=undef use64bitint=define use64bitall=undef uselongdouble=define usemymalloc=y, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2 -fomit-frame-pointer', cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -I/usr/local/include' ccversion='', gccversion='2.95.3 20010315 (release)', gccosandvers='' intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12 ivtype='long long', ivsize=8, nvtype='long double', nvsize=12, Off_t='off_t', lseeksize=8 alignbytes=4, prototype=define Linker and Libraries: ld='cc', ldflags =' -L/usr/local/lib' libpth=/usr/local/lib /lib /usr/lib libs=-lnsl -lndbm -ldb -ldl -lm -lpthread -lc -lposix -lcrypt -lutil perllibs=-lnsl -ldl -lm -lpthread -lc -lposix -lcrypt -lutil libc=/lib/libc-2.2.4.so, so=so, useshrplib=false, libperl=libperl.a gnulibc_version='2.2.4' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic' cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib' Locally applied patches: @INC for perl v5.8.0: /usr/lib/perl5/5.8.0/i686-linux-thread-multi-64int-ld /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i686-linux-thread-multi-64int-ld /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl . Environment for perl v5.8.0: HOME=/home/ton LANG (unset) LANGUAGE (unset) LD_LIBRARY_PATH (unset) LOGDIR (unset) PATH=/home/ton/bin.Linux:/home/ton/bin:/home/ton/bin.SampleSetup:/usr/local/bin:/usr/local/sbin:/usr/local/jre/bin:/home/oracle/product/9.0.1/bin:/usr/local/ar/bin:/usr/games/bin:/usr/X11R6/bin:/usr/share/bin:/usr/bin:/usr/sbin:/bin:/sbin:. PERL_BADLANG (unset) SHELL=/bin/bash ```
p5pRT commented 21 years ago

From @hvds

"perl-5.8.0@​ton.iguana.be (via RT)" \perlbug\-followup@​perl\.org wrote​: :perl -wle 'sub fun { print+(shift() & "+0") eq "0" ? "yes" : "no"} fun("waf"); fun(0); fun("waf")' :perl -wle 'sub fun { my $str = shift; print "$str​: "\, ($str & "+0") eq "0" ? "numeric" : "string"} fun("waf"); fun(0); fun("waf")' :waf​: string :0​: numeric :Argument "waf" isn't numeric in bitwise and (&) at -e line 1. :waf​: numeric : :This was code attempting to determine if a passed argument was "really" :numeric at the perl level or not. : :However\, it seems that once "+0" has interacted in the & with a number\, it :starts to behave like the number 0\, so from then on on strings you get the :warning and and everything looks like a number. : :While this is "normal" behaviour for variables\, I don't think that :constant strings like "+0" should do that.

Hmm\, tricky I think. In most cases\, caching conversion results for constants is a desirable thing to do.

In this case\, the easiest way to avoid it is not to be a constant​:   sub fun { my $z = "+0"; my $str = shift; print "$str​: "\, ($str & $z) eq "0" ? "numeric" :"string"}

What else is broken by upgrading constants in this way?

Hugo

p5pRT commented 21 years ago

From perl5-porters@ton.iguana.be

In article \200302120433\.h1C4XdX03020@​crypt\.compulink\.co\.uk\,   hv@​crypt.org writes​:

"perl-5.8.0@​ton.iguana.be (via RT)" \perlbug\-followup@​perl\.org wrote​: ​:perl -wle 'sub fun { print+(shift() & "+0") eq "0" ? "yes" : "no"} fun("waf"); fun(0); fun("waf")' ​:perl -wle 'sub fun { my $str = shift; print "$str​: "\, ($str & "+0") eq "0" ? "numeric" : "string"} fun("waf"); fun(0); fun("waf")' ​:waf​: string ​:0​: numeric ​:Argument "waf" isn't numeric in bitwise and (&) at -e line 1. ​:waf​: numeric ​: ​:This was code attempting to determine if a passed argument was "really" ​:numeric at the perl level or not. ​: ​:However\, it seems that once "+0" has interacted in the & with a number\, it ​:starts to behave like the number 0\, so from then on on strings you get the ​:warning and and everything looks like a number. ​: ​:While this is "normal" behaviour for variables\, I don't think that ​:constant strings like "+0" should do that.

Hmm\, tricky I think. In most cases\, caching conversion results for constants is a desirable thing to do.

Is it really ? How often would somebody write e.g.

$x += "1" ?

In this case\, the easiest way to avoid it is not to be a constant​: sub fun { my $z = "+0"; my $str = shift; print "$str​: "\, ($str & $z) eq "0" ? "numeric" :"string"}

What else is broken by upgrading constants in this way?

It probably only matters in the cases where the behaviour of string and number differ\, which is only the bitops I think. Since these are known for their iffy behaviour\, how about just doing the constant upgrade in general\, *except* when involved in a bitop ? If someone writes $var & "0"\, he probably *cares*.

Mmm\, can the constant conversion actually be done at compile time\, since it's already known there in what kind of operation the constant is involved ?

p5pRT commented 14 years ago

From @cpansprout

I believed the attached patch is self-explanatory.

p5pRT commented 14 years ago

From @cpansprout

Inline Patch ```diff diff -Nup blead/pp.c blead-20661-str-becomenig-num/pp.c --- blead/pp.c 2010-07-28 03:15:10.000000000 -0700 +++ blead-20661-str-becomenig-num/pp.c 2010-08-01 14:18:56.000000000 -0700 @@ -2387,6 +2387,10 @@ PP(pp_bit_and) { dPOPTOPssrl; if (SvNIOKp(left) || SvNIOKp(right)) { + const bool left_must_not_turn_into_a_number + = cBINOP->op_first->op_type == OP_CONST && !SvNIOKp(left); + const bool right_must_not_turn_into_a_number + = cBINOP->op_last->op_type == OP_CONST && !SvNIOKp(right); if (PL_op->op_private & HINT_INTEGER) { const IV i = SvIV_nomg(left) & SvIV_nomg(right); SETi(i); @@ -2395,6 +2399,8 @@ PP(pp_bit_and) const UV u = SvUV_nomg(left) & SvUV_nomg(right); SETu(u); } + if (left_must_not_turn_into_a_number) SvNIOK_off(left); + if (right_must_not_turn_into_a_number) SvNIOK_off(right); } else { do_vop(PL_op->op_type, TARG, left, right); @@ -2413,6 +2419,10 @@ PP(pp_bit_or) { dPOPTOPssrl; if (SvNIOKp(left) || SvNIOKp(right)) { + const bool left_must_not_turn_into_a_number + = cBINOP->op_first->op_type == OP_CONST && !SvNIOKp(left); + const bool right_must_not_turn_into_a_number + = cBINOP->op_last->op_type == OP_CONST && !SvNIOKp(right); if (PL_op->op_private & HINT_INTEGER) { const IV l = (USE_LEFT(left) ? SvIV_nomg(left) : 0); const IV r = SvIV_nomg(right); @@ -2425,6 +2435,8 @@ PP(pp_bit_or) const UV result = op_type == OP_BIT_OR ? (l | r) : (l ^ r); SETu(result); } + if (left_must_not_turn_into_a_number) SvNIOK_off(left); + if (right_must_not_turn_into_a_number) SvNIOK_off(right); } else { do_vop(op_type, TARG, left, right); diff -Nurp blead/t/op/bop.t blead-20661-str-becomenig-num/t/op/bop.t --- blead/t/op/bop.t 2009-11-19 08:51:40.000000000 -0800 +++ blead-20661-str-becomenig-num/t/op/bop.t 2010-08-01 14:06:50.000000000 -0700 @@ -15,7 +15,7 @@ BEGIN { # If you find tests are failing, please try adding names to tests to track # down where the failure is, and supply your new names as a patch. # (Just-in-time test naming) -plan tests => 161 + (10*13*2) + 4; +plan tests => 170 + (10*13*2) + 4; # numerics ok ((0xdead & 0xbeef) == 0x9ead); @@ -63,6 +63,20 @@ is (($foo | $bar), ($Aoz x 75 . $zap)); # ^ does not truncate is (($foo ^ $bar), ($Axz x 75 . $zap)); +# string constants +sub _and($) { $_[0] & "+0" } +sub _oar($) { $_[0] | "+0" } +sub _xor($) { $_[0] ^ "+0" } +is _and "waf", '# ', 'str var & const str'; # These three +is _and 0, '0', 'num var & const str'; # are from +is _and "waf", '# ', 'str var & const str again'; # [perl #20661] +is _oar "yit", '{yt', 'str var | const str'; +is _oar 0, '0', 'num var | const str'; +is _oar "yit", '{yt', 'str var | const str again'; +is _xor "yit", 'RYt', 'str var ^ const str'; +is _xor 0, '0', 'num var ^ const str'; +is _xor "yit", 'RYt', 'str var ^ const str again'; + # is ("ok \xFF\xFF\n" & "ok 19\n", "ok 19\n"); is ("ok 20\n" | "ok \0\0\n", "ok 20\n"); ```
p5pRT commented 14 years ago

From @cpansprout

On Aug 1\, 2010\, at 2​:22 PM\, Father Chrysostomos wrote​:

I believed the attached patch is self-explanatory. \<open_nKvdUEYU.txt>

Someone pointed out to me that it would likely be reviewed more quickly if I provided a commit message\, so here it is​:

This patch solves the problem of $x & "+0" not treating the RHS as a string if\, on a previous invocation\, the LHS happened to be a number (similarly with the other bitwise ops\, too). This patch takes the conservative approach of fixing *just* those cases that have explicit quotation marks in the source code\, which are clearly broken (and also ‘use constant’-style string constants\, which are indistinguishable). (Read-only variables are slightly controversial still\, and my patch does not affect those.)

It does this by making the appropriate pp_ funcitons look at the op tree to see whether either operand is a constant during a numeric bitwise operation. If it is\, and it is not numeric\, it turns off the numericness (numericality?) before returning.

p5pRT commented 14 years ago

From @cpansprout

On Wed Feb 12 14​:13​:32 2003\, perl5-porters@​ton.iguana.be wrote​:

It probably only matters in the cases where the behaviour of string and number differ\, which is only the bitops I think. Since these are known for their iffy behaviour\, how about just doing the constant upgrade in general\, *except* when involved in a bitop ? If someone writes $var & "0"\, he probably *cares*.

Commit b20c4ee1f stops bitops from coercing read-only arguments.

p5pRT commented 14 years ago

@cpansprout - Status changed from 'open' to 'resolved'