Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/

peephole optimiser could prune more dead code #10481

Closed p5pRT closed 4 years ago

p5pRT commented 14 years ago

Migrated from rt.perl.org#76438 (status was 'open')

Searchable as RT76438$

p5pRT commented 14 years ago

From @nwc10

Created by @nwc10

    $ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
    '???';
    -e syntax OK

but

    $ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
    if ($a and !1) {
        print $_;
    }
    -e syntax OK

which demonstrates that "Pie" eq "Good" is constant folded, but that the optree for the block still exists.

The peephole optimiser is correct not to optimise this to nothing, as it can't know that $a is neither tied nor overloaded, so cannot assume that the lookup of $a has no side effects.

However, it can know that the conditional to the if block is always false, and so could optimise away the ops for the block, freeing up their memory. Hence the code should become

  $a and !1;

or even the perl equivalent of

  (void) (bool) $a;

Wishlist, because I've no idea how much real world perl code ends up with constructions like this, and would benefit.
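To make the tied case concrete, here is a minimal sketch. The `CountingScalar` class is hypothetical (it is not part of the report): its `FETCH` bumps a counter, so reading `$a` has an observable side effect even though the block can never run.

```perl
use strict;
use warnings;

# Hypothetical tie class for illustration: counts how often the scalar is read.
package CountingScalar;

sub TIESCALAR {
    my ($class, $value) = @_;
    return bless { value => $value, fetches => 0 }, $class;
}

sub FETCH {
    my ($self) = @_;
    $self->{fetches}++;          # the side effect an optimiser must preserve
    return $self->{value};
}

sub STORE {
    my ($self, $value) = @_;
    $self->{value} = $value;
}

package main;

our $a;
my $counter = tie $a, 'CountingScalar', 1;

my $block_ran = 0;
if ($a && "Pie" eq "Good") {     # right-hand side folds to a false constant
    $block_ran = 1;              # provably dead code
}

printf "FETCH calls: %d, block ran: %d\n", $counter->{fetches}, $block_ran;
```

Dropping the whole statement would lose the `FETCH` call; dropping only the block would not change anything this script can observe.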

Nicholas Clark

Perl Info

```
Flags:
    category=core
    severity=wishlist

Site configuration information for perl 5.13.2:

Configured by nick at Fri Jul 9 10:52:23 BST 2010.

Summary of my perl5 (revision 5 version 13 subversion 2) configuration:
  Derived from: a2d3de138935fbe8f4190ee9176b8fdd812a91d5
  Platform:
    osname=linux, osvers=2.6.18.8-xenu, archname=x86_64-linux
    uname='linux eris 2.6.18.8-xenu #1 smp sat oct 3 10:27:42 bst 2009 x86_64 gnulinux '
    config_args='-Dusedevel=y -Dcc=ccache gcc -Dld=gcc -Ubincompat5005 -Uinstallusrbinperl -Dcf_email=nick@ccl4.org -Dperladmin=nick@ccl4.org -Dinc_version_list= -Dinc_version_list_init=0 -Doptimize=-Os -Uusethreads -Duse64bitall -Uusemymalloc -Duseperlio -Dprefix=~/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1 -Uusevendorprefix -Uvendorprefix=~/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1 -Dinstallman1dir=none -Dinstallman3dir=none -Uuserelocatableinc -Accccflags=-DPERL_OLD_COPY_ON_WRITE -de'
    hint=recommended, useposix=true, d_sigaction=define
    useithreads=undef, usemultiplicity=undef
    useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
    use64bitint=define, use64bitall=define, uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='ccache gcc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-Os',
    cppflags='-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
    ccversion='', gccversion='4.3.2', gccosandvers=''
    intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
    ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=8, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -fstack-protector -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib /lib64 /usr/lib64
    libs=-lnsl -ldb -ldl -lm -lcrypt -lutil -lc
    perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
    libc=/lib/libc-2.7.so, so=so, useshrplib=false, libperl=libperl.a
    gnulibc_version='2.7'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
    cccdlflags='-fPIC', lddlflags='-shared -Os -L/usr/local/lib -fstack-protector'

Locally applied patches:

@INC for perl 5.13.2:
    lib
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/site_perl/5.13.2/x86_64-linux
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/site_perl/5.13.2
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/5.13.2/x86_64-linux
    /home/nick/Sandpit/snap5.9.x-v5.13.2-220-ga2d3de1/lib/perl5/5.13.2
    .

Environment for perl 5.13.2:
    HOME=/home/nick
    LANG (unset)
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/home/nick/bin:/usr/local/bin:/usr/bin:/bin:/usr/games:/usr/local/sbin:/sbin:/usr/sbin
    PERL_BADLANG (unset)
    SHELL=/bin/bash
```

p5pRT commented 14 years ago

From james@mastros.biz

On 9 July 2010 16:56, Nicholas Clark <perlbug-followup@perl.org> wrote:

> $ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
> '???';
> -e syntax OK
>
> but
>
> $ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
> if ($a and !1) {
>     print $_;
> }
> -e syntax OK
>
> which demonstrates that "Pie" eq "Good" is constant folded, but that the optree for the block still exists.
>
> The peephole optimiser is correct not to optimise this to nothing, as it can't know that $a is neither tied nor overloaded, so cannot assume that the lookup of $a has no side effects.
>
> However, it can know that the conditional to the if block is always false, and so could optimise away the ops for the block, freeing up their memory. Hence the code should become
>
>     $a and !1;
>
> or even the perl equivalent of
>
>     (void) (bool) $a;
>
> Wishlist, because I've no idea how much real world perl code ends up with constructions like this, and would benefit

I do wonder, sometimes, if we worry entirely too much about just when tie and overload calls are done. Would it break actual real-world code to not retrieve the value of $a (tie) or not boolify it (overload) when the value will be thrown away anyway? Clearly, we can't do this in a maintenance release, but perhaps we should add warnings that we are planning on doing it to 5.14.0? It seems to me that doing this would allow all sorts of optimizations that we currently think of, and then say "that'd change overloading", and throw out, with very little impact on real-world code, which either doesn't use overloading, or would be happy if overloading were made faster by avoiding it where possible.

  -=- James Mastros / theorbtwo

p5pRT commented 14 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 14 years ago

From @jbenjore

On Fri, Jul 9, 2010 at 8:56 AM, Nicholas Clark <perlbug-followup@perl.org> wrote:

> # New Ticket Created by Nicholas Clark
> # Please include the string: [perl #76438]
> # in the subject line of all future correspondence about this issue.
> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=76438 >
>
> This is a bug report for perl from nick@ccl4.org, generated with the help of perlbug 1.39 running under perl 5.13.2.
>
> -----------------------------------------------------------------
> [Please describe your issue here]
>
> $ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
> '???';
> -e syntax OK
>
> but
>
> $ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
> if ($a and !1) {
>     print $_;
> }
> -e syntax OK
>
> which demonstrates that "Pie" eq "Good" is constant folded, but that the optree for the block still exists.
>
> The peephole optimiser is correct not to optimise this to nothing, as it can't know that $a is neither tied nor overloaded, so cannot assume that the lookup of $a has no side effects.
>
> However, it can know that the conditional to the if block is always false, and so could optimise away the ops for the block, freeing up their memory. Hence the code should become
>
>     $a and !1;
>
> or even the perl equivalent of
>
>     (void) (bool) $a;

Well actually, the 'bool' call was in scalar context, not void.

Josh

p5pRT commented 14 years ago

From @demerphq

On 10 July 2010 13:36, James Mastros <james@mastros.biz> wrote:

> On 9 July 2010 16:56, Nicholas Clark <perlbug-followup@perl.org> wrote:
>
> > $ ./perl -Ilib -MO=Deparse -e 'if ("Pie" eq "Good") {print}'
> > '???';
> > -e syntax OK
> >
> > but
> >
> > $ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
> > if ($a and !1) {
> >     print $_;
> > }
> > -e syntax OK
> >
> > which demonstrates that "Pie" eq "Good" is constant folded, but that the optree for the block still exists.
> >
> > The peephole optimiser is correct not to optimise this to nothing, as it can't know that $a is neither tied nor overloaded, so cannot assume that the lookup of $a has no side effects.
> >
> > However, it can know that the conditional to the if block is always false, and so could optimise away the ops for the block, freeing up their memory. Hence the code should become
> >
> >     $a and !1;
> >
> > or even the perl equivalent of
> >
> >     (void) (bool) $a;
> >
> > Wishlist, because I've no idea how much real world perl code ends up with constructions like this, and would benefit
>
> I do wonder, sometimes, if we worry entirely too much about just when tie and overload calls are done. Would it break actual real-world code to not retrieve the value of $a (tie) or not boolify it (overload) when the value will be thrown away anyway? Clearly, we can't do this in a maintenance release, but perhaps we should add warnings that we are planning on doing it to 5.14.0? It seems to me that doing this would allow all sorts of optimizations that we currently think of, and then say "that'd change overloading", and throw out, with very little impact on real-world code, which either doesn't use overloading, or would be happy if overloading were made faster by avoiding it where possible.

This came up in another thread. JIT compilation techniques combined with smaller, less generic ops would give us the opportunity to rewrite this as a more efficient structure.

I also proposed a "no magic" pragma/syntax that would allow the optimiser to assume that all variables were "normal", and that funky stuff wasn't going to occur during the scope of the pragma. And in such a block I would expect this block to be optimised away.

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @Leont

On Sat, Jul 10, 2010 at 4:17 PM, demerphq <demerphq@gmail.com> wrote:

> I also proposed a "no magic" pragma/syntax that would allow the optimiser to assume that all variables were "normal", and that funky stuff wasn't going to occur during the scope of the pragma. And in such a block I would expect this block to be optimised away.

The use of magic is too pervasive for that. Not only because many special variables use active magic ($!, $1, %ENV, %SIG, etc…) but also autovivification, m//g state, tainting and utf8 caching, $#array, pos(), lvalue substr(), scalar(keys) and a number of more obscure things.

I don't think it's workable.

Leon

p5pRT commented 14 years ago

From ben@morrow.me.uk

Quoth fawaka@gmail.com (Leon Timmermans):

> On Sat, Jul 10, 2010 at 4:17 PM, demerphq <demerphq@gmail.com> wrote:
>
> > I also proposed a "no magic" pragma/syntax that would allow the optimiser to assume that all variables were "normal", and that funky stuff wasn't going to occur during the scope of the pragma. And in such a block I would expect this block to be optimised away.
>
> The use of magic is too pervasive for that. Not only because many special variables use active magic ($!, $1, %ENV, %SIG, etc…) but also autovivification, m//g state, tainting and utf8 caching, $#array, pos(), lvalue substr(), scalar(keys) and a number of more obscure things.
>
> I don't think it's workable.

Most of those forms of magic don't have meaningful side-effects, though, at least on mg_get. (That is, they may have side-effects, but it doesn't matter if they aren't invoked. Tainting is the obvious exception.) This is what really matters from the pov of optimisation.

Ben

p5pRT commented 14 years ago

From @Leont

On Sat, Jul 10, 2010 at 9:16 PM, Ben Morrow <ben@morrow.me.uk> wrote:

> Quoth fawaka@gmail.com (Leon Timmermans):
>
> Most of those forms of magic don't have meaningful side-effects, though, at least on mg_get. (That is, they may have side-effects, but it doesn't matter if they aren't invoked. Tainting is the obvious exception.) This is what really matters from the pov of optimisation.

If you're only ignoring get magic and not set magic, I guess it's possible if you special-case special variables with get magic such as $!. @_ elements should probably also be exempt, or else passing a tied argument will break very confusingly. I'm not entirely sure what the consequences of ignoring substr, pos, $#array and autovivification get magic would be. I suspect they'll work in the common case, but will break in corner cases.

Leon

p5pRT commented 14 years ago

From @druud62

James Mastros wrote:

> I do wonder, sometimes, if we worry entirely too much about just when tie and overload calls are done. Would it break actual real-world code to not retrieve the value of $a (tie) or not boolify it (overload) when the value will be thrown away anyway? Clearly, we can't do this in a maintenance release, but perhaps we should add warnings that we are planning on doing it to 5.14.0? It seems to me that doing this would allow all sorts of optimizations that we currently think of, and then say "that'd change overloading", and throw out, with very little impact on real-world code, which either doesn't use overloading, or would be happy if overloading were made faster by avoiding it where possible.

  no tie;

  no overload;

or

  use optimize qw( :no_overload :no_tie );

?

-- Ruud

p5pRT commented 14 years ago

From ben@morrow.me.uk

Quoth rvtol+usenet@isolution.nl ("Dr.Ruud"):

> James Mastros wrote:
>
> > I do wonder, sometimes, if we worry entirely too much about just when tie and overload calls are done. Would it break actual real-world code to not retrieve the value of $a (tie) or not boolify it (overload) when the value will be thrown away anyway? Clearly, we can't do this in a maintenance release, but perhaps we should add warnings that we are planning on doing it to 5.14.0? It seems to me that doing this would allow all sorts of optimizations that we currently think of, and then say "that'd change overloading", and throw out, with very little impact on real-world code, which either doesn't use overloading, or would be happy if overloading were made faster by avoiding it where possible.
>
>     no tie;
>
>     no overload;
>
> or
>
>     use optimize qw( :no_overload :no_tie );

Obviously, this should be

  use less "magic";

:)

Ben

p5pRT commented 14 years ago

From james@mastros.biz

On 11 July 2010 00:25, Leon Timmermans <fawaka@gmail.com> wrote:

> On Sat, Jul 10, 2010 at 9:16 PM, Ben Morrow <ben@morrow.me.uk> wrote:
>
> > Quoth fawaka@gmail.com (Leon Timmermans):
> >
> > Most of those forms of magic don't have meaningful side-effects, though, at least on mg_get. (That is, they may have side-effects, but it doesn't matter if they aren't invoked. Tainting is the obvious exception.) This is what really matters from the pov of optimisation.
>
> If you're only ignoring get magic and not set magic, I guess it's possible if you special-case special variables with get magic such as $!. @_ elements should probably also be exempt, or else passing a tied argument will break very confusingly. I'm not entirely sure what the consequences of ignoring substr, pos, $#array and autovivification get magic would be. I suspect they'll work in the common case, but will break in corner cases.

I think what Nick was getting at, and certainly what I was getting at, was not that we should bypass get magic, but rather that, in marked blocks, it should be acceptable to not call get magic *when the output is not relevant*, and to cache the value of the get magic during that lexical scope, each execution. That is, we assume that values act like values, and not like hidden accessors. $! would still work, so long as we don't look at $! twice and expect it to change. That is:

    {
        use less 'magic'; # best name I've seen, we aren't not using it, just using less of it.
        if ($!) {
            die "Rock me, Amadeus: $!"
        }
    }

is OK. We invoke the get magic, once, and assume the value hasn't changed by the second time we want it, possibly, but it shouldn't anyway.

@_ elements is probably a matter of "don't do that, then". However, it's only a problem if the thing passed in has get magic that makes it not act like a normal value during the period the pragma is in effect.

  -=- James Mastros / theorbtwo

p5pRT commented 14 years ago

From @nwc10

On Sun, Jul 11, 2010 at 10:35:05AM +0100, James Mastros wrote:

> I think what Nick was getting at, and certainly what I was getting at, was not that we should bypass get magic, but rather that, in marked blocks, it should be acceptable to not call get magic *when the output is not relevant*, and to cache the value of the get magic during that lexical scope, each execution. That is, we assume that values act like values, and not like hidden accessors. $! would still work, so long as we don't look at $! twice and expect it to change. That is:

No, that's not what *I* was getting at. That's the entire thread that has gone sideways from what I originally reported. What *I* reported was that:

    $ ./perl -Ilib -MO=Deparse -e 'if ($a && "Pie" eq "Good") {print}'
    if ($a and !1) {
        print $_;
    }
    -e syntax OK

i.e. there is provably dead code still in the optree: the print statement.

(Which could be removed by the optimiser, without any change to any semantic of the language, i.e. it's 100% safe.)

Nicholas Clark

p5pRT commented 14 years ago

From @ikegami

On Sun, Jul 11, 2010 at 5:35 AM, James Mastros <james@mastros.biz> wrote:

> use less 'magic'; # best name I've seen, we aren't not using it, just using less of it.

We're not instructing Perl to be less magical, we're promising Perl we won't be using magic.

p5pRT commented 14 years ago

From @xdg

On Sat, Jul 10, 2010 at 7:36 AM, James Mastros <james@mastros.biz> wrote:

> I do wonder, sometimes, if we worry entirely too much about just when tie and overload calls are done. Would it break actual real-world code to not retrieve the value of $a (tie) or not boolify it (overload) when the value will be thrown away anyway?

I think explicitly clarifying that side effects (like magic) will not happen if the compiler optimizes a block or expression is a good idea.

Then maybe we could use the less pragma when that isn't desired. ( e.g. "use less 'optimization'" )

-- David

p5pRT commented 14 years ago

From @ikegami

On Sun, Jul 11, 2010 at 3:40 PM, David Golden <xdaveg@gmail.com> wrote:

> Then maybe we could use the less pragma when that isn't desired. ( e.g. "use less 'optimization'" )

Sounds great to me, except there are backward compatibility issues to defaulting to aggressive optimisation.

p5pRT commented 14 years ago

From @jbenjore

On Sun, Jul 11, 2010 at 2:56 PM, Eric Brine <ikegami@adaelis.com> wrote:

> On Sun, Jul 11, 2010 at 3:40 PM, David Golden <xdaveg@gmail.com> wrote:
>
> > Then maybe we could use the less pragma when that isn't desired. ( e.g. "use less 'optimization'" )
>
> Sounds great to me, except there are backward compatibility issues to defaulting to aggressive optimisation.

Sounds like you want some instrumentation for where your magic is actually happening at.

Josh

p5pRT commented 14 years ago

From @xdg

On Sun, Jul 11, 2010 at 5:58 PM, Joshua ben Jore <twists@gmail.com> wrote:

> On Sun, Jul 11, 2010 at 2:56 PM, Eric Brine <ikegami@adaelis.com> wrote:
>
> > On Sun, Jul 11, 2010 at 3:40 PM, David Golden <xdaveg@gmail.com> wrote:
> >
> > > Then maybe we could use the less pragma when that isn't desired. ( e.g. "use less 'optimization'" )
> >
> > Sounds great to me, except there are backward compatibility issues to defaulting to aggressive optimisation.
>
> Sounds like you want some instrumentation for where your magic is actually happening at.

(Insert xdg rant on backwards compatibility)

I'm suggesting that we disclaim any implicit guarantee that the compiler won't optimize away expressions that have side effects when evaluated.

Whether any particular optimization is "worth it" is open for later debate, but at least the door would be open.

Notwithstanding that Nicholas points out that this example isn't what he was talking about, the question that has been raised is whether we could just short-circuit an *entire* logical operation if "static" analysis can determine whether it is true or false.

Effectively:

    if ($a && 0) { ... } # could be optimized away entirely

I don't know how much code *relies* on something like $a being tied and getting evaluated first in a logic operation. Likewise, I don't know how much code actually relies on something like C<< && 0 >>. The only thing that comes to mind is C<< && DEBUG >> where DEBUG is a constant.

  warn "..." if $something && DEBUG;
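As a rough illustration of why operand order matters here, the sketch below deparses two variants of the `DEBUG` idiom. The variable `$something`, the marker string and the two closures are made up for the example. When the constant is the left operand, Perl can fold the whole condition at compile time and drop the `warn`; when it is the right operand, `$something` still has to be evaluated first. What the first variant deparses to may differ between perl versions, so the script prints the text rather than assuming it.

```perl
use strict;
use warnings;
use B::Deparse;
use constant DEBUG => 0;

our $something = 1;   # illustrative variable, not from the thread

# Constant on the right: $something must still be evaluated first.
my $const_last  = sub { warn "trace-marker\n" if $something && DEBUG; return };

# Constant on the left: the whole condition is known at compile time.
my $const_first = sub { warn "trace-marker\n" if DEBUG && $something; return };

my $deparse    = B::Deparse->new;
my $text_last  = $deparse->coderef2text($const_last);
my $text_first = $deparse->coderef2text($const_first);

print "constant last:\n$text_last\n\n";
print "constant first:\n$text_first\n";
```

Writing `DEBUG && $something` is therefore a way to get the dead code removed today, without any change to the optimiser.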

-- David

p5pRT commented 14 years ago

From @ikegami

On Sun, Jul 11, 2010 at 10:22 PM, David Golden <xdaveg@gmail.com> wrote:

> I'm suggesting that we disclaim any implicit guarantee that the compiler won't optimize away expressions that have side effects when evaluated.

Without that guarantee,

    my $x = f()
        or DEBUG && warn(...);
    return $x;

would be buggy. Dunno if that matters.

p5pRT commented 14 years ago

From @xdg

On Mon, Jul 12, 2010 at 12:51 AM, Eric Brine <ikegami@adaelis.com> wrote:

> On Sun, Jul 11, 2010 at 10:22 PM, David Golden <xdaveg@gmail.com> wrote:
>
> > I'm suggesting that we disclaim any implicit guarantee that the compiler won't optimize away expressions that have side effects when evaluated.
>
> Without that guarantee,
>
>     my $x = f()
>         or DEBUG && warn(...);
>     return $x;
>
> would be buggy. Dunno if that matters

I said "... have side effects when evaluated ..." but your example has a side effect (function call) in an assignment before the logic expression. Maybe "evaluated" isn't the right term, but I was intending it to mean the action of reading a value from a variable.

David

p5pRT commented 14 years ago

From @demerphq

On 12 July 2010 13:07, David Golden <xdaveg@gmail.com> wrote:

> On Mon, Jul 12, 2010 at 12:51 AM, Eric Brine <ikegami@adaelis.com> wrote:
>
> > On Sun, Jul 11, 2010 at 10:22 PM, David Golden <xdaveg@gmail.com> wrote:
> >
> > > I'm suggesting that we disclaim any implicit guarantee that the compiler won't optimize away expressions that have side effects when evaluated.
> >
> > Without that guarantee,
> >
> >     my $x = f()
> >         or DEBUG && warn(...);
> >     return $x;
> >
> > would be buggy. Dunno if that matters
>
> I said "... have side effects when evaluated ..." but your example has a side effect (function call) in an assignment before the logic expression. Maybe "evaluated" isn't the right term, but I was intending it to mean the action of reading a value from a variable.

I think this is close to something I mentioned.

My thought was: given that

    $b=$a++ + $a++;

is not defined, that we could also assume that changing fetch magic inside of fetch magic would only take effect after that statement concluded, and thus

    $b=$a + $a;

where $a is tied/overloaded and the magic changes on invocation of the fetch is also undefined.

Thus we could check for magic at the beginning of the expression, and then cache it for the duration, although we would guarantee that the magic WAS called twice if there was magic.

But the problem with any of these changes is that it could/would break stuff somewhere. Which is why I figured some kind of compiler hint was in order, as it would mean that new code could be optimised relatively sanely and old code would continue on unbroken.

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @demerphq

On 12 July 2010 13:36, demerphq <demerphq@gmail.com> wrote:

> On 12 July 2010 13:07, David Golden <xdaveg@gmail.com> wrote:
>
> > On Mon, Jul 12, 2010 at 12:51 AM, Eric Brine <ikegami@adaelis.com> wrote:
> >
> > > On Sun, Jul 11, 2010 at 10:22 PM, David Golden <xdaveg@gmail.com> wrote:
> > >
> > > > I'm suggesting that we disclaim any implicit guarantee that the compiler won't optimize away expressions that have side effects when evaluated.
> > >
> > > Without that guarantee,
> > >
> > >     my $x = f()
> > >         or DEBUG && warn(...);
> > >     return $x;
> > >
> > > would be buggy. Dunno if that matters
> >
> > I said "... have side effects when evaluated ..." but your example has a side effect (function call) in an assignment before the logic expression. Maybe "evaluated" isn't the right term, but I was intending it to mean the action of reading a value from a variable.
>
> I think this is close to something I mentioned.
>
> My thought was: given that
>
>     $b=$a++ + $a++;
>
> is not defined, that we could also assume that changing fetch magic inside of fetch magic would only take effect after that statement concluded, and thus
>
>     $b=$a + $a;
>
> where $a is tied/overloaded and the magic changes on invocation of the fetch is also undefined.

Gah. That came out all wrong.

I mean: given that $b=$a++ + $a++; is undefined, that is, that an expression using a mutator on a variable mentioned twice is undefined, it seems to me that we can also consider a whole whack of fetch magic to also be undefined.

> Thus we could check for magic at the beginning of the expression, and then cache it for the duration, although we would guarantee that the magic WAS called twice if there was magic.
>
> But the problem with any of these changes is that it could/would break stuff somewhere. Which is why I figured some kind of compiler hint was in order, as it would mean that new code could be optimised relatively sanely and old code would continue on unbroken.

Cheers\, yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @rurban

2010/7/12 David Golden <xdaveg@gmail.com>:

> On Sun, Jul 11, 2010 at 5:58 PM, Joshua ben Jore <twists@gmail.com> wrote:
>
> > On Sun, Jul 11, 2010 at 2:56 PM, Eric Brine <ikegami@adaelis.com> wrote:
> >
> > > On Sun, Jul 11, 2010 at 3:40 PM, David Golden <xdaveg@gmail.com> wrote:
>
> I'm suggesting that we disclaim any implicit guarantee that the compiler won't optimize away expressions that have side effects when evaluated.
>
> Whether any particular optimization is "worth it" is open for later debate, but at least the door would be open.

That's not the point. Even with side effects from mg_get we can optimize this conditional to $a only.

    perl -MO=Concise,-exec -e'$a and "cmp" eq "cc"'
    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <#> gvsv[*a] s
    4  <|> and(other->5) vK/1
    5  <@> leave[1 ref] vKP/REFC

can be optimized to

    perl -MO=Concise,-exec -e'$a'
    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <#> gvsv[*a] s
    4  <@> leave[1 ref] vKP/REFC

gvsv is just checking magic and doing the side effect, and there would be no better op to cut through that.

So the question of whether we should assert for less magic is bogus, as gvsv is doing the needed run-time check super cheap. We could gain a little if we know about the lvalue context, to get rid of pp_hot:pp_gvsv

    if (PL_op->op_private & OPpLVAL_INTRO)
        PUSHs(save_scalar(cGVOP_gv));
    else
        PUSHs(GvSV(cGVOP_gv));
    RETURN;

> Notwithstanding that Nicholas points out that this example isn't what he was talking about, the question that has been raised is whether we could just short-circuit an *entire* logical operation if "static" analysis can determine whether it is true or false.
>
> Effectively:
>
>     if ($a && 0) { ... } # could be optimized away entirely

Not entirely. The pp_and and the {} block could be optimized away to gvsv $a, for the get magic.

    if ($a && 0) { ... } => $a

This would nullify a lot of ops if the {} block is large, where we would come to a state where copying this part of the optree to exec order and running the defragmented tree without null ops would actually be faster than nullifying it at compile-time and running the nullified original tree at run-time. This could be known in advance in the optimizer by some heuristic, when the cost of compile-time defragmentation is less than nullifying and skipping the null ops.

Bad that the optimizer module needs some core support. op_seq is gone. But I can try that in the XS first.

-- Reini

p5pRT commented 14 years ago

From @iabyn

On Mon, Jul 12, 2010 at 06:06:15PM +0200, Reini Urban wrote:

> gvsv is just checking magic and doing the side effect

Huh? gvsv doesn't call magic.

Also, just as a data point, note that pp_concat *explicitly* calls get magic twice on $a . $a:

    if (left == right)
        /* $r.$r: do magic twice: tied might return different 2nd time */
        SvGETMAGIC(right);
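A small script can show what that comment is guarding against. The `Ticker` class below is a hypothetical tie class invented for this sketch: every `FETCH` returns the next integer, so the number of fetches is visible in the result. How many times `FETCH` runs, and in which order, is exactly what is under discussion here and may vary between perl versions, so the script reports the count instead of assuming it.

```perl
use strict;
use warnings;

# Hypothetical tie class: every read returns the next integer.
package Ticker;

sub TIESCALAR { my ($class) = @_; my $n = 0; return bless \$n, $class }
sub FETCH     { my ($self) = @_; return ++$$self }
sub STORE     { }   # writes are ignored in this sketch

package main;

tie my $r, 'Ticker';

my $joined  = $r . $r;         # the "$r.$r" case from the pp_concat comment
my $fetches = ${ tied $r };    # the tie object is a reference to the counter

print "joined: $joined, FETCH calls: $fetches\n";
```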

-- Justice is when you get what you deserve. Law is when you get what you pay for.

p5pRT commented 14 years ago

From @rurban

2010/7/12 Dave Mitchell <davem@iabyn.com>:

> On Mon, Jul 12, 2010 at 06:06:15PM +0200, Reini Urban wrote:
>
> > gvsv is just checking magic and doing the side effect
>
> Huh? gvsv doesn't call magic.

Oops, right. Something like op_defined or op_ref would be the cheapest existing op then doing the SvGETMAGIC, or we would need a new op, probably named pp_sideeffect or pp_getmagic.

> Also, just as a data point, note that pp_concat *explicitly* calls get magic twice on $a . $a:
>
>     if (left == right)
>         /* $r.$r: do magic twice: tied might return different 2nd time */
>         SvGETMAGIC(right);

-- Reini

p5pRT commented 14 years ago

From @ikegami

On Mon, Jul 12, 2010 at 7:07 AM, David Golden <xdaveg@gmail.com> wrote:

> I said "... have side effects when evaluated ..." but your example has a side effect (function call) in an assignment before the logic expression.

I realise you knew it would break. What I was pointing out is that it breaks a common idiom. "or" and "and" are used for flow control all over my code. You're suggesting we can no longer count on the argument evaluation order of logical operators. It has long been documented that "the right expression is evaluated only if the left expression is false", but your suggestion is to evaluate the RHS first if it's constant. It would change the function of a fundamental operator.
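The idiom in question can be sketched as follows. Here `f` is a stand-in function with a side effect (it bumps a counter) and `fetch_value` is a hypothetical wrapper, neither taken from real code. Under the documented left-to-right rule, `f()` runs exactly once even though the right-hand side of `or` is a false constant.

```perl
use strict;
use warnings;
use constant DEBUG => 0;

my $calls = 0;

# Stand-in for a function with side effects; it always returns false here.
sub f { $calls++; return 0 }

sub fetch_value {
    # The assignment and the call to f() happen before "or" looks at
    # its right-hand side, whatever DEBUG is.
    my $x = f()
        or DEBUG && warn("f() returned false\n");
    return $x;
}

my $result = fetch_value();
print "f() was called $calls time(s); result is $result\n";
# prints: f() was called 1 time(s); result is 0
```

An optimiser that discarded the whole `or` expression because its outcome is unused would also discard the call to `f()` and the assignment to `$x`.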

p5pRT commented 14 years ago

From @demerphq

On 12 July 2010 20:30, Dave Mitchell <davem@iabyn.com> wrote:

> On Mon, Jul 12, 2010 at 06:06:15PM +0200, Reini Urban wrote:
>
> > gvsv is just checking magic and doing the side effect
>
> Huh? gvsv doesn't call magic.
>
> Also, just as a data point, note that pp_concat *explicitly* calls get magic twice on $a . $a:
>
>     if (left == right)
>         /* $r.$r: do magic twice: tied might return different 2nd time */
>         SvGETMAGIC(right);

I'd like to argue that this was misguided. I don't think we guarantee any particular order in this case for the fetch calls and thus the statement is /still/ undefined even with this change.

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From ben@morrow.me.uk

Quoth demerphq@gmail.com (demerphq):

> On 12 July 2010 20:30, Dave Mitchell <davem@iabyn.com> wrote:
>
> > On Mon, Jul 12, 2010 at 06:06:15PM +0200, Reini Urban wrote:
> >
> > > gvsv is just checking magic and doing the side effect
> >
> > Huh? gvsv doesn't call magic.
> >
> > Also, just as a data point, note that pp_concat *explicitly* calls get magic twice on $a . $a:
> >
> >     if (left == right)
> >         /* $r.$r: do magic twice: tied might return different 2nd time */
> >         SvGETMAGIC(right);
>
> I'd like to argue that this was misguided. I don't think we guarantee any particular order in this case for the fetch calls and thus the statement is /still/ undefined even with this change.

Perl doesn't have undefined behaviour. No matter what weasel words copied from stdc made it into the ++ docs, Perl's actual evaluation order has always been straightforward and well-defined. Changing this may be worth it, for a sufficiently beneficial optimisation, but it is definitely a backwards-incompatible change.

Ben

p5pRT commented 14 years ago

From @demerphq

On 13 July 2010 13​:43\, Ben Morrow \ben@&#8203;morrow\.me\.uk wrote​:

Quoth demerphq@​gmail.com (demerphq)​:

On 12 July 2010 20​:30\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Mon\, Jul 12\, 2010 at 06​:06​:15PM +0200\, Reini Urban wrote​:

gvsv is just checking magic and doing the sideeffect

Huh? gvsv doesn't call magic.

Also\, just as a data point\, note that pp_concat *explicitly* calls get magic twice on $a . $a​:

       if (left == right)
           /* $r.$r: do magic twice: tied might return different 2nd time */
           SvGETMAGIC(right);

I'd like to argue that this was misguided. I don't think we guarantee any particular order in this case for the fetch calls, and thus the statement is /still/ undefined even with this change.

Perl doesn't have undefined behaviour. No matter what weasel words copied from stdc made it into the ++ docs\, Perl's actual evaluation order has always been straightforward and well-defined. Changing this may be worth it\, for a sufficiently beneficial optimisation\, but it is definitely a backwards-incompatible change.

Just so everyone can conveniently see​:

From perldoc perlop​:

  Auto-increment and Auto-decrement

  "++" and "--" work as in C. That is\, if placed before a variable\, they increment or decrement the variable by one before returning the value\, and if placed after\, increment or decrement after returning the value.

  $i = 0; $j = 0;
  print $i++; # prints 0
  print ++$j; # prints 1

  Note that just as in C, Perl doesn’t define when the variable is incremented or decremented. You just know it will be done sometime before or after the value is returned. This also means that modifying a variable twice in the same statement will lead to undefined behaviour. Avoid statements like:

  $i = $i ++;
  print ++ $i + $i ++;

  Perl will not guarantee what the result of the above statements is.

If we are going to say that these statements are well defined then we should probably document exactly what the rules are\, as well as correcting the above docs.

I'll just say that in this case I would much prefer we don't change the documentation, except to make this much more prominent in the Tie and Overload documentation. There is more benefit for more people if we can take advantage of the undefinedness than there is harm done to people doing naughty things like this despite the documentation (they are saved only by the lack of prominence of this documentation).

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @avar

On Tue, Jul 13, 2010 at 11:43, Ben Morrow <ben@morrow.me.uk> wrote:

Perl doesn't have undefined behaviour. No matter what weasel words copied from stdc made it into the ++ docs\, Perl's actual evaluation order has always been straightforward and well-defined. Changing this may be worth it\, for a sufficiently beneficial optimisation\, but it is definitely a backwards-incompatible change.

Undefined doesn't mean that the implementation doesn't act consistently\, just that its documentation explicitly denies responsibility for having those things work in the future. If they work now they only work incidentally\, and you shouldn't rely on them.

Of course we can't change things that are documented to be undefined as liberally as a C compiler would, because there's only one perl(1) but multiple cc(1)'s.

p5pRT commented 14 years ago

From @nwc10

On Tue\, Jul 13\, 2010 at 01​:11​:11PM +0000\, Ævar Arnfjörð Bjarmason wrote​:

On Tue\, Jul 13\, 2010 at 11​:43\, Ben Morrow \ben@&#8203;morrow\.me\.uk wrote​:

Perl doesn't have undefined behaviour. No matter what weasel words copied from stdc made it into the ++ docs\, Perl's actual evaluation order has always been straightforward and well-defined. Changing this may be worth it\, for a sufficiently beneficial optimisation\, but it is definitely a backwards-incompatible change.

Undefined doesn't mean that the implementation doesn't act consistently\, just that its documentation explicitly denies responsibility for having those things work in the future. If they work now they only work incidentally\, and you shouldn't rely on them.

http​://www.lysator.liu.se/c/c-faq/c-5.html#5-23

  Briefly​: implementation-defined means that an implementation must choose   some behavior and document it. Unspecified means that an implementation   should choose some behavior\, but need not document it. Undefined means   that absolutely anything might happen.

I suspect that all of our documentation should say "unspecified" rather than "undefined".

Of course we can't change things that are documented to be undefined as liberally as a C compiler would, because there's only one perl(1) but multiple cc(1)'s.

But whatever we call it\, that's the key problem. There is only one implementation\, and as that implementation strives hard to internally avoid C undefined behaviour\, its output will be deterministic\, in some fashion. Hence people come to rely on the current behaviour of the implementation\, documented or not.

Nicholas Clark

p5pRT commented 14 years ago

From @druud62

Reini Urban wrote​:

Even with sideeffects from mg_get we can optimize this conditional to $a only.

perl -MO=Concise\,-exec -e'$a and "cmp" eq "cc"'

Or to "$a; undef;".
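To see what the optimiser currently leaves behind for each form, one option is to deparse two anonymous subs from inside a script instead of using -MO=Deparse on the command line. This is only a minimal sketch: the variable names are illustrative, and the exact deparsed text differs between perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# Condition is a pure constant expression, so it is folded at compile time
# and the block can be dropped with it.
my $constant_only = $deparse->coderef2text(
    sub { if ("Pie" eq "Good") { print "never\n" } }
);

# Condition also reads $a, so the lookup of $a has to stay in the optree.
my $with_variable = $deparse->coderef2text(
    sub { if ($a and "Pie" eq "Good") { print "never\n" } }
);

print "constant only:\n$constant_only\n\n";
print "with variable:\n$with_variable\n";
```

Whether the print in the second sub survives is exactly what this ticket is about, so the script makes no assumption about it and simply prints what it finds.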

-- Ruud

p5pRT commented 14 years ago

From ben@morrow.me.uk

Quoth nick@​ccl4.org (Nicholas Clark)​:

On Tue, Jul 13, 2010 at 01:11:11PM +0000, Ævar Arnfjörð Bjarmason wrote:

Of course we can't change things that are documented to be undefined as liberally as a C compiler would, because there's only one perl(1) but multiple cc(1)'s.

But whatever we call it\, that's the key problem. There is only one implementation\, and as that implementation strives hard to internally avoid C undefined behaviour\, its output will be deterministic\, in some fashion. Hence people come to rely on the current behaviour of the implementation\, documented or not.

Quite. And

  print $i++\, $i++;

has DWIM forever (probably since perl 1). I'm not saying we *cannot* change it\, just that any change needs to be either only within the scope of a lexical pragma or to go through a full deprecation cycle with mandatory warnings before it changes.

Ben

p5pRT commented 14 years ago

From @xdg

On Tue, Jul 13, 2010 at 1:55 PM, Ben Morrow <ben@morrow.me.uk> wrote:

has DWIM forever (probably since perl 1). I'm not saying we *cannot* change it\, just that any change needs to be either only within the scope of a lexical pragma or to go through a full deprecation cycle with mandatory warnings before it changes.

I disagree.

For anything which is *documented* as "undefined" (even when we mean "unspecified") we should feel free to change whenever we think the benefits outweigh the costs\, without any recourse to a deprecation cycle.

For anything which is *undocumented* (but that people have come to rely on), we should not change without a deprecation cycle (short of a security vulnerability, anyway).

-- David

p5pRT commented 14 years ago

From @ikegami

On Tue, Jul 13, 2010 at 1:55 PM, Ben Morrow <ben@morrow.me.uk> wrote:

Quite. And

print $i++\, $i++;

has DWIM forever (probably since perl 1).

Bad example. The operand evaluation order for the comma operator is not undefined. It's documented that the arguments are evaluated from left to right (allowing the comma to be used as a "light" semicolon). A better example would be

print $i-- + $i++;

The operand evaluation order of addition is not documented\, so it could return 5 or 7 for $i=3. But it always returns 5.
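For anyone who wants to check this on their own build, here is a minimal sketch (the helper name `sum_with` is made up for illustration). It only reports what the current implementation does; it is not something perlop promises.

```perl
use strict;
use warnings;

# Hypothetical helper: evaluate $i-- + $i++ starting from a given value.
sub sum_with {
    my ($i) = @_;
    return $i-- + $i++;
}

# Left operand first would give 3 + 2; right operand first would give 4 + 3.
my $result = sum_with(3);
print "$result\n";
```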

p5pRT commented 14 years ago

From @demerphq

On 14 July 2010 00:24, Eric Brine <ikegami@adaelis.com> wrote:

On Tue\, Jul 13\, 2010 at 1​:55 PM\, Ben Morrow \ben@&#8203;morrow\.me\.uk wrote​:

Quite. And

   print $i++\, $i++;

has DWIM forever (probably since perl 1).

Bad example. The operand evaluation order for the comma operator is not undefined. It's documented that the arguments are evaluated from left to right (allowing the comma to be used as a "light" semicolon). A better example would be

print $i-- + $i++;

The operand evaluation order of addition is not documented\, so it could return 5 or 7 for $i=3. But it always returns 5.

I concur. Certain operators make explicit guarantees about order of evaluation, and as such we cannot change them. Most operators do not: for instance none of the mathematical or comparison operators, nor concatenation. Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @demerphq

On 13 July 2010 20:26, David Golden <xdaveg@gmail.com> wrote:

On Tue\, Jul 13\, 2010 at 1​:55 PM\, Ben Morrow \ben@&#8203;morrow\.me\.uk wrote​:

has DWIM forever (probably since perl 1). I'm not saying we *cannot* change it\, just that any change needs to be either only within the scope of a lexical pragma or to go through a full deprecation cycle with mandatory warnings before it changes.

I disagree.

For anything which is *documented* as "undefined" (even when we mean "unspecified") we should feel free to change whenever we think the benefits outweigh the costs\, without any recourse to a deprecation cycle.

I agree with this. However we would have to document that it might have changed and caused back-compat issues.

For anything which is *undocumented* (but that people have come to rely on), we should not change without a deprecation cycle (short of a security vulnerability, anyway).

Agree.

yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @iabyn

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

$a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c1bf85fd984f18f84ea834be52b168c812
Author:     Hugo van der Sanden <hv@crypt.org>
AuthorDate: Sat May 26 18:05:12 2001 +0100
Commit:     Jarkko Hietaniemi <jhi@iki.fi>
CommitDate: Sat May 26 22:31:46 2001 +0000

    Re: 5.6.*, bleadperl: bugs in pp_concat
    Message-Id: <200105261605.RAA12295@crypt.compulink.co.uk>

-- The Enterprise is involved in a bizarre time-warp experience which is in some way unconnected with the Late 20th Century.   -- Things That Never Happen in "Star Trek" #14

p5pRT commented 14 years ago

From @demerphq

On 15 July 2010 01:12, Dave Mitchell <davem@iabyn.com> wrote:

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c1bf85fd984f18f84ea834be52b168c812
Author:     Hugo van der Sanden <hv@crypt.org>
AuthorDate: Sat May 26 18:05:12 2001 +0100
Commit:     Jarkko Hietaniemi <jhi@iki.fi>
CommitDate: Sat May 26 22:31:46 2001 +0000

    Re: 5.6.*, bleadperl: bugs in pp_concat
    Message-Id: <200105261605.RAA12295@crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a patch that somehow proves something? I don't think so. Just because a committer didn't think through the full ramifications of a patch, or even knew of the ramifications but still went through with it on the grounds of providing "least worst" behaviour, doesn't make that patch law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well specified\, then it would mean that one can take a construct documented to have unspecified behaviour wrap it up in a tie to resolve the unspecifiedness\, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified\, as the docs say it is\, the onus instead is on those who disagree with the documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted doesn't essentially boil down to a position that the docs are meaningless and that whatever is committed is right. If so then you might as well stop fixing those "bugs" as they aren't really "bugs" then are they? I'm pretty sure you don't think this\, so why do you think that this patch is different?

cheers, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @iabyn

On Thu\, Jul 15\, 2010 at 10​:31​:51AM +0200\, demerphq wrote​:

On 15 July 2010 01:12, Dave Mitchell <davem@iabyn.com> wrote:

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c1bf85fd984f18f84ea834be52b168c812
Author:     Hugo van der Sanden <hv@crypt.org>
AuthorDate: Sat May 26 18:05:12 2001 +0100
Commit:     Jarkko Hietaniemi <jhi@iki.fi>
CommitDate: Sat May 26 22:31:46 2001 +0000

    Re: 5.6.*, bleadperl: bugs in pp_concat
    Message-Id: <200105261605.RAA12295@crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a patch that somehow proves something?

I wasn't making a point\, I was just providing information.

I don't think so. Just because a committer didn't think through the full ramifications of a patch, or even knew of the ramifications but still went through with it on the grounds of providing "least worst" behaviour, doesn't make that patch law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well specified\, then it would mean that one can take a construct documented to have unspecified behaviour wrap it up in a tie to resolve the unspecifiedness\, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified\, as the docs say it is\, the onus instead is on those who disagree with the documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted doesn't essentially boil down to a position that the docs are meaningless and that whatever is committed is right. If so then you might as well stop fixing those "bugs" as they aren't really "bugs" then are they? I'm pretty sure you don't think this\, so why do you think that this patch is different?

Just to make it clear, I didn't post that patch to prove a point one way or another; I just dug it up as a point of info so that people could, if interested, examine it, look at the reasoning behind it (e.g. the p5p discussion if any), and draw whatever conclusions they want. For the record, I haven't read the original 2001 p5p thread, and haven't drawn any conclusions.

However\, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of correctness. The first is the order in which the two $a's in $a.$a are evaluated; the second is how many times $a is evaluated. It is quite possible for the order not to be defined, but still for the fact that $a is evaluated twice to be defined. For example, someone might be using tie to instrument the number of accesses to a variable. Having said that, tied hash elements only have mg_get called once on them until reset by a mg_set, and I recently extended that mechanism to tied arrays too.

On the other hand, it may not be documented or specified, but I think most people would expect that in the following, f() is called before g():

  f() . g()
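That expectation about call order is easy to check on a given perl. A minimal sketch, with `f` and `g` as stand-in subs that record when they run:

```perl
use strict;
use warnings;

my @calls;    # records the order in which the subs actually run

sub f { push @calls, 'f'; return 'F' }
sub g { push @calls, 'g'; return 'G' }

my $joined = f() . g();
print "$joined (call order: @calls)\n";
```

This shows what the implementation does today. It says nothing about whether the order is guaranteed, which is the point under discussion.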

Finally\, my feeling is that any 'no magic;' scopes aren't really viable in terms of providing enough guarantees of side-effects for aggressive optimisation while still providing perly behaviour.

-- There's a traditional definition of a shyster​: a lawyer who\, when the law is against him\, pounds on the facts; when the facts are against him\, pounds on the law; and when both the facts and the law are against him\, pounds on the table.   -- Eben Moglen referring to SCO

p5pRT commented 14 years ago

From @rurban

2010/7/15 demerphq <demerphq@gmail.com>:

On 15 July 2010 01​:12\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

I'm also with Yves here. Hugo destroyed language semantics with 5.6.2\, if I read the pp_hot.c patch right.

$a . $a must evaluate $a twice\, and not only once\, even if it saves coding lines for the construct my $b = $a; $b . $b which should be done if you want to evaluate $a only once. We don't want to surprise users in favor of having to write less code.

In our case we need to either:

1. document this special evaluation rule for the pp_concat op ("if both sides of . refer to the same tied variable, the tied access is only done once, contrary to the obvious"), or, preferably,
2. revert the pp_concat patch from Hugo, so that $a.$a evaluates mg_get twice again for $a.

This is what you would expect from reading $a . $a. If you want to evaluate it once, do it once. Typical semantics would be { my $b = $a; $b . $b }

But we certainly need a testcase for this mg_get sideeffect.

commit 8d6d96c1bf85fd984f18f84ea834be52b168c812
Author:     Hugo van der Sanden <hv@crypt.org>
AuthorDate: Sat May 26 18:05:12 2001 +0100
Commit:     Jarkko Hietaniemi <jhi@iki.fi>
CommitDate: Sat May 26 22:31:46 2001 +0000

    Re: 5.6.*, bleadperl: bugs in pp_concat
    Message-Id: <200105261605.RAA12295@crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a patch that somehow proves something? I don't think so. Just because a committer didn't think through the full ramifications of a patch, or even knew of the ramifications but still went through with it on the grounds of providing "least worst" behaviour, doesn't make that patch law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well specified\, then it would mean that one can take a construct documented to have unspecified behaviour wrap it up in a tie to resolve the unspecifiedness\, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified\, as the docs say it is\, the onus instead is on those who disagree with the documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted doesn't essentially boil down to a position that the docs are meaningless and that whatever is committed is right. If so then you might as well stop fixing those "bugs" as they aren't really "bugs" then are they? I'm pretty sure you don't think this, so why do you think that this patch is different?

-- Reini Urban http://phpwiki.org/ http://murbreak.at/

p5pRT commented 14 years ago

From @iabyn

On Thu\, Jul 15\, 2010 at 01​:28​:58PM +0200\, Reini Urban wrote​:

2010/7/15 demerphq \demerphq@&#8203;gmail\.com​:

On 15 July 2010 01​:12\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

I'm also with Yves here. Hugo destroyed language semantics with 5.6.2\, if I read the pp_hot.c patch right.

No I think you're reading it the wrong way. Hugo's patch ensures that $a is evaluated *twice* in $a.$a.

-- Standards (n). Battle insignia or tribal totems.

p5pRT commented 14 years ago

From @rurban

2010/7/15 Dave Mitchell <davem@iabyn.com>:

On Thu\, Jul 15\, 2010 at 01​:28​:58PM +0200\, Reini Urban wrote​:

2010/7/15 demerphq \demerphq@&#8203;gmail\.com​:

On 15 July 2010 01​:12\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

I'm also with Yves here. Hugo destroyed language semantics with 5.6.2\, if I read the pp_hot.c patch right.

No I think you're reading it the wrong way. Hugo's patch ensures that $a is evaluated *twice* in $a.$a.

Great! I see it now in the two gmagic tests. So I'm not with Yves anymore and nothing needs to be done there.

I still have no time to finish my simple optimizer rule for Nick's original report, so I attach my latest approach. Maybe someone else might want to finish and test it. This weekend I'm away.

-- Reini

p5pRT commented 14 years ago

From @rurban

#! perl

=head1 DESCRIPTION

Optimize (X and NO) to null if X involves no gvsv/padsv; otherwise replace it with (dor $x) or something else that does the SvGETMAGIC.
(and NO) is always false, but every SV evaluated before it must still have its mg_get called.

=head1 EXAMPLE1 gvsv

    $ perl -MO=Concise,-exec -e'if ($a and "x" eq "y") { print $s;}'
    1  <0> enter
    2  <;> nextstate(main 3 -e:1) v:{
    3  <$> gvsv(*a) s
    4  <|> and(other->5) sK/1
    5      <$> const(SPECIAL sv_no) s
    6  <|> and(other->7) vK/1
    7      <0> pushmark s
    8      <$> gvsv(*s) s
    9      <@> print vK
    a  <@> leave[1 ref] vKP/REFC

can be optimized to

    1  <0> enter
    2  <;> nextstate(main 3 -e:1) v:{
    3  <$> gvsv(*a) s
    4  <1> dor vK/1
    a  <@> leave[1 ref] vKP/REFC

=head1 EXAMPLE2 padsv

    $ perl -MO=Concise,-exec -e'my $a; if ($a and "x" eq "y") { print $s;}'

    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <0> padsv[$a:1,4] vM/LVINTRO
...
    4  <;> nextstate(main 4 -e:1) v:{
    5  <0> padsv[$a:1,4] s
    6  <|> and(other->7) sK/1
    7      <$> const[SPECIAL sv_no] s
    8  <|> and(other->9) vK/1
    9      <0> pushmark s
    a      <#> gvsv[*s] s
    b      <@> print vK
    c  <@> leave[1 ref] vKP/REFC

can be optimized to

    1  <0> enter
    2  <;> nextstate(main 1 -e:1) v:{
    3  <0> padsv[$a:1,3] vM/LVINTRO
...
    4  <;> nextstate(main 2 -e:1) v:{
    5  <$> padsv[$a:1,3] s
    6  <1> dor vK/1
    7  <@> leave[1 ref] vKP/REFC

=head1 EXAMPLE3 ok

    $ perl -MO=Concise,-exec -e'if ("x" eq "y" and $a) { print $s;}'

is already optimized to

    1  <0> enter
    2  <;> nextstate(main 3 -e:1) v:{
    3  <@> leave[1 ref] vKP/REFC

=cut

use optimizer;
use B::Generate;

use optimizer callback => sub {
  my $o = shift;
  # ${$op} is the op's address; it is 0 (false) when there is no such op
  if (($o->name eq 'gvsv' or $o->name eq 'padsv')
      and ${$o->next} and $o->next->name eq 'and'
      and ${$o->next->next} and $o->next->next->name eq 'const'
      and $o->next->next->sv == B::sv_no
     )
  {
    # change o->next to dor and nullify the rest
  }
};

p5pRT commented 14 years ago

From @demerphq

On 15 July 2010 13:24, Dave Mitchell <davem@iabyn.com> wrote:

On Thu\, Jul 15\, 2010 at 10​:31​:51AM +0200\, demerphq wrote​:

On 15 July 2010 01​:12\, Dave Mitchell \davem@&#8203;iabyn\.com wrote​:

On Wed\, Jul 14\, 2010 at 09​:39​:35AM +0200\, demerphq wrote​:

Whomever decided that

  $a . $a

is specified when $a is tied and returns a different value each fetch had forgotten this fact.

You'll have to argue that with Hugo then!

commit 8d6d96c1bf85fd984f18f84ea834be52b168c812
Author:     Hugo van der Sanden <hv@crypt.org>
AuthorDate: Sat May 26 18:05:12 2001 +0100
Commit:     Jarkko Hietaniemi <jhi@iki.fi>
CommitDate: Sat May 26 22:31:46 2001 +0000

    Re: 5.6.*, bleadperl: bugs in pp_concat
    Message-Id: <200105261605.RAA12295@crypt.compulink.co.uk>

I'm not sure what your point is? Simply because Hugo wrote/pushed a patch that somehow proves something?

I wasn't making a point\, I was just providing information.

I don't think so. Just because a committer didn't think through the full ramifications of a patch, or even knew of the ramifications but still went through with it on the grounds of providing "least worst" behaviour, doesn't make that patch law over long existing documentation.

The documentation for ++ is pretty clear.

If the concatenation of a tied variable that mutates is well specified\, then it would mean that one can take a construct documented to have unspecified behaviour wrap it up in a tie to resolve the unspecifiedness\, which seems to me to be simply absurd.

Thus the onus is not on me to show why this is unspecified\, as the docs say it is\, the onus instead is on those who disagree with the documentation to find a way to get out of this logical absurdity.

I have to say that I'm struggling to see why what you just posted doesn't essentially boil down to a position that the docs are meaningless and that whatever is committed is right. If so then you might as well stop fixing those "bugs" as they aren't really "bugs" then are they? I'm pretty sure you don't think this\, so why do you think that this patch is different?

Just to make it clear, I didn't post that patch to prove a point one way or another; I just dug it up as a point of info so that people could, if interested, examine it, look at the reasoning behind it (e.g. the p5p discussion if any), and draw whatever conclusions they want. For the record, I haven't read the original 2001 p5p thread, and haven't drawn any conclusions.

My apologies for making the incorrect inference. Clarification understood.

However\, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of correctness. The first is the order in which the two $a's in $a.$a are evaluated; the second is how many times $a is evaluated. It is quite possible for the order not to be defined, but still for the fact that $a is evaluated twice to be defined.

Yes, agreed. I don't have any issue with calling tie twice, and would expect that it happens. I just wouldn't expect it to happen in a particular order, nor that that patch makes the expression /defined/.

For example\, someone might be using tie to instrument the number of accesses to a variable. Having said that\, tied hash elements only have mg_get called once on them until reset by a mg_set\, and I recently extended that mechanism to tied arrays too.

What does this mean exactly?

On the other hand, it may not be documented or specified, but I think most people would expect that in the following, f() is called before g():

  f() . g()

Hmm. I don't know that I would. If we want this to be the case then IMO we should document it.

Finally\, my feeling is that any 'no magic;' scopes aren't really viable in terms of providing enough guarantees of side-effects for aggressive optimisation while still providing perly behaviour.

Ok\, thanks. What is the main problem as you see it?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 14 years ago

From @hvds

Dave Mitchell <davem@iabyn.com> wrote:

However, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of correctness. The first is the order in which the two $a's in $a.$a are evaluated; the second is how many times $a is evaluated. It is quite possible for the order not to be defined, but still for the fact that $a is evaluated twice to be defined. For example, someone might be using tie to instrument the number of accesses to a variable.

This agrees with my thinking - I do not care a jot about the order of evaluation for this case\, but I would be unhappy about any change to the number of times magic is invoked unless there were first strong evidence presented that substantial improvements (to speed or something else) would justify the change.

Hugo

p5pRT commented 14 years ago

From @jandubois

On Fri\, 16 Jul 2010\, hv@​crypt.org wrote​:

Dave Mitchell <davem@iabyn.com> wrote:

However, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of correctness. The first is the order in which the two $a's in $a.$a are evaluated; the second is how many times $a is evaluated. It is quite possible for the order not to be defined, but still for the fact that $a is evaluated twice to be defined. For example, someone might be using tie to instrument the number of accesses to a variable.

This agrees with my thinking - I do not care a jot about the order of evaluation for this case\, but I would be unhappy about any change to the number of times magic is invoked unless there were first strong evidence presented that substantial improvements (to speed or something else) would justify the change.

Could you explain _why_ you would care about invoking magic twice\, but don't care about the order of evaluation?

And could you also explain why it makes sense that $a.$a has to invoke magic twice\, while $a x 2 will only call it once?

Cheers\, -Jan

p5pRT commented 14 years ago

From @nwc10

On Tue\, Jul 20\, 2010 at 01​:12​:29PM -0700\, Jan Dubois wrote​:

On Fri\, 16 Jul 2010\, hv@​crypt.org wrote​:

Dave Mitchell <davem@iabyn.com> wrote:

However, for my opinions for the topic in hand...

as regards tiedness, there are actually two orthogonal issues of correctness. The first is the order in which the two $a's in $a.$a are evaluated; the second is how many times $a is evaluated. It is quite possible for the order not to be defined, but still for the fact that $a is evaluated twice to be defined. For example, someone might be using tie to instrument the number of accesses to a variable.

This agrees with my thinking - I do not care a jot about the order of evaluation for this case\, but I would be unhappy about any change to the number of times magic is invoked unless there were first strong evidence presented that substantial improvements (to speed or something else) would justify the change.

Could you explain _why_ you would care about invoking magic twice\, but don't care about the order of evaluation?

And could you also explain why it makes sense that $a.$a has to invoke magic twice\, while $a x 2 will only call it once?

On this part\, I believe that I agree with Hugo\, because my answer is​:

I read $a . $a as equivalent to $x . $y\, where it happens that $x and $y alias the same value. $a was *written* twice by the programmer\, so as there are two references to it\, it gets accessed *exactly* twice.

Whereas $a x 2 has $a *written* once by the programmer\, so as there is only one reference to it\, it gets accessed *exactly* once.

Basically\, I view tie as active data\, with an implied contract that it will be called once for each semantic read\, and that this should be honoured.

Hence I don't view $a . $a and $a x 2 as identical and interchangeable - if the programmer wanted the other\, he/she should have written the other. Yes\, this means that the compiler can't perform strength reduction or other optimisations in the general case. But I'm thinking of this from a perspective of "hooks exist to intercept the actions of the runtime" therefore the compiler isn't *allowed* to consider that transformations that are semantically valid for passive data are generally valid\, because Perl *allows* active data.

(Overload\, on the other hand\, I view as should-be-idempotent. I see its role as different. overload is expression of values. tie is a system to implement side effects)
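The "one FETCH per written read" contract can be observed with a counting tie. A minimal sketch, assuming a perl recent enough for package blocks (5.14 or later); the class name `CountingScalar` and the helper `fetches_during` are made up for illustration:

```perl
use strict;
use warnings;

my $fetches = 0;    # running total of FETCH calls

# Hypothetical tie class: a plain scalar that counts every read.
package CountingScalar {
    sub TIESCALAR { my ($class, $value) = @_; return bless \$value, $class }
    sub FETCH     { my ($self) = @_; $fetches++; return $$self }
    sub STORE     { my ($self, $value) = @_; $$self = $value; return }
}

tie my $t, 'CountingScalar', 'ab';

# Run a block and return how many FETCH calls it caused.
sub fetches_during {
    my ($code) = @_;
    my $before = $fetches;
    $code->();
    return $fetches - $before;
}

my $repeat_count = fetches_during(sub { my $s = $t x 2; return });
my $concat_count = fetches_during(sub { my $s = $t . $t; return });

print "\$t x 2  : $repeat_count FETCH call(s)\n";
print "\$t . \$t : $concat_count FETCH call(s)\n";
```

This thread reports one FETCH for the repetition and two for the concatenation. The script just prints whatever the perl it runs on actually does.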

Nicholas Clark

p5pRT commented 14 years ago

From @jandubois

On Tue\, 20 Jul 2010\, Nicholas Clark wrote​:

I read $a . $a as equivalent to $x . $y\, where it happens that $x and $y alias the same value. $a was *written* twice by the programmer\, so as there are two references to it\, it gets accessed *exactly* twice.

In that case I think you'll find plenty of places where it is accessed more often than you expect. E.g. ++$a might access $a once if it is just SVp_IOK, or twice if it is just SVp_NOK, because sv_inc() will first try to see if it can't convert the NV to an IV, triggering an additional FETCH call:

  flags = SvFLAGS(sv);
  if ((flags & (SVp_NOK|SVp_IOK)) == SVp_NOK) {
      /* It's (privately or publicly) a float, but not tested as an
         integer, so test it to see. */
      (void) SvIV(sv);
      flags = SvFLAGS(sv);
  }

It is easy to guarantee that each tied variable is fetched at least once for each time it is mentioned in the source code, but it is extremely hard to guarantee that it isn't called more often: Any innocent looking SvIV(), SvNV() or SvPV() call anywhere in the core may trigger an additional call to FETCH a tied variable.

I prefer to view this as an inefficiency, not as a bug, because I think FETCH should be side-effect free.

Cheers, -Jan

p5pRT commented 14 years ago

From @jandubois

Eirik Berg Hanssen wrote:

On Tue, Jul 20, 2010 at 10:12 PM, Jan Dubois <jand@activestate.com> wrote:

And could you also explain why it makes sense that $a.$a has to invoke magic twice, while $a x 2 will only call it once?

  For the same reason that f().f() calls &f twice, while f() x 2 will only call it once?

Yes, but f() may have side-effects, whereas $a shouldn't have any (IMO).

As I wrote earlier to Nicholas, FETCH may be called more than you expect anyways, which already implicitly forbids side effects (unless you consider calling it more than strictly needed a bug):


  sub foo::TIESCALAR { bless \my $x => "foo" }
  sub foo::FETCH { print "FETCH\n"; ${$_[0]} }
  sub foo::STORE { print "STORE\n"; ${$_[0]} = $_[1] }

  tie $a, "foo";

  print "NV\n";
  $a = 1.;
  print "Inc\n";
  ++$a;
  print "IV\n";
  $a = 1;
  print "Inc\n";
  ++$a;


  NV
  STORE
  Inc
  FETCH
  FETCH
  STORE
  IV
  STORE
  Inc
  FETCH
  STORE


So FETCH is called twice when $a is a floating point number and only once when it is an integer.

But even assuming there is a canonical number of times FETCH should be called, what is that number for "$b = $a++;"? Should it be 2, because the expression is just a shorthand for "$b = $a; $a = $a + 1;"? Or is it fine to allow Perl to optimize this into a single access?

perlop.pod implies that there should be 2 accesses:

  "++" and "--" work as in C. That is, if placed before a variable, they increment or decrement the variable by one before returning the value, and if placed after, increment or decrement after returning the value.

There is nothing there stating that the returned value can be re-used to increment the variable, so by your logic it will have to be fetched again. Which is not how it is currently implemented (reasonably again, IMO).

Cheers, -Jan

p5pRT commented 14 years ago

From ebhanssen@cpan.org

On Tue, Jul 20, 2010 at 10:12 PM, Jan Dubois <jand@activestate.com> wrote:

And could you also explain why it makes sense that $a.$a has to invoke magic twice, while $a x 2 will only call it once?

  For the same reason that f().f() calls &f twice, while f() x 2 will only call it once?
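
The function-call side of this analogy can be checked directly. A minimal sketch, assuming a trivial `f` and a hypothetical counter variable `$calls` (neither is from the thread):

```perl
use strict;
use warnings;

# Hypothetical counter, incremented on every call to f().
my $calls = 0;
sub f { $calls++; return 'ab' }

$calls = 0;
my $concat       = f() . f();   # f() is written twice
my $concat_calls = $calls;      # 2

$calls = 0;
my $repeat       = f() x 2;     # f() is written once
my $repeat_calls = $calls;      # 1

print "f().f()   => $concat after $concat_calls calls\n";
print "f() x 2   => $repeat after $repeat_calls calls\n";
```

Both expressions yield the same string, but the repetition operator evaluates its left operand only once, so `f` runs a single time there.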

Eirik