Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.92k stars 549 forks source link

Chained comparison evaluates tied scalar twice #17692

Closed briandfoy closed 4 years ago

briandfoy commented 4 years ago

Description

The perlop docs describe the current behavior, so this doesn't contradict anything.

A chained comparison in 5.31.10 appears to evaluate the middle condition twice for a tied scalar but doesn't warn about it. Since a programmer might not realize they are using a tied variable, they might see odd behavior.

Steps to Reproduce

use v5.31.10;

package UnstableScalar {
    use parent qw(Tie::Scalar);
    sub TIESCALAR { bless {}, $_[0] }
    sub FETCH {
        my $value = int rand 10;
        say "Fetching scalar with $value";
        return $value;
        }
    }

tie my $unstable, 'UnstableScalar';

if( 5 < $unstable < 9 ) {
    say "Found a value between 5 and 9";
    }

The comparison short circuits just fine, but if the first comparison is true, the tied scalar is evaluated again (so things such as Tie::Cycle will unexpectedly progress or skip values):

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 2

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 0

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 9
Fetching scalar with 2
Found a value between 5 and 9

$ perl5.31.10 ~/Desktop/test.pl
Fetching scalar with 6
Fetching scalar with 9

Expected behavior

I expect that the scalar would only call FETCH once and reuse the result in the second comparison. perlop says this may happen, but perhaps a warning would be useful here. Ideally, tied variables would be evaluated once.

Is there a chance that the internals could recognize this and use the tied object instead of the scalar itself? This works as expected:

    if( 5 < tied($unstable)->FETCH < 9 ) {
        say "Found a value between 5 and 9";
        }

Perl configuration

Summary of my perl5 (revision 5 version 31 subversion 10) configuration:

  Platform:
    osname=darwin
    osvers=19.3.0
    archname=darwin-2level
    uname='darwin otter.local 19.3.0 darwin kernel version 19.3.0: thu jan 9 20:58:23 pst 2020; root:xnu-6153.81.5~1release_x86_64 x86_64 '
    config_args='-des -Dprefix=/usr/local/perls/perl-5.31.10 -Dusedevel'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3'
    cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.17)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -mmacosx-version-min=10.15 -fstack-protector-strong -L/usr/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /usr/lib
    libs=-lpthread -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.15 -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_DEVEL
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
  Built under darwin
  Compiled at Mar 27 2020 01:45:56
  %ENV:
    PERL="/Users/brian/bin/perls/perl-latest"
    PERL5_PATH="/Users/brian/bin/perls"
    PERL6_PATH="/Users/brian/bin/perl6s:/Applications/Rakudo/bin:/Applications/Rakudo/share/perl6/site/bin"
    PERLDOTCOM_AUTHOR="brian d foy"
  @INC:
    /usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10/darwin-2level
    /usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10
    /usr/local/perls/perl-5.31.10/lib/5.31.10/darwin-2level
    /usr/local/perls/perl-5.31.10/lib/5.31.10
xsawyerx commented 4 years ago

I consider a single FETCH to be correct user behavior.

karenetheridge commented 4 years ago

should this be a 5.32 blocker?

jkeenan commented 4 years ago

should this be a 5.32 blocker?

This is already on the "Milestone 5.32.0" list -- which I take to be the 'blocker' list for 5.32.

toddr commented 4 years ago

Relevant mailing list thread: https://www.nntp.perl.org/group/perl.perl5.porters/2020/04/msg257322.html

xsawyerx commented 4 years ago

I'm going to revise my position. The semantics do dictate the FETCH be done twice. It's only correct. However, user expectation may fall short of that.

Instead, I am in favor of a documentation update as @davidnicol suggested.

briandfoy commented 4 years ago

David Nicol suggested for the tied case that you capture the variable:

my $x = ...
if( 1 < $x < 10 ) { ... }

But, that assumes you know that something is tied, and the point of tie is that you don't have to know that. It's a feature where it takes more time to explain the edge case than the common case, and historically that sort of wart has been a pain point for Perl adoption. To effectively guard against the edge case, you are now checking the input for every case, which means it's easier to not use this feature.

I'm not sure which semantics dictate what. The mailing list thread was all implementation details, and specifically not language details. The language can decide what it wants to do with a feature, although I take it the implementation is harder because this isn't tokenized differently and is actually just shoehorning it into old syntax.

khwilliamson commented 4 years ago

@xenu submitted a comment that didn't make it into this ticket. The operation of this new feature is not as clear-cut as people previously thought. Shouldn't we make it experimental in 5.32? Since I think all new features should be experimental, based on my previous school of hard knocks, I agree with him.

jkeenan commented 4 years ago

On 4/16/20 11:34 AM, Karl Williamson wrote:

@xenu https://github.com/xenu submitted a comment that didn't make it into this ticket. The operation of this new feature is not as clear-cut as people previously thought. Shouldn't we make it experimental in 5.32? Since I think all new features should be experimental, based on my previous school of hard knocks, I agree with him.

I concur with Karl here; make it experimental.

demerphq commented 4 years ago

It is documented that it does this.

Yves

On Thu, 9 Apr 2020 at 13:55, Sawyer X notifications@github.com wrote:

I'm going to revise my position. The semantics do dictate the FETCH be done twice. It's only correct. However, user expectation may fall short of that.

Instead, I am in favor of a documentation update as @davidnicol https://github.com/davidnicol suggested.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17692#issuecomment-611487948, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R7UYUHG7CXOBVFA3LDRLWZSLANCNFSM4L5U4YIQ .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq commented 4 years ago

This how it is documented to behave, why should this be a bug?

Yves

On Sat, 4 Apr 2020 at 10:31, brian d foy notifications@github.com wrote:

Description

A chained comparison in 5.31.10 appears to evaluate the middle condition twice for a tied scalar.

Steps to Reproduce

use v5.31.10;

package UnstableScalar { use parent qw(Tie::Scalar); sub TIESCALAR { bless {}, $_[0] } sub FETCH { my $value = int rand 10; say "Fetching scalar with $value"; return $value; } }

tie my $unstable, 'UnstableScalar'; say $unstable foreach ( 0 .. 2 );

say "Comparing----"; if( 5 < $unstable < 9 ) { say "Found a value between 5 and 9"; }

The comparison short circuits just fine, but if the first comparison is true, the tied scalar is evaluated again (so things such as Tie::Cycle will unexpectedly progress or skip values):

$ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 2

$ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 0

$ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 9 Fetching scalar with 2 Found a value between 5 and 9

$ perl5.31.10 ~/Desktop/test.pl Fetching scalar with 6 Fetching scalar with 9

Expected behavior

I expect that the scalar would only call FETCH once and reuse the result in the second comparison.

Perl configuration

Summary of my perl5 (revision 5 version 31 subversion 10) configuration:

Platform: osname=darwin osvers=19.3.0 archname=darwin-2level uname='darwin otter.local 19.3.0 darwin kernel version 19.3.0: thu jan 9 20:58:23 pst 2020; root:xnu-6153.81.5~1release_x86_64 x86_64 ' config_args='-des -Dprefix=/usr/local/perls/perl-5.31.10 -Dusedevel' hint=recommended useposix=true d_sigaction=define useithreads=undef usemultiplicity=undef use64bitint=define use64bitall=define uselongdouble=undef usemymalloc=n default_inc_excludes_dot=define bincompat5005=undef Compiler: cc='cc' ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -DPERL_USE_SAFE_PUTENV' optimize='-O3' cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.15 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include' ccversion='' gccversion='4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.17)' gccosandvers='' intsize=4 longsize=8 ptrsize=8 doublesize=8 byteorder=12345678 doublekind=3 d_longlong=define longlongsize=8 d_longdbl=define longdblsize=16 longdblkind=3 ivtype='long' ivsize=8 nvtype='double' nvsize=8 Off_t='off_t' lseeksize=8 alignbytes=8 prototype=define Linker and Libraries: ld='cc' ldflags =' -mmacosx-version-min=10.15 -fstack-protector-strong -L/usr/local/lib' libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /usr/lib libs=-lpthread -ldbm -ldl -lm -lutil -lc perllibs=-lpthread -ldl -lm -lutil -lc libc= so=dylib useshrplib=false libperl=libperl.a gnulibc_version='' Dynamic Linking: dlsrc=dl_dlopen.xs dlext=bundle d_dlsymun=undef ccdlflags=' ' cccdlflags=' ' lddlflags=' -mmacosx-version-min=10.15 -bundle -undefined dynamic_lookup -L/usr/local/lib -fstack-protector-strong'

Characteristics of this binary (from libperl): Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_COPY_ON_WRITE PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP PERL_OP_PARENT PERL_PRESERVE_IVUV PERL_USE_DEVEL PERL_USE_SAFE_PUTENV USE_64_BIT_ALL USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_LOCALE_TIME USE_PERLIO USE_PERL_ATOF Built under darwin Compiled at Mar 27 2020 01:45:56 %ENV: PERL="/Users/brian/bin/perls/perl-latest" PERL5_PATH="/Users/brian/bin/perls" PERL6_PATH="/Users/brian/bin/perl6s:/Applications/Rakudo/bin:/Applications/Rakudo/share/perl6/site/bin" PERLDOTCOM_AUTHOR="brian d foy" @INC: /usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10/darwin-2level /usr/local/perls/perl-5.31.10/lib/site_perl/5.31.10 /usr/local/perls/perl-5.31.10/lib/5.31.10/darwin-2level /usr/local/perls/perl-5.31.10/lib/5.31.10

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17692, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R3YT56CGT33UJ4HXUDRK3V57ANCNFSM4L5U4YIQ .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

demerphq commented 4 years ago

The original documentation specified this.

Yves

On Thu, 9 Apr 2020 at 13:55, Sawyer X notifications@github.com wrote:

I'm going to revise my position. The semantics do dictate the FETCH be done twice. It's only correct. However, user expectation may fall short of that.

Instead, I am in favor of a documentation update as @davidnicol https://github.com/davidnicol suggested.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17692#issuecomment-611487948, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAAZ5R7UYUHG7CXOBVFA3LDRLWZSLANCNFSM4L5U4YIQ .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

khwilliamson commented 4 years ago

I learned in my $work career that when the person charged with creating user documentation of my creations was troubled by some detail of how it worked, that meant it failed the DWIM test. And I was almost certainly wrong. (Sometimes I worked with requirements, and sometimes was given free reign, but usually the requirements failed to address the level of details involved.) Usually I was wrong because I had had my head down in the coding details and could not see the forest for the trees. I learned that by treating the documenter/coder-designer relationship as a collaboration that a better product resulted. (That was somewhat unusual in that company, as the writers were paid a lot less and considered lower class citizens by many.)

I think we owe great deference to brian's opinions. He has proven himself adept at explaining perl to a wide audience. Certainly dismissing them is done at our peril. Hedging our bets by making the feature experimental for the first release is the only prudent course of action.

I now can't remember a case where the documenter was wrong and I was right. Perhaps there were some where further explanation convinced them that it needed to operate the way I had designed it. I of course do remember the times when I was right but was overruled by management, not involving the documenter. (For example, I remember feeling satisfaction when we got sued by a famous singer who missed getting to a performance because he expected the feature to work as I originally had implemented it. They never told me the result of that lawsuit)

toddr commented 4 years ago

I wonder if we could warn when tied variables are used in chained comparisons? If so I wonder how often that would be more prophetic than annoying? We'd only need to do that for a middle one.

Leont commented 4 years ago

TBH I really don't see the problem. 1 < $x < 2 calls fetch magic exactly as many times as 1 < $x && $x < 2. The number of times fetch is called is not dependent on the number of times that variable is mentioned (there are plenty of exceptions to that), it's dependent on the number of operations on it.

Both behaviors are reasonable, and one's expectations depend on wether you see chained comparison as one or two operations.

Leont commented 4 years ago

I wonder if we could warn when tied variables are used in chained comparisons? If so I wonder how often that would be more prophetic than annoying? We'd only need to do that for a middle one.

That doesn't sound like a good idea. 99% of tied fetches are idempotent so this wouldn't be a problem anyway.

Quite frankly, this whole discussion feels rather theoretical. The combination of circumstances to run into this seem incredibly unlikely. Ties are uncommon, non-idempotent ties are really uncommon and accidentally dealing with such a non-idempotent ties seems even unlikelier (given that passing it as an argument does exactly the assigning to a variable that solves this).

I don't think this is a real-world problem.

briandfoy commented 4 years ago

I started thinking about this because the confusion was one of the first reactions in Reddit to my post of the new feature. I didn't think of the tie at first because it's rare and uncommon. But, that's not the issue. It's about what people expect and what they get. None of these people care about what the implementor has to do to satisfy the expectation.

Here's roughly how regular people are going to react to this feature:

This is quite a number of curveballs for one feature. It would be conceptually much better if it did call subroutines or methods twice, but it doesn't because the implementation leaks through a little. A persons knowledge of the chaining logical operators interferes with their understanding of this chaining operator. That they aren't homologues and can't be doesn't matter to the person who's already read more than they intended..

The best feature would stop at the "Very cool!" step. In this case, there are two more conceptual levels the reader must journey through, and then an extra bonus confusion. Even if they never encounter a tie in their life, they know that the edges cases for this feature took longer to explain than the feature itself. They don't want to think about how rare or improbable this is, but if they read far enough (unlikely, which is a problem), they discover this feature isn't what they thought it was.

Documenting things (this or something else) tends to not to be helpful as we think it is. Virtually no one reads the docs (wouldn't that make life so much easier), and those who do tend to stop when they think they have the answer.

Leont commented 4 years ago

Oh, but if the subroutine or method returns a tied thingy or an lvalue, the value could be accessed twice.

That's not an or, but and and. You can't return a tied value from a sub, unless it's an lvalue sub.

Oh, it expands to two comparisons though—weird.

Why do you think this is weird?

briandfoy commented 4 years ago

Oh, it expands to two comparisons though—weird.

Why do you think this is weird?

It's not that I think this is weird. It's that I think most people won't think of it that way because that's not how they think of the concept. I've said about all I can say on the subject without repeating myself.

jkeenan commented 4 years ago

On 4/18/20 10:16 AM, brian d foy wrote:

I started thinking about this because the confusion was one of the first reactions in Reddit to my post of the new feature https://www.reddit.com/r/perl/comments/fu3auk/chain_comparisons_to_avoid_excessive_typing/fmb29k0/. I didn't think of the |tie| at first because it's rare and uncommon. But, that's not the issue. It's about what people expect and what they get. None of these people care about what the implementor has to do to satisfy the expectation.

Here's roughly how regular people are going to react to this feature:

  • Very cool!
  • Oh, it expands to two comparisons though—weird.
  • Oh, but it doesn't call subroutines or methods twice. Okay, cool again because that's the same as Python.
  • Oh, but if the subroutine or method returns a tied thingy or an lvalue, the value could be accessed twice.
  • Okay, so don't do that.

This is quite a number of curveballs for one feature. It would be conceptually much better if it did call subroutines or methods twice, but it doesn't because the implementation leaks through a little. A persons knowledge of the chaining logical operators interferes with their understanding of this chaining operator. That they aren't homologues and can't be doesn't matter to the person who's already read more than they intended..

The best feature would stop at the "Very cool!" step. In this case, there are two more conceptual levels the reader must journey through, and then an extra bonus confusion. Even if they never encounter a |tie| in their life, they know that the edges cases for this feature took longer to explain than the feature itself. They don't want to think about how rare or improbable this is, but if they read far enough (unlikely, which is a problem), they discover this feature isn't what they thought it was.

I'm not sure whether brian is arguing for not including this new functionality in perl-5.32.0, but I do think it reinforces the argument for introducing it as experimental.

Thank you very much. Jim Keenan

vpit commented 4 years ago

I think it should call FETCH twice since it should also call overloading twice (is that even implemented? I must admit I haven't followed this in detail). That said, making it experimental can't hurt.

davidnicol commented 4 years ago

imaginary complete starting over:

Constraints: chained comparisons must all be numeric or all be string. They also have to be in the same order, that is, composed of operators taken from one of these sets:

< <= ==

= == lt le eq gt ge eq

trying to mix comparators that are not all in the same set would be a parse-time syntax error.

When the parser identifies a comparison chain, it recognizes this

v0  c1 v1 c2 v2 [ ... cn vn]

and does this with it:

for i ( 1 .. n ){ when vi is magical or otherwise more dynamic than a plain old scalar, resolve it to either a plain old number or plain old string, depending. apply ci to v(i-1) and vi; when false RETURN FALSE } TRUE

this approach would mean comparator overloading (is that a thing? wow) wouldn't work, only number or string overloading. Also, tied stuff would get called once. Short-circuiting would continue to work.

What's the delta between what's in place now and the semantics I just described? I think it's just the highlighted bit, and that the loop is fully unrolled.

On Mon, Apr 20, 2020 at 5:18 PM Vincent Pit notifications@github.com wrote:

I think it should call FETCH twice since it should also call overloading twice (is that even implemented? I must admit I haven't followed this in detail). That said, making it experimental can't hurt.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17692#issuecomment-616841320, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKCIPPF6TN4X6NT3N74JODRNTC4DANCNFSM4L5U4YIQ .

-- "Amageddon, enthen armagedfunki."

xsawyerx commented 4 years ago

I understand the concerns of parties here, but after reviewing it once again - and then again - I consider the current behavior correct and expected from a consistency perspective.

I understand a person might expect an optimization, but what I think we're discussing is for the optimization to become the proper use. That is, just because we might be able to use the variable only once and "cache" it for the statement doesn't mean we should consider that the syntax.

Instead, the only consistency is understanding how the code should work without the optimization, which is in two steps. The effect of this is that consistency, a FETCH would be called twice.

This reminds me of the @x = sort @x discussion. An author depended on this accidentally not unweakening the references, which was a bug as part of the optimization to make this into an in-place sort. The author that depended on this considered this mishap in behavior (only due to optimization) to be now a requirement of the syntax itself.

(Even when you don't call it an "optimization", the practice of avoiding a double call is - essentially - an optimization.)

dur-randir commented 4 years ago

Making this feature experimental and removing that status the next release costs nothing, but it ships as non-experimental and it happens to be not that obvious for users and needs to change (remember smartmatch) - it'd be a much harder way.

xsawyerx commented 4 years ago

Once it is experimental, it requires two releases as experimental. This means it will only be available as non-experimental in three versions (that is, three years), assuming it's not changed in any way. Making it experimental is meant to allow us to change it when we're still trying to figure it out.

If we don't consider it changing, requiring people to wait for three releases (three years) to have it does not gain anything.

The fact that it can confuse people doesn't justify either experimental or change in behavior or implementation. Making it run a function twice instead of once would make it incorrect. Caching the value of the FETCH tie magic would make it incorrect.

While it can be confusing, it is correct and should stay put.

davidnicol commented 4 years ago

I don't know about the positions of anyone else involved in this discussion but I'm entirely satisfied with suggesting putting double-quotes around tied string values and adding zero to tied numeric values in the documentation of the feature as a workaround to avoid "double-fetch" issues.

When there's a big long dereferencing expression, does the expression get evaluated twice or just the resolution of it?

print "element $i is in range\n" if lower <=

$outer->{$inner}{a}{b}{c}[$i] < upper;

or

print "element $i is in range\n" if lower <=

$outer->{$inner}{a}{b}{c}[$i]+0 < upper;

this seems like a less contrived scenario than explicitly tieing something by the coder, given dbx modules and so-on.

Which only makes one database call?

On Wed, Apr 29, 2020 at 4:36 PM Sawyer X notifications@github.com wrote:

While it can be confusing, it is correct and should stay put.

xenu commented 4 years ago

FYI: this issue made it to the frontpage of /r/programming: https://www.reddit.com/r/programming/comments/gdxe1p/should_x_foo_y_read_from_foo_once_or_twice_perl/

Of course, most of the commenters probably aren't Perl programmers but I think it's still interesting to hear what people think.

Grinnz commented 4 years ago

For completion's sake, here's what happens with two somewhat related situations: https://perl.bot/p/efhuah

Object overloads apply to the referenced object itself and so persist through any passing or copying of references. The numeric overload is called twice if both comparisons occur and can result in different comparisons each time.

Variable magic applies to the variable itself and does not persist through any passing or copying. It also cannot affect the resulting value directly but could have arbitrary side effects. This get magic is also called twice if both comparisons occur.

Grinnz commented 4 years ago

Given that, I think it would be easier to discuss this in terms of an object with overloading; it is both significantly more common, and can be the result of a more complex expression.

Grinnz commented 4 years ago

The complicating factor there being, of course, it's possible to overload every comparison operator individually.

toddr commented 4 years ago

Most of the reddit discussions don't seem to get that perl supports tied variables. They also seem to be quickly forgetting that in any language they smugly consider to be better, the right side isn't evaluated if the left side is false.

After reading reddit and some of the above examples, I think I've come to the conclusion that I would actually be more upset if $x < $y < $z behaved differently than $x < $y && $y < $z

Most of the examples I've seen so far are a little contrived to simplify the explanation. I get that. Can anyone think of a practical example of where this might actually be an issue? So far, the only things I've imagined are some sort of DBIC code or some sort of Math::BigInt object. However, I'm not sure I can imagine what the code would be that would unexpectedly misbehave.

In my day job, I honestly don't use many tied variables that change when accessed multiple times. If I did, I would probably emphasize to the coder that they better make the variable VERY clear and have a darn good reason to do so. i.e. $x > $a_variable_that_will_change_every_time_it_is_checked > $y. I think If I saw this code during review, I would probably ask questions. I would also be asking how many unit tests they had to confirm it worked as expected. No matter how this example is written, I would be concerned with the design of the tie.

briandfoy commented 4 years ago

tie is the red herring of this discussion, and if I had a time machine I'd go back to restate the issue. Forget about the tie. I've lost that fight and I'd be surprised if anyone is ever bit by it. But the docs still describe a situation that doesn't reflect reality.

As noted in the reddit thread, people are confused by the statement that the chained comparison is actually broken into two comparisons (a < b && b < c) and that statement is not true if b is a subroutine call. The actual Perl code is more complicated than that and something akin to this pseudocode:

 { temp = b; a < temp && temp < c }

Document it like that much of the problem goes away because there's not an exception for the subroutine case.

I think Python needs the same doc adjustment too.

briandfoy commented 4 years ago

$x > $a_variable_that_will_change_every_time_it_is_checked > $y. I think If I saw this code during review, I would probably ask questions.

You wouldn't see it in a code review. You wouldn't even think that it was happening because the person writing that code isn't thinking about tie or overloading. They provide this library that uses this feature for the limited situation they imagined.

Then someone else, far away, uses that library and does something a bit goofy because they have some task where black magic makes sense. It's not rare that people do unexpected and unforeseen things with the interfaces someone provides them.

Many people in the Reddit thread made the same conceptual mistake. They assume that they have control over their entire ecosystem and are responsible for all of the code, and that all code interacting with a feature will come from them. In reality, we know that we don't know how people are going to interact with our CPAN code, how other CPAN code might interact with our code, and so on. These effects can be far away from the code that we actually write. It's not about what we write and what we allow, but what the world writes and what they decide to tolerate.

gugod commented 4 years ago

I briefly played 41 < EXPR < 43 with EXPR being tied var, overloaded object, subroutine returning plain scalar, subroutine returning tied var, subroutine returning overladed object, arthmatic expression of those... interesting!

Now that I cannot "un-see" it... the meaning of 41 < EXPR < 43 is clear and unambigious to me but internal side-effect is not. In a sense that this chain is understood as a single operation. It doesn't seem matter that much that sometimes of chained comparison behaves differently than the un-chained version -- as long as the end-result is mathmatically correct.

Although I'm not a committer, I feel like it's a big promise to say that the side-effect must also be consistent with the unchained version and perhaps that will end up give us less resource in the future to iterate on the improvement of chained comparison.

When we see the unchained version like: EXPR_1 < EXPR_2 && EXPR_2 < EXPR_3, we are almost mentally hardwired and seeing: oh, there is a short-circuiting happenning and the second operant (EXPR_2 < EXPR_3) might not be evaluated, neither its side-effect.

Most of languages and their implementations do evaluate the EXPR_1 < EXPR_2 first then, optionally, EXPR_2 < EXPR_3. However, it should be logically equivalent to do the opposite: EXPR_2 < EXPR_3 first, then EXPR_1 < EXPR_2. (If an implementation like this exist, I'd like to know about it.)

And... if there is an language implementation that's not doing short-circuiting -- it is logically correct too. (Although I'll be yelling if my /bin/bash is not doing short-ciuiting on &&.)

To me, maintaining the mathmatical correctness of EXPR_1 < EXPR_2 < EXPR_3 is far more interesteing and useful then maintaining its side-effects. If we find another implementation to short-circuiting N-way chained comparison that is more performant/energy-saving than a traditional && short-circuiting, I don't see why we shouldn't give it a spin. Perhaps at some point we want to re-phrase the definition of this and avoid referrencing to the unchained version. If we fail on optimizing that, we learn something too.

jstaursky commented 4 years ago

The fact that it can confuse people doesn't justify either experimental or change in behavior or implementation...

While it can be confusing, it is correct and should stay put.

Isn't code that does not meeting user expectation by definition a bug? Your initial position of agreement with the change is sort of indicative of that fact. Not trying to be inflammatory but I would hope the opinion of a prominent author would carry enough weight to be considered at least experimental. Do not see the harm in doing that.

matthewpersico commented 4 years ago

My USD 0.02 is this: The current syntax for "between" is

$left < $value  and  $value < $right

In the extreme case, all the terms are function calls:

funcleft() < funcval() and funcval() < funcright()

For this case:

  1. You probably do NOT want to call funcval() more than once. That leads to doing:
    my $value = funcval();
    funcleft() < $value and $value < funcright();
  2. We expect, and don't need, funcright() to evaluate if funcleft() < $value is false.

I hope that the new syntax preserves these semantics. I see the whole point of this new syntax as eliminating the need to do the my $value = funcval(); setup above. Furthermore, I read the current code as:

funcleft() is less than $value and $value is less than funcright()

but i have to work hard to get that right (via my $value = funcval(); ). I read the new syntax as

funcval() is between funcleft() and funcright()

meaning that you "naturally" want to refer to the value once. If I have to worry about double eval of the thing I'm checking, I see no need to use the new syntax. And if you want the double eval, you can drop back to the current syntax with two compares joined by an and.

davidnicol commented 4 years ago

flavor to taste

package Acme::is_between; use Exporter 'import'; our @EXPORT = qw/is_between/ sub isbetween{ my ($middle, $left, $right) = @ ; return ($left <= $middle and $middle <= $right) if ($left <= $right); return ($left >= $middle and $middle >= $right) if ($left >= $right); return ($left le $middle and $middle le $right) if ($left le $right); return ($left ge $middle and $middle ge $right) if ($left ge $right); 0; } 1;

On Wed, May 6, 2020 at 10:16 AM Matthew O. Persico notifications@github.com wrote:

funcval() is between funcleft() and funcright()

meaning that you "naturally" want to refer to the value once. If I have to worry about double eval of the thing I'm checking, I see no need to use the new syntax. And if you want the double eval, you can drop back to the current syntax with two compares joined by an and.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17692#issuecomment-624711149, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAKCIPKTMYXSRLS6EEWBOR3RQF5MZANCNFSM4L5U4YIQ .

-- With great risk comes spectacular failure

Grinnz commented 4 years ago

@matthewpersico All of this is already specified to be true in the documentation.

matthewpersico commented 4 years ago

@Grinnz - quite frankly, I can't really tell if @briandfoy 's original concerns are being addressed or he code is being left as is from this long and twisty thread.

Update: I missed https://github.com/Perl/perl5/issues/17692#issuecomment-624474828

But tie is still going to be eval'ed multiple times potentially?

nrdvana commented 4 years ago

If I see

EXPR1 < EXPR2 < EXPR3 < EXPR4

in a language, I expect it to be semantically equivalent to

tmp1= EXPR1
tmp2= EXPR2
if (tmp1 < tmp2) {
  tmp3= EXPR3
  if (tmp2 < tmp3) {
    tmp4= EXPR4
    if (tmp3 < tmp4) {

because this is the implementation that results in the least processing, and it is actually a pain to write so the syntax shortcut is of the most benefit.

(edit) In terms of Perl, I would expect the temporaries to be scalars (possibly a ref to an object, but never directly tied scalars) and that all overloaded operators would happen according to the extrapolated code above.

Grinnz commented 4 years ago

All of this is already what happens. I suppose we can't do much about it now but anyone who sees this, please stop confusing the issue with the semantics of evaluating the expression only once, because that already the case. The question here is whether each side of the comparison should do a separate fetch if the central value is a tied variable.

briandfoy commented 4 years ago

I'm closing the issue. Sawyer made his ruling and the continued discussion hasn't been well-informed.

nrdvana commented 4 years ago

Well now I feel like my comment is being misread. If I write

my $tmp1= EXPR1

and EXPR 1 is a tied scalar, that does call fetch at that point and results in a single fetch, as per the original statement in this issue.

Grinnz commented 4 years ago

I did not see the part specifically mentioning tied variable semantics, my apologies. To respond to that, the difference here is that there isn't actually a temporary variable involved (as the documentation points out), only the same tied variable on the call stack, used in two operations. This would be a semantic explanation for it, to go with Zefram's technical explanation. Not that I have any opinion on which is the "right" answer.

xsawyerx commented 4 years ago

The fact that it can confuse people doesn't justify either experimental or change in behavior or implementation... While it can be confusing, it is correct and should stay put.

Isn't code that does not meeting user expectation by definition a bug?

This is a loaded question.

First of all, I didn't say "this code does not meet user expectations." I said, "it can confuse people." Two operative words: can and people, meaning it has the possibility of confusing and not everyone.

Secondly, languages might confuse, but as long as they maintain consistency - on some level - it isn't a bug.

To put it in other terms, Perl could have been consistent here by saying that in $x < EXPR < $y, EXPR will only be run once and values are cached for the entire comparison expression. It could also be consistent by saying that EXPR is always evaluated from scratched. We assume this is a pendulum where it should be either here or there. They are certainly both consistent. However, consistency here is maintained elsewhere. The comparison is done on an SV. A subroutine will need to be executed and the SV it returns is used. Once you understand this compares scalars, it makes sense. (And yes, magic attacks again, but in a consistent and deterministic way.)

Consistency usually breaks perception. Sometimes one user and sometimes the other. This can be frustrating to either user, so it should be documented well. Not just the feature, but in which way is the feature consistent.

The example of @x = sort @x is still a good example of unexpected consistency (to at least some). We had what we considered consistent, but we had to deal with a developer who objected and said the consistency was in what we considered an optimization with a bug.

Your initial position of agreement with the change is sort of indicative of that fact.

It is not. I sometimes err in initial judgment. I consider it a human quality, though I wish I never made mistakes.

Not trying to be inflammatory but I would hope the opinion of a prominent author would carry enough weight to be considered at least experimental. Do not see the harm in doing that.

No offense was taken. I hope I addressed your comments fairly.

FWIW, all of the authors I know on this thread are prominent in one way or the other. I appreciate and respect each of them and their positions and opinions. This is why this thread had me going back to the code, to the implementation, original thread, and to other core developers to discuss this again and again and again.

I've tried explaining why I disagree with the experimental status in a different comment. It sums up to two reasons that go together:

Feature flags - especially experimental ones - are not free. They come at a price to the user and we should only consider them when apt. In this case, I don't see it.

xsawyerx commented 4 years ago

tie is the red herring of this discussion, and if I had a time machine I'd go back to restate the issue. Forget about the tie. I've lost that fight and I'd be surprised if anyone is ever bit by it. But the docs still describe a situation that doesn't reflect reality.

As noted in the reddit thread, people are confused by the statement that the chained comparison is actually broken into two comparisons (a < b && b < c) and that statement is not true if b is a subroutine call. The actual Perl code is more complicated than that and something akin to this pseudocode:

 { temp = b; a < temp && temp < c }

Document it like that much of the problem goes away because there's not an exception for the subroutine case.

This is a fair point. I'll review the docs again tomorrow morning.

Grinnz commented 4 years ago

That is already how it's documented. https://perldoc.pl/blead/perlop#Operator-Precedence-and-Associativity

khwilliamson commented 4 years ago

On 5/6/20 4:24 PM, Dan Book wrote:

That is already how it's documented. https://perldoc.pl/blead/perlop#Operator-Precedence-and-Associativity

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/Perl/perl5/issues/17692#issuecomment-624922831, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAA2DH46DJ6UST2HOZQMH2TRQHPSTANCNFSM4L5U4YIQ.

I was the one who added the commit for this feature to blead:

commit 02b85d3dab092d678cfc958a2dc252405333ed25 Author: Zefram zefram@fysh.org AuthorDate: Wed Feb 5 07:43:14 2020 +0000 Commit: Karl Williamson khw@cpan.org CommitDate: Thu Mar 12 22:34:26 2020 -0600

I did not know at the time that this would turn out to be a contentious feature. But it has. The freeze date for contentious additions was January 20. The PR also dates to after the freeze date.

That means that this commit violates our long-standing policy, and doesn't belong in 5.32.

jstaursky commented 4 years ago

@xsawyerx apologies on the bluntness of my initial message, it was early and now that I've come back to read it--should have phrased it better. Anyway thanks for addressing the comment despite the initial loaded question.

briandfoy commented 4 years ago

That is already how it's documented. https://perldoc.pl/blead/perlop#Operator-Precedence-and-Associativity

No, it's not documented like that. Instead of a general case, it starts off with the common case and presents a rule, then states a slightly less common case that uses a different rule, and then a rare case that doesn't seem to follow either rule. There's a way to unify those.

The documentation starts as being two comparisons in the simple scalar case. That sets the frame for everything that follows. At that point, you have already told people how to think about this feature and it's the foundation for their interpretation of the following paragraphs. After that, it starts to list other cases and other behavior, all of which are framed as modifications to the initial case. Many people (as evidenced by the the reddit thread and this thread), aren't going to keep reading, and even if they did, they won't understand what they are reading or why it matters to their life.

I think if we change the frame by documenting it the same way for all cases, we lose several paragraphs of exceptions and explanations. That is, for all cases, it's like { temp = b; a < temp && temp < c }, even if the particular implementation doesn't match that. Now, the subroutine and scalar case are the same rule, and it's easy to explain the tie because temp is still the tied object. Now the tie and overload cases are much easier to grok.

davidnicol commented 4 years ago

I think if we change the frame by documenting it the same way for all cases, we lose several paragraphs of exceptions and explanations. That is, for all cases, it's like { temp = b; a < temp && temp < c }, even if the particular implementation doesn't match that. Now, the subroutine and scalar case are the

A chained comparison a < b < c essentially works like do { temp = b; a < temp && temp < c }. When b is a magical overloaded or tied value, the second fetch magic may be avoided by adding zero, for numeric comparisons, or quoting, for string comparisons, the middle value, so temp will be a plain r-value.

-- With great risk comes spectacular failure