Perl / perl5

🐪 The Perl programming language
https://dev.perl.org/perl5/
Other
1.88k stars 530 forks source link

sort in scalar context is undefined #12803

Open p5pRT opened 11 years ago

p5pRT commented 11 years ago

Migrated from rt.perl.org#116877 (status was 'open')

Searchable as RT116877$

p5pRT commented 11 years ago

From @demerphq

sort is documented as having undefined behavior in scalar context.

This is really annoying as code like​:

  return keys %hash;

cannot be safely changed to

  return sort keys %hash;

as if it is used in scalar context it wont return the number of items in the list. One must instead do

  my @​temporary= sort keys %hash;   return @​temporary;

IMO this is silly. Why shouldnt sort() become a no-op in this case and return the number of items in the list? Undefined behavior in this case seems like the worst of all the possible choices. Is there a reason for this?

This bug affects all modern perls as far as I know\, including perl 5.17.9. I am including a -V to appease the perlbug gods\, but it is not version specific.

Cheers\, Yves

Summary of my perl5 (revision 5 version 17 subversion 9) configuration​:   Derived from​: e2183dd1583911a709201025e7d0020fde8b1862   Platform​:   osname=linux\, osvers=3.0.0-12-generic\, archname=x86_64-linux   uname='linux yam 3.0.0-12-generic #20-ubuntu smp fri oct 7 14​:56​:25 utc 2011 x86_64 x86_64 x86_64 gnulinux '   config_args='-Doptimize=-g -d -Dusedevel -Dcc=ccache gcc -Dld=gcc -DDEBUGGING'   hint=recommended\, useposix=true\, d_sigaction=define   useithreads=undef\, usemultiplicity=undef   useperlio=define\, d_sfio=undef\, uselargefiles=define\, usesocks=undef   use64bitint=define\, use64bitall=define\, uselongdouble=undef   usemymalloc=n\, bincompat5005=undef   Compiler​:   cc='ccache gcc'\, ccflags ='-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'\,   optimize='-g'\,   cppflags='-DDEBUGGING -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'   ccversion=''\, gccversion='4.6.1'\, gccosandvers=''   intsize=4\, longsize=8\, ptrsize=8\, doublesize=8\, byteorder=12345678   d_longlong=define\, longlongsize=8\, d_longdbl=define\, longdblsize=16   ivtype='long'\, ivsize=8\, nvtype='double'\, nvsize=8\, Off_t='off_t'\, lseeksize=8   alignbytes=8\, prototype=define   Linker and Libraries​:   ld='gcc'\, ldflags =' -fstack-protector -L/usr/local/lib'   libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib   libs=-lnsl -ldl -lm -lcrypt -lutil -lc   perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc   libc=\, so=so\, useshrplib=false\, libperl=libperl.a   gnulibc_version='2.13'   Dynamic Linking​:   dlsrc=dl_dlopen.xs\, dlext=so\, d_dlsymun=undef\, ccdlflags='-Wl\,-E'   cccdlflags='-fPIC'\, lddlflags='-shared -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:   Compile-time options​: DEBUGGING HAS_TIMES PERLIO_LAYERS   PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP   PERL_NEW_COPY_ON_WRITE PERL_PRESERVE_IVUV   PERL_USE_DEVEL USE_64_BIT_ALL USE_64_BIT_INT   USE_LARGE_FILES USE_LOCALE USE_LOCALE_COLLATE   USE_LOCALE_CTYPE USE_LOCALE_NUMERIC USE_PERLIO   USE_PERL_ATOF   Locally applied patches​:   uncommitted-changes   Built under linux   Compiled at Feb 20 2013 06​:10​:44   @​INC​:   lib   /usr/local/lib/perl5/site_perl/5.17.9/x86_64-linux   /usr/local/lib/perl5/site_perl/5.17.9   /usr/local/lib/perl5/5.17.9/x86_64-linux   /usr/local/lib/perl5/5.17.9   .

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @tsee

On 02/20/2013 07​:06 AM\, yves orton (via RT) wrote​:

# Please include the string​: [perl #116877] # in the subject line of all future correspondence about this issue. # \<URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=116877 >

sort is documented as having undefined behavior in scalar context.

This is really annoying as code like​:

return keys %hash;

cannot be safely changed to

return sort keys %hash;

as if it is used in scalar context it wont return the number of items in the list. One must instead do

my @&#8203;temporary= sort keys %hash;
return @&#8203;temporary;

IMO this is silly. Why shouldnt sort() become a no-op in this case and return the number of items in the list? Undefined behavior in this case seems like the worst of all the possible choices. Is there a reason for this?

This bug affects all modern perls as far as I know\, including perl 5.17.9. I am including a -V to appease the perlbug gods\, but it is not version specific.

But this is just a documentation bug\, right?

Using some random blead from after 5.17.7 and before 5.17.8​:

perl -e 'use warnings; use strict; sub foo {my %h = 1..10; return sort keys %h} foo()'

This does not warn. But this does​:

perl -e 'use warnings; use strict; sort qw(1..10); my $x = sort qw(1..10)' Useless use of sort in scalar context at -e line 1. Useless use of sort in void context at -e line 1.

Is the bug report here specifically about documenting (and implementing) that sort becomes a noop in void context? Since the example you gave doesn't warn\, can you come up with any other valid cases where that behaviour makes sense?

If so\, I guess you could modify the docs and then do an op_null() in the respective OP_SORT case in Perl_scalarvoid in op.c? Dito for Perl_scalar\, same file.

--Steffen

PS​: Yes\, sort{} blocks could have side effects. That's already undefined behaviour AFAIK.

p5pRT commented 11 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 11 years ago

From @demerphq

On 20 February 2013 07​:49\, Steffen Mueller \smueller@&#8203;cpan\.org wrote​:

On 02/20/2013 07​:06 AM\, yves orton (via RT) wrote​:

# Please include the string​: [perl #116877] # in the subject line of all future correspondence about this issue. # \<URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=116877 >

sort is documented as having undefined behavior in scalar context.

This is really annoying as code like​:

return keys %hash;

cannot be safely changed to

return sort keys %hash;

as if it is used in scalar context it wont return the number of items in the list. One must instead do

my @&#8203;temporary= sort keys %hash;
return @&#8203;temporary;

IMO this is silly. Why shouldnt sort() become a no-op in this case and return the number of items in the list? Undefined behavior in this case seems like the worst of all the possible choices. Is there a reason for this?

This bug affects all modern perls as far as I know\, including perl 5.17.9. I am including a -V to appease the perlbug gods\, but it is not version specific.

But this is just a documentation bug\, right?

No. I think that sort having undefined behavior in *scalar* context is a bug.

It should return the number of items in the list just like keys\, or values\, or map\, or grep\, or well\, pretty much every list operator does in scalar context.

Why should sort be different?

perl -e 'use warnings; use strict; sub foo {my %h = 1..10; return sort keys %h} foo()'

My point is the following is stupid​:

$ ./perl -Ilib -le'use warnings; use strict; sub foo {my %h = 1..10; return sort keys %h} print foo(); print scalar foo();' 13579 Use of uninitialized value in print at -e line 1.

Why should it return the most useless possible result of undef? Why cant it return 5 like any other list op?

This does not warn. But this does​:

perl -e 'use warnings; use strict; sort qw(1..10); my $x = sort qw(1..10)' Useless use of sort in scalar context at -e line 1. Useless use of sort in void context at -e line 1.

Is the bug report here specifically about documenting (and implementing) that sort becomes a noop in void context? Since the example you gave doesn't warn\, can you come up with any other valid cases where that behaviour makes sense?

Please drink some coffee and reread my OP :-).

I said *scalar* context. The behavior of sort in void context is an entirely separate discussion\, which I have minimal opinion on.

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @Smylers

yves orton writes​:

This is really annoying as code like​:

return keys %hash;

cannot be safely changed to

return sort keys %hash;

as if it is used in scalar context it wont return the number of items in the list.

Why shouldnt sort() become a no-op in this case and return the number of items in the list?

That's a reasonable argument for sort in scalar context being a no-op.

However\, returning the number of items in the list only serves as a no-op in this case because that's what keys does in scalar context; it wouldn't be a no-op for arguments that themselves do something else in scalar context. As a silly example\, it wouldn't be a no-op in scalar context turning​:

  return localtime;

into​:

  return sort localtime;

So maybe a better no-op behaviour would be for sort in scalar context to propagate that scalar context to its input?

In your example that would put keys in scalar context\, which would do the counting\, and sort would simply pass through that count.

Cheers

Smylers -- http​://twitter.com/Smylers2

p5pRT commented 11 years ago

From @demerphq

On 20 February 2013 12​:16\, Smylers \Smylers@&#8203;stripey\.com wrote​:

yves orton writes​:

This is really annoying as code like​:

return keys %hash;

cannot be safely changed to

return sort keys %hash;

as if it is used in scalar context it wont return the number of items in the list.

Why shouldnt sort() become a no-op in this case and return the number of items in the list?

That's a reasonable argument for sort in scalar context being a no-op.

However\, returning the number of items in the list only serves as a no-op in this case because that's what keys does in scalar context; it wouldn't be a no-op for arguments that themselves do something else in scalar context. As a silly example\, it wouldn't be a no-op in scalar context turning​:

return localtime;

into​:

return sort localtime;

So maybe a better no-op behaviour would be for sort in scalar context to propagate that scalar context to its input?

In your example that would put keys in scalar context\, which would do the counting\, and sort would simply pass through that count.

I like that idea. If its doable then I think its worth doing.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @tamias

On Wed\, Feb 20\, 2013 at 11​:16​:19AM +0000\, Smylers wrote​:

So maybe a better no-op behaviour would be for sort in scalar context to propagate that scalar context to its input?

In your example that would put keys in scalar context\, which would do the counting\, and sort would simply pass through that count.

Note that under that implementation\, C\<\< scalar sort(4\, 3\, 2\, 1) >> would return 1.

Ronald

p5pRT commented 11 years ago

From Eirik-Berg.Hanssen@allverden.no

On Wed\, Feb 20\, 2013 at 10​:17 PM\, Ronald J Kimball \rjk@&#8203;tamias\.net wrote​:

On Wed\, Feb 20\, 2013 at 11​:16​:19AM +0000\, Smylers wrote​:

So maybe a better no-op behaviour would be for sort in scalar context to propagate that scalar context to its input?

In your example that would put keys in scalar context\, which would do the counting\, and sort would simply pass through that count.

Note that under that implementation\, C\<\< scalar sort(4\, 3\, 2\, 1) >> would return 1.

  ... which is exactly what it would return without the sort\, and no more useless than the undef it is currently returning\, so I'm not too worried about that.

  More generally​:

  For the case when a C\<\< return EXPR >> is amended to a C\<\< return sort EXPR >>\, this proposed change would prevent unintended change of the return value in scalar context.

  That sounds like a good thing.

  The current undef seems remarkably useless – even as a usage error indicator\, it seems flimsy and half-hearted ("you're not supposed to call me in scalar context\, but instead of complaining\, I'll just return an undef") – so I don't see that we lose much either.

  But I guess it's better to keep the current warning (if not necessarily semantics) when the sort can be determined to be in scalar context at compile time. SWYM\, right? :)

Eirik

p5pRT commented 11 years ago

From @xdg

I would prefer it to act exactly like map would\, since it's a list transformation. Possibly\, in scalar context\, it should probably just get optimized to an empty map op.

Having it force scalar context on the input is a bad idea​:

  return sort $obj->method();

What if method uses wantarray to determine if a list or single value is wanted? You still want it to provide the list\, sort should pass it through like an empty map\, and the return value will be the count of items returned from the method.

While I'm fine with it not actually bothering to sort in scalar context\, that should be clearly documented in case someone crazily wants to put side effects into their sort block and then is surprised if they don't run.

David

p5pRT commented 11 years ago

From @demerphq

On 21 February 2013 00​:00\, David Golden \xdg@&#8203;xdg\.me wrote​:

While I'm fine with it not actually bothering to sort in scalar context\, that should be clearly documented in case someone crazily wants to put side effects into their sort block and then is surprised if they don't run.

I think we could warn if sort is called in scalar context in block form.

Yves -- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @xdg

On Thu\, Feb 21\, 2013 at 2​:47 AM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 21 February 2013 00​:00\, David Golden \xdg@&#8203;xdg\.me wrote​:

While I'm fine with it not actually bothering to sort in scalar context\, that should be clearly documented in case someone crazily wants to put side effects into their sort block and then is surprised if they don't run.

I think we could warn if sort is called in scalar context in block form.

Why? If I'm offering a list of values in response to a function call\, why do I care what my caller uses it for? And why should their change in context throw warnings in my code?

E.g. I have a module that does "return sort { ... } stuff()" -- having that line warn at runtime when someone calls my sub in scalar context seems perverse\, since 99% of the time\, the block will have no side effects and the warning would be a false positive.

If we're really concerned about it\, we shouldn't optimize away the sort and let people do that themselves with wantarray.

But I don't think we should be that concerned about it. A sort block with side effects seems bizarre\, even if theoretically possible.

David

-- David Golden \xdg@&#8203;xdg\.me Take back your inbox! → http​://www.bunchmail.com/ Twitter/IRC​: @​xdg

p5pRT commented 11 years ago

From @b2gills

On Thu\, Feb 21\, 2013 at 10​:47 AM\, David Golden \xdg@&#8203;xdg\.me wrote​:

On Thu\, Feb 21\, 2013 at 2​:47 AM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 21 February 2013 00​:00\, David Golden \xdg@&#8203;xdg\.me wrote​:

While I'm fine with it not actually bothering to sort in scalar context\, that should be clearly documented in case someone crazily wants to put side effects into their sort block and then is surprised if they don't run.

I think we could warn if sort is called in scalar context in block form.

Why? If I'm offering a list of values in response to a function call\, why do I care what my caller uses it for? And why should their change in context throw warnings in my code?

E.g. I have a module that does "return sort { ... } stuff()" -- having that line warn at runtime when someone calls my sub in scalar context seems perverse\, since 99% of the time\, the block will have no side effects and the warning would be a false positive.

If we're really concerned about it\, we shouldn't optimize away the sort and let people do that themselves with wantarray.

But I don't think we should be that concerned about it. A sort block with side effects seems bizarre\, even if theoretically possible.

David

I totally agree with David.

I don't see why these aren't (largely) equivalent

  $x = sort @​y;   $x =()= sort @​y;

I think the first one should be optimized to just​:

  $x = @​y;

Then by extension these should work the same​:

  sub x{   return sort @​_;   }   sub x{   return scalar @​_ unless wantarray;   return sort @​_;   }

If someone needs the side effects of calling sort they should have to ask for it. Since that is not what you normally want.

  sub x{   return my @​sideeffect = sort @​_;   }

  # Or

  sub x{ return sort @​_ }

  # my $x = x();   my $x =()= x();

  # say scalar( x() );   say my $workaround =()= x();

(Oddly enough these are also work-arounds for the current situation)

p5pRT commented 11 years ago

From @bulk88

Brad Gilbert wrote​:

I don't see why these aren't (largely) equivalent

$x = sort @&#8203;y;
$x =\(\)= sort @&#8203;y;

I think the first one should be optimized to just​:

$x = @&#8203;y;

Then by extension these should work the same​:

sub x\{
  return sort @&#8203;\_;
\}
sub x\{
  return scalar @&#8203;\_ unless wantarray;
  return sort @&#8203;\_;
\}

If someone needs the side effects of calling sort they should have to ask for it. Since that is not what you normally want.

sub x\{
  return my @&#8203;sideeffect = sort @&#8203;\_;
\}

\# Or

sub x\{ return sort @&#8203;\_ \}

\# my $x = x\(\);
my $x =\(\)= x\(\);

\# say scalar\( x\(\) \);
say my $workaround =\(\)= x\(\);

(Oddly enough these are also work-arounds for the current situation)

I think this optimization is useful. Especially for less than perfectly written code. It would speed up "old" Perl code. I wonder if someone hooked pp_sort to crash to C debugger and ran harness\, would it catch any sort in scalar context usage.

p5pRT commented 11 years ago

From @b2gills

On Thu\, Feb 21\, 2013 at 12​:51 PM\, bulk88 \bulk88@&#8203;hotmail\.com wrote​:

Brad Gilbert wrote​:

I don't see why these aren't (largely) equivalent

$x = sort @&#8203;y;
$x =\(\)= sort @&#8203;y;

I think the first one should be optimized to just​:

$x = @&#8203;y;

Then by extension these should work the same​:

sub x\{
  return sort @&#8203;\_;
\}
sub x\{
  return scalar @&#8203;\_ unless wantarray;
  return sort @&#8203;\_;
\}

If someone needs the side effects of calling sort they should have to ask for it. Since that is not what you normally want.

sub x\{
  return my @&#8203;sideeffect = sort @&#8203;\_;
\}

\# Or

sub x\{ return sort @&#8203;\_ \}

\# my $x = x\(\);
my $x =\(\)= x\(\);

\# say scalar\( x\(\) \);
say my $workaround =\(\)= x\(\);

(Oddly enough these are also work-arounds for the current situation)

I think this optimization is useful. Especially for less than perfectly written code. It would speed up "old" Perl code. I wonder if someone hooked pp_sort to crash to C debugger and ran harness\, would it catch any sort in scalar context usage.

Actually the way it currently works is more like

  sub sort(...){   return undef unless wantarray;   ...   }

Which should be slightly faster than what I am vying for​:

  sub sort(...){   return scalar @​_ unless wantarray;   ...   }

So it is not actually speeding up old code. It is making old code (arguably) more correct.

Any code that is successfully working around the current behaviour will continue to work just as fast\, or slow as it currently does.

p5pRT commented 11 years ago

From @demerphq

On 20 February 2013 07​:06\, yves orton \perlbug\-followup@&#8203;perl\.org wrote​:

sort is documented as having undefined behavior in scalar context.

This is really annoying

So far the options proposed seem to be​:

  1) return the number of items   2) become "invisible" and return whatever the input to sort would return if it were called in scalar context.   3) return the minimum

For me​:

Either of the 1) or 2) make sense to me. I lean a little towards 2) as being the least likely to /add/ confusion to already an confusing set of rules of understanding what "return LISTYTHING" does. Regardless IMO it is important that​:

  return @​array;   return sort @​array;

and

  return keys %hash;   return keys sort %hash;

and

  return grep { /pat/ } (LIST);   return sort grep { /pat/ } (LIST);

and

  return map { "x" . $_ } (LIST);   return sort map { "x" . $_ } (LIST);

all return the same thing in list and scalar context. I am less concerned that

  return $x\, $y\, $z;   return sort $x\, $y\, $z

behave the same in list and scalar context\, so I am fine with going with 1\, but I am inclined to say that 2 is marginally preferable as it doesn't add any new special cases.

The third option is the most surprising\, and IMO the least useful as we can already do it with list slice and be clearer as to what it means\, and IMO much worse it requires we execute the sort\, at least when there is a comparison block and in practical terms probably always. I personally think that any effort in this direction would be better spent in making Perl optimize slices on sort\, so that the optimisation covered (sort @​list)[0..2] as well as (sort @​list)[0].

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @b2gills

On Thu\, Feb 21\, 2013 at 2​:20 PM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 20 February 2013 07​:06\, yves orton \perlbug\-followup@&#8203;perl\.org wrote​:

sort is documented as having undefined behavior in scalar context.

This is really annoying

So far the options proposed seem to be​:

1) return the number of items 2) become "invisible" and return whatever the input to sort would return if it were called in scalar context. 3) return the minimum

For me​:

Either of the 1) or 2) make sense to me. I lean a little towards 2) as being the least likely to /add/ confusion to already an confusing set of rules of understanding what "return LISTYTHING" does. Regardless IMO it is important that​:

return @​array; return sort @​array;

and

return keys %hash; return keys sort %hash;

and

return grep { /pat/ } (LIST); return sort grep { /pat/ } (LIST);

and

return map { "x" . $_ } (LIST); return sort map { "x" . $_ } (LIST);

all return the same thing in list and scalar context. I am less concerned that

return $x\, $y\, $z; return sort $x\, $y\, $z

behave the same in list and scalar context\, so I am fine with going with 1\, but I am inclined to say that 2 is marginally preferable as it doesn't add any new special cases.

The third option is the most surprising\, and IMO the least useful as we can already do it with list slice and be clearer as to what it means\, and IMO much worse it requires we execute the sort\, at least when there is a comparison block and in practical terms probably always. I personally think that any effort in this direction would be better spent in making Perl optimize slices on sort\, so that the optimisation covered (sort @​list)[0..2] as well as (sort @​list)[0].

Yves

My recommendation is 1)


I think it would be better if it were 1) since it is more self consistent.

  $x = sort qw" d a b c ";   $y = sub{sort @​_}->(qw" d a b c ");   $z = sub{ @​a = sort @​_; @​a }->(qw" d a b c " );

1)

  $x == 4   $y == 4   $z == 4

2)

  $x eq 'c' # \<= WHAT??!!   $y == 4   $z == 4

3)

  $x eq 'a'   $y eq 'a'   $z == 4 # can be worked around

With 2) it would be impossible to implement a subroutine that works anything like `sort()` without some level of C or XS.

Since the return value depends on whether it is given a list\, or an array. ( Normal subroutines only ever get an array )

I can see where 2) would be easier to optimize in some cases (which IMHO were badly written). It isn't anywhere as consistent as 1) or even 3).

I do see some appeal to implementing 3) so that `$x=sort...` does something useful. (You could have written it as `$x=(sort...)[0]` if that is what you meant) It is inconsistent with other keywords though

  $ perl -E'say scalar keys %​::'   45

So then I can only recommend 1)


I can also see this generating a warning.

  $x = sort qw' d a b c ';

Since you could write it more specifically.

1)   $x =()= qw' d a b c '; 2)   $x = qw' d a b c '; 3)   $x = (sort qw' d a b c ')[0];


These should work without a warning​:

  $x = sub{ sort qw' d a b c ' }->();   $x = sub{ sort @​_ }->(qw' d a b c ');

I also think they should return the same thing 1) or 3)


My recommendation is 1) return the number of items It's the most consistent\, with it-self and with the rest of the language.

Far second choice is 3) return minimum. ( I am recommending against it though as it's not consistent with the rest of the language )

I am completely against 2)\, as it's not even self-consistent. ( It changes when it is returned from a sub )

p5pRT commented 11 years ago

From @demerphq

On 22 February 2013 01​:07\, Gisle Aas \gisle@&#8203;activestate\.com wrote​:

On Thu\, Feb 21\, 2013 at 9​:20 PM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 20 February 2013 07​:06\, yves orton \perlbug\-followup@&#8203;perl\.org wrote​:

sort is documented as having undefined behavior in scalar context.

This is really annoying

So far the options proposed seem to be​:

1) return the number of items 2) become "invisible" and return whatever the input to sort would return if it were called in scalar context. 3) return the minimum

I checked how the listy functions in perl works in scalar context and found this​:

splice returns the last value delete returns the last value qw returns the last value reverse concatenates the arguments as a single string and reverses that times returns the first value unpack returns the first value localtime returns a string gmtime returns a string stat returns a boolean (success) m returns a boolean sort returns undef split length of list grep length of list map length of list keys length of list values length of list eval propagate context

Option 1) above will make sort behave like split\, grep\, and keys. Option 2) makes it behave like eval. Option 3) makes it behave like times\, and unpack.

For me​:

Either of the 1) or 2) make sense to me. I lean a little towards 2) as being the least likely to /add/ confusion to already an confusing set of rules of understanding what "return LISTYTHING" does.

You basically need to read the docs if you care about the scalar value of LISTYTHING. My logic is to expect Perl to try do the most useful thing\, not try to be consistent. I still find 3) the most useful option for sort.

The third option is the most surprising\, and IMO the least useful as we can already do it with list slice and be clearer as to what it means\, and IMO much worse it requires we execute the sort\, at least when there is a comparison block and in practical terms probably always.

sort can detect the scalar context and find the smallest element in linear time. No need to do a full sort just to find the first element. In void context it can skip the sort completely.

But then we have the optimization for exactly one case\, instead of a solution based around optimizing slices which could optimize many.

Also consider the complexity involved in making this work​:

my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do. So IMO this is not the right choice. It is not huffman encoded properly\, and is dangerous as it means you cant stick a sort into a return statement without changing the semantics of the function in scalar context.

To me\, making sort in scalar context do something that can already be done\, and making sort dangerous at the same time\, is a bad call no matter how convenient it might be for a small set of use cases.

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @demerphq

On 21 February 2013 21​:20\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 20 February 2013 07​:06\, yves orton \perlbug\-followup@&#8203;perl\.org wrote​:

sort is documented as having undefined behavior in scalar context.

This is really annoying

So far the options proposed seem to be​:

1) return the number of items

I have pushed a patch to implement this on the yves/scalar_sort topic branch. I have requested it be smoked as smoke-me/yves-scalar_sort.

The commit currently is​:

commit 59cf203f810b52d40552961e7fbb1cab9cdde22a Author​: Yves Orton \demerphq@&#8203;gmail\.com Date​: Fri Feb 22 14​:03​:22 2013 +0100

  make sort in scalar context return the number of elements

  This allows one to add a "sort" to a routine that returned an array\,   or list of keys\, and preserve the original meaning in scalar context\,   but cause the returned values to be in sorted order.

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @xdg

On Fri\, Feb 22\, 2013 at 8​:13 AM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 21 February 2013 21​:20\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 20 February 2013 07​:06\, yves orton \perlbug\-followup@&#8203;perl\.org wrote​:

sort is documented as having undefined behavior in scalar context.

This is really annoying

So far the options proposed seem to be​:

1) return the number of items

I have pushed a patch to implement this on the yves/scalar_sort topic branch. I have requested it be smoked as smoke-me/yves-scalar_sort.

+1

I think sort is most similar to map and grep in that they are all list transformers. Therefore\, they should all work the same in scalar context.

-- David Golden \xdg@&#8203;xdg\.me Take back your inbox! → http​://www.bunchmail.com/ Twitter/IRC​: @​xdg

p5pRT commented 11 years ago

From @tamias

On Fri\, Feb 22\, 2013 at 08​:36​:41AM +0100\, demerphq wrote​:

Also consider the complexity involved in making this work​:

my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do.

Of course we can simply scan the list to find the minimum.

my $min = $array[0];

for my $item (@​array[1 .. $#array]) {   if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) {   $min = $item;   } }

Ronald

p5pRT commented 11 years ago

From @demerphq

On 22 February 2013 17​:29\, Ronald J Kimball \rjk@&#8203;tamias\.net wrote​:

On Fri\, Feb 22\, 2013 at 08​:36​:41AM +0100\, demerphq wrote​:

Also consider the complexity involved in making this work​:

my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do.

Of course we can simply scan the list to find the minimum.

my $min = $array[0];

for my $item (@​array[1 .. $#array]) { if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) { $min = $item; } }

Yeah\, true\, still\, its not "simply scanning for the minimum" anymore is it?

Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From j.imrie1@virginmedia.com

On 22/02/2013 16​:43\, demerphq wrote​:

On 22 February 2013 17​:29\, Ronald J Kimball \rjk@&#8203;tamias\.net wrote​:

On Fri\, Feb 22\, 2013 at 08​:36​:41AM +0100\, demerphq wrote​:

Also consider the complexity involved in making this work​:

my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do. Of course we can simply scan the list to find the minimum.

my $min = $array[0];

for my $item (@​array[1 .. $#array]) { if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) { $min = $item; } } Yeah\, true\, still\, its not "simply scanning for the minimum" anymore is it?

Yves

use List​::Util qw(min); $min = min @​list;

p5pRT commented 11 years ago

From @rjbs

* demerphq \demerphq@&#8203;gmail\.com [2013-02-22T08​:13​:36]

On 21 February 2013 21​:20\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 20 February 2013 07​:06\, yves orton \perlbug\-followup@&#8203;perl\.org wrote​:

sort is documented as having undefined behavior in scalar context.

This is really annoying

So far the options proposed seem to be​:

1) return the number of items

I have pushed a patch to implement this on the yves/scalar_sort topic branch. I have requested it be smoked as smoke-me/yves-scalar_sort.

I am at least tentatively in favor of this\, if not just plain ol' in favor.

...but I'd like it to wait for 5.19.0 to open. We've already got a number of non-trivial things likely to happen between now and 5.18.0\, and I'd like to keep the "things changed in short period right before release" list as small as possible.

Thanks for working on this!

-- rjbs

p5pRT commented 11 years ago

From @druud62

On 2013-02-22 23​:34\, John Imrie wrote​:

On 22/02/2013 16​:43\, demerphq wrote​:

On 22 February 2013 17​:29\, Ronald J Kimball \rjk@&#8203;tamias\.net wrote​:

On Fri\, Feb 22\, 2013 at 08​:36​:41AM +0100\, demerphq wrote​:

Also consider the complexity involved in making this work​: my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do.

Of course we can simply scan the list to find the minimum.

my $min = $array[0];

for my $item (@​array[1 .. $#array]) { if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) { $min = $item; } }

Yeah\, true\, still\, its not "simply scanning for the minimum" anymore is it?

use List​::Util qw(min); $min = min @​list;

John\, I think that you missed the overriding of 'min'.

List​::Util​::min returns the lowest numerical value.

-- Ruud

$ perl -MList​::Util=min\,reduce -wle '   my @​data = ( [ a => 0 ]\, [ x => 42 ]\, [ x => 2 ] );   my @​sorted = sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​data;   my $min = reduce { ( ( $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] ) \< 0 ) ? $a : $b } @​data;   print join " => "\, @​$_ for $sorted[0]\, min(@​data)\, $min; ' x => 2 a => 0 x => 2

p5pRT commented 11 years ago

From j.imrie1@virginmedia.com

On 24/02/2013 12​:11\, Dr.Ruud wrote​:

On 2013-02-22 23​:34\, John Imrie wrote​:

On 22/02/2013 16​:43\, demerphq wrote​:

On 22 February 2013 17​:29\, Ronald J Kimball \rjk@&#8203;tamias\.net wrote​:

On Fri\, Feb 22\, 2013 at 08​:36​:41AM +0100\, demerphq wrote​:

Also consider the complexity involved in making this work​: my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do.

Of course we can simply scan the list to find the minimum.

my $min = $array[0];

for my $item (@​array[1 .. $#array]) { if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) { $min = $item; } }

Yeah\, true\, still\, its not "simply scanning for the minimum" anymore is it?

use List​::Util qw(min); $min = min @​list;

John\, I think that you missed the overriding of 'min'.

List​::Util​::min returns the lowest numerical value.

Argh!!

Yes you are right. Sorry

We need a string version of min then.

John

p5pRT commented 11 years ago

From @Smylers

John Imrie writes​:

On 24/02/2013 12​:11\, Dr.Ruud wrote​:

List​::Util​::min returns the lowest numerical value.

Yes you are right. We need a string version of min then.

List​::Util​::minstr

Smylers -- http​://twitter.com/Smylers2

p5pRT commented 11 years ago

From @druud62

On 2013-02-24 14​:22\, John Imrie wrote​:

On 24/02/2013 12​:11\, Dr.Ruud wrote​:

On 2013-02-22 23​:34\, John Imrie wrote​:

On 22/02/2013 16​:43\, demerphq wrote​:

On 22 February 2013 17​:29\, Ronald J Kimball \rjk@&#8203;tamias\.net wrote​:

On Fri\, Feb 22\, 2013 at 08​:36​:41AM +0100\, demerphq wrote​:

Also consider the complexity involved in making this work​: my $min= sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array;

We cant simply scan the list to find the minimum\, we will have to apply the sort function and then extract item 0\, which is exactly what​:

my $min= (sort { $b->[0] cmp $a->[0] || $a->[1] \<=> $b->[1] } @​array)[0];

would do.

Of course we can simply scan the list to find the minimum.

my $min = $array[0];

for my $item (@​array[1 .. $#array]) { if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) { $min = $item; } }

Yeah\, true\, still\, its not "simply scanning for the minimum" anymore is it?

use List​::Util qw(min); $min = min @​list;

John\, I think that you missed the overriding of 'min'.

List​::Util​::min returns the lowest numerical value.

Argh!!

Yes you are right. Sorry

We need a string version of min then.

Its 'minstr' can not cut this cake either.

Check the reduce-example in my previous message. (in the sig-area)

-- Ruud

p5pRT commented 11 years ago

From @jandubois

On Thu\, Feb 21\, 2013 at 11​:36 PM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

To me\, making sort in scalar context do something that can already be done\, and making sort dangerous at the same time\, is a bad call no matter how convenient it might be for a small set of use cases.

I don't understand the "dangerous" part of your comment\, unless you mean​: "does not do what *I* want it to do".

The only dangerous part is assuming that a function normally returning a list will in scalar context return the number of elements in that list.

Beyond that\, both the "return the number" and "return the first" is already trivially implementable\, so to be this is really a case of "six of one\, half a dozen of the other".

One thing slightly more interesting about the first element case is that we already optimize "reverse sort"\, so this would also give an optimized​:

  my $max = reverse sort @​list;

that could work in linear time.

Cheers\, -Jan

p5pRT commented 11 years ago

From @demerphq

On 25 February 2013 19​:57\, Jan Dubois \jand@&#8203;activestate\.com wrote​:

On Thu\, Feb 21\, 2013 at 11​:36 PM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

To me\, making sort in scalar context do something that can already be done\, and making sort dangerous at the same time\, is a bad call no matter how convenient it might be for a small set of use cases.

I don't understand the "dangerous" part of your comment\, unless you mean​: "does not do what *I* want it to do".

Well I suppose this is sufficiently a matter of taste that probably I am not going to convince you\, but I think its dangerous because it will not do what I think most beginners would expect. I think most beginners would expect that when starting with​:

sub list_of_keys {   return keys %hash; }

if (list_of_keys()) {   ... }

that adding "sort" to the return of list_of_keys()​:

sub list_of_keys {   return sort keys %hash; }

wouldn't break the code that does if (list_of_keys()) { ... }.

Returning the max/min means that every place that I want to do​:

  return sort thingee();

I have to do

  return wantarray ? sort thingee() : thingee();

which is just PITA and poorly huffman encoded.

The only dangerous part is assuming that a function normally returning a list will in scalar context return the number of elements in that list.

That's like saying the only dangerous part of a lawnmower is the blades. Perl is supposed to be linguistic\, intuitive\, and properly huffman encoded. From that point alone I've needed to sort the return of a sub gazillions of time.

Whereas I only rarely want to do something like find the minimum of an arbitrary list of values without already having iterated over them. Most time I need that kind of min I have to iterate over the list anyway\, as such I rarely have to reach for Scalar​::Util​::min.

Beyond that\, both the "return the number" and "return the first" is already trivially implementable\, so to be this is really a case of "six of one\, half a dozen of the other".

Return the first meaning return the min/max? Or return the first item in the list?

One thing slightly more interesting about the first element case is that we already optimize "reverse sort"\, so this would also give an optimized​:

my $max = reverse sort @&#8203;list;

that could work in linear time.

Except that would break the documented interface of reverse\, so no we can't make that work.

I think a much much more interesting approach for this is to make list slices and list assignment on sort "smart".

IOW I should be able to get an optized solution to something like this​:

my ($x\,$y\,$z)= sort { ... } @​big_array;

and have perl do something clever so that it doesnt sort the full list just find the first N items.

Which btw would allow this​:

my ($x\,$y\,$z)= reverse sort { ... } @​big_array;

Which does what you want and doesn't require we change scalar behavior of reverse\, and has a built in interface for doing more interesting things than "find the min". :-)

cheers\, Yves

-- perl -Mre=debug -e "/just|another|perl|hacker/"

p5pRT commented 11 years ago

From @jandubois

On Mon\, Feb 25\, 2013 at 11​:19 AM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

On 25 February 2013 19​:57\, Jan Dubois \jand@&#8203;activestate\.com wrote​:

On Thu\, Feb 21\, 2013 at 11​:36 PM\, demerphq \demerphq@&#8203;gmail\.com wrote​:

To me\, making sort in scalar context do something that can already be done\, and making sort dangerous at the same time\, is a bad call no matter how convenient it might be for a small set of use cases.

I don't understand the "dangerous" part of your comment\, unless you mean​: "does not do what *I* want it to do".

Well I suppose this is sufficiently a matter of taste that probably I am not going to convince you\, but I think its dangerous because it will not do what I think most beginners would expect. I think most beginners would expect that when starting with​:

sub list_of_keys { return keys %hash; }

if (list_of_keys()) { ... }

that adding "sort" to the return of list_of_keys()​:

sub list_of_keys { return sort keys %hash; }

wouldn't break the code that does if (list_of_keys()) { ... }.

No\, I see what you mean. Except that it doesn't work that way right now. So changing sort() from returning undef to returning anything else doesn't make it more "dangerous"\, it just doesn't make it DWIM that is optimized for return() statements.

Which is one of my concerns with this​: returning the number of elements in the list is pretty useless for the normal cases like

  my $count = sort keys %hash;   my $count = sort @​list;

The "sort" is essentially a no-op\, except for the special case inside the return() statement\, and I generally hate wasting "syntactic space" for something that doesn't provide any value\, except in a narrow special case.

Returning the max/min means that every place that I want to do​:

return sort thingee();

I have to do

return wantarray ? sort thingee() : thingee();

which is just PITA and poorly huffman encoded.

I thought you could also use

  return my @​list = sort thingee();

but I now realize that this will not optimize the sort() when called in scalar context.

The only dangerous part is assuming that a function normally returning a

list will in scalar context return the number of elements in that list.

That's like saying the only dangerous part of a lawnmower is the blades. Perl is supposed to be linguistic\, intuitive\, and properly huffman encoded. From that point alone I've needed to sort the return of a sub gazillions of time.

Whereas I only rarely want to do something like find the minimum of an arbitrary list of values without already having iterated over them. Most time I need that kind of min I have to iterate over the list anyway\, as such I rarely have to reach for Scalar​::Util​::min.

For me it is a wash\, I think I find both situations equally likely.

Beyond that\, both the "return the number" and "return the first" is already trivially implementable\, so to be this is really a case of "six of one\, half a dozen of the other".

Return the first meaning return the min/max? Or return the first item in the list?

Both

One thing slightly more interesting about the first element case is that we already optimize "reverse sort"\, so this would also give an optimized​:

my $max = reverse sort @&#8203;list;

that could work in linear time.

Except that would break the documented interface of reverse\, so no we can't make that work.

Darn. Now this makes the "return the first" option less appealing...

I think a much much more interesting approach for this is to make list slices and list assignment on sort "smart".

IOW I should be able to get an optized solution to something like this​:

my ($x\,$y\,$z)= sort { ... } @​big_array;

and have perl do something clever so that it doesnt sort the full list just find the first N items.

Yes. I suspect doing this just for N=1 covers the vast majority of use cases\, and is a lot easier to implement than any larger N\, where you could no longer have linear performance.

Which btw would allow this​:

my ($x\,$y\,$z)= reverse sort { ... } @​big_array;

Which does what you want and doesn't require we change scalar behavior of reverse\, and has a built in interface for doing more interesting things than "find the min". :-)

I definitely wouldn't want to change the defined reverse() behavior; I was just not thinking about it enough.

So I think I agree with your proposal. :)

Cheers\, -Jan

p5pRT commented 11 years ago

From @davidnicol

On Sun\, Feb 24\, 2013 at 7​:38 AM\, Smylers \Smylers@&#8203;stripey\.com wrote​:

John Imrie writes​:

On 24/02/2013 12​:11\, Dr.Ruud wrote​:

List​::Util​::min returns the lowest numerical value.

Then there's an an alternative to list utils\, which will give you the n first without sorting the whole thing

http​://search.cpan.org/~davidnico/Tie-Quicksort-Lazy-0.04/lib/Tie/Quicksort/Lazy.pm

  for my $item (@​array[1 .. $#array]) {   if (($item->[0] cmp $min->[0] || $min->[1] \<=> $item->[1]) > 0) {   $min = $item;   }   $min

would become

  use Tie​::Quicksort​::Lazy;   tie my @​tmp\, 'Tie​::Quicksort​::Lazy' \, sub { $_[0] cmp $_[1] }\, @​array;   shift @​tmp

Anyway. Everytime there's a new cat we figure out a new way to skin it.

Having once\, on something of a dare\, submitted a patch to CORE​::sort that made sort in scalar return a "sortedness ratio\," I'd prefer that sort in scalar context continue returning undef\, but start issuing a warning. If voting on the the three options presented is what we're up to (and if my vote had any weight\, which is dubious) I'm in favor of option #1\, to return the size of the unsorted input array without doing anything else.

Peace.

-- "You are now leaving the emergency airlock. Thank you for observing all safety precautions."