p5pRT commented 11 years ago

Migrated from rt.perl.org#119635 (status was 'resolved')

Searchable as RT119635$

p5pRT commented 11 years ago

From @nwc10

tl;dr: Nothing current uses C\<use sort 'quicksort';> - should we remove it?

The sort pragma was added by this commit in 5.7.x times:

commit 84d4ea48280f6b54fdc70fe4c8b9494e3331071e Author: Jarkko Hietaniemi \jhi@iki\.fi Date: Wed Nov 21 22:25:14 2001 +0000

Implement the sort pragma. Split sort code from pp_ctl.c to pp_sort.c. Includes the quicksort stabilizing layer from John P. Linderman. -Msort=qsort or -Msort=fast is faster than without (or with -Msort=mergesort or -Msort=safe) for short random inputs\, but for some reason not quite as fast as 5.6.1 qsort. More benchmarking\, profiling\, tuning\, and optimizing definitely needed.

p4raw-id: //depot/perl@13179

Part of the reason\, I think\, was that v5.6 and earlier's sort was a quicksort\, and hence not a stable sort.

By the time v5.8.0 shipped\, the only options offered were 'stable'\, '_mergesort'\, '_quicksort' and the alias '_qsort'\, with the default as mergesort (which is stable).

qw(_qsort) gets you a quicksort\, qw(_qsort stable) gets you a quicksort with a fallback comparison to guarantee stability. (Slower\, but stable)

Of the 28 distributions that mention C\\, most are in documentation:

http://grep.cpan.me/?q=%5Cbuse%5Cs%2Bsort.*%28stable%7Cqsort%7Cquicksort%7Cmergesort%7Csafe%7Cfast%29

All the code is C\<use sort 'stable'> with 3 exceptions.

1) This commented-out code in a regression test:

# uncomment to enable compatibility with perl versions prior to 5.8 #use sort '_quicksort';

2) 3 examples in Module-Pragma-0.02/example/sort.t (which is a copy of the core sort pragma from v5.10.0)

3) 2 uses in Class::Std::Containers\, which fails tests with v5.10.0 and later (and hasn't been updated to fix this)

So\,

1) nothing will break if we removed _qsort.

2) sort.pm's documentation\, from the start\, has stated this:

You can force the choice of algorithm with this pragma\, but this feels heavy-handed\, so the subpragmas beginning with a C\<_> may not persist beyond Perl 5.8.

3) We shave off some source code:

embed.fnc | 3 +- embed.h | 3 +- pp_sort.c | 866 ------------------------------------------------------------- proto.h | 7 +- 4 files changed\, 6 insertions(+)\, 873 deletions(-)

[plus a bit in lib/sort.pm\, which I didn't touch for this prototype]

4) We reduce the size of pp_sort.o by about 3K (or about 0.25% of perl)

but it's not exactly high maintenence code.

So\, should we put _qsort through a deprecation cycle\, and drop it in v5.23?

I'm not actually sure if it's worth it. Do the small gains outweigh the small churn?

Nicholas Clark

p5pRT commented 11 years ago

From @shlomif

Hi Nicholas and all\,

On Fri\, 06 Sep 2013 05:42:18 -0700 Nicholas Clark (via RT) \perlbug\-followup@perl\.org wrote:

but it's not exactly high maintenence code.

So\, should we put _qsort through a deprecation cycle\, and drop it in v5.23?

I'm OK with either way. I never used the "use sort" pragma.

Regards\,

Shlomi Fish

--

Shlomi Fish http://www.shlomifish.org/ Freecell Solver - http://fc-solve.shlomifish.org/

\<petn-randall> Writing your own nirvana may be easier than writing a good blog engine ;) — http://xrl.us/botw86

Please reply to list if it's a mailing list post - http://shlom.in/reply .

p5pRT commented 11 years ago

The RT System itself - Status changed from 'new' to 'open'

p5pRT commented 11 years ago

From @tsee

On 09/06/2013 02:42 PM\, Nicholas Clark (via RT) wrote:

1) nothing will break if we removed _qsort.

... on CPAN.

but it's not exactly high maintenence code.

It's certainly not\, nut pp_sort.c as a whole is moderately complex. Reducing the amount of code there can only be good.

So\, should we put _qsort through a deprecation cycle\, and drop it in v5.23?

I'm not actually sure if it's worth it. Do the small gains outweigh the small churn?

+1 to see it go. I'm trying not to grandly suggest that somebody (not me\, likely not Nicholas) should try implementing timsort for us now instead of the current merge-alike. :)

--Steffen

p5pRT commented 11 years ago

From @jplinderman

Part of the reason\, I think\, was that v5.6 and earlier's sort was a quicksort\, and hence not a stable sort.

Stability was definitely a feature\, but my best argument to Jarkko was that far too many common sequences (organ pipe\, sawtooth\, etc.) went quadratic\, and Peter Mcilroy's merge sort did a particularly nice job on sequences with pre-existing order (linear time on sorted input).

The only sequences (that I could think of\, anyway) where it was clear that qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

My email address in pp_sort.c is shown as jpl@research.att.com. That will stop working in a year or so. Better would be jpl.jpl@gmail.com. Peter is no longer at Lucent\, so his email address\, pmcilroy@lucent.com\, probably stopped working about a decade ago. I don't have a current email address for him.

I have no strong opinion about removing qsort. Seems like a good idea to me.

p5pRT commented 11 years ago

From victor@vsespb.ru

2013/9/6 John P. Linderman \jpl\.jpl@gmail\.com

The only sequences (that I could think of\, anyway) where it was clear that qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

Seems memory usage can be better with qsort than with mergesort (not sure about perl implementation).

http://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only n/2 extra spaces).

http://en.wikipedia.org/wiki/Quicksort

The disadvantage of the simple version above is that it requires O(n) extra storage space\, which is as bad as merge sort\<http://en.wikipedia.org/wiki/Merge_sort>. The additional memory allocations required can also drastically impact speed and cache performance in practical implementations. There is a more complex version which uses an in-place\<http://en.wikipedia.org/wiki/In-place_algorithm>partition algorithm and can achieve the complete sort using O(log n) space (not counting the input) on average (for the call stack\<http://en.wikipedia.org/wiki/Call_stack>). We start with a partition function:

p5pRT commented 11 years ago

From @jplinderman

All true\, all valid observations. The merge sort implementation uses N additional *pointers* (not records)\, the qsort implementation sorts in place (unless you want stability\, in which case it\, too\, uses another set of pointers\, same as the merge sort implementation). I started programming back in the 60's\, when a few kilobytes of memory was all that there was\, and you'd do all sorts of things that would now be considered inexcusable to save a few bytes (like using only the last two digits of the year\, leading to the Y2K "crisis"). I often have to remind myself that memory is now darn near free\, but my time is more expensive every year (at least until AT&T "surplussed" me). If you need to worry about double the number of pointers\, perhaps perl is not the implementation language you should be using. Qsort\, with its divide and conquer approach\, automatically ends up doing a lot of processing in the fastest (i.e. smallest) cache\, which is quite ironic\, since quicksort was invented when memory was monolithic\, and caches didn't exist. But I attempted to fix that flaw in merge sort (at the expense of some additional stack space... it took some discipline to ignore that in the environment I now program in rather than one where I learned to program :-). My attitude now (since you didn't ask :-) is that space matters a whole lot less than time\, and cache hits matter\, because they have a major impact on time. And maintainability matters a lot too\, because it relates to correctness\, which really matters more than just about everything\, which is why I wouldn't much mind qsort disappearing. I hope to live long enough to experience another major paradigm shift (like exploiting multiple processors in parallel).

On Fri\, Sep 6\, 2013 at 3:34 PM\, Victor Efimov \victor@vsespb\.ru wrote:

2013/9/6 John P. Linderman \jpl\.jpl@gmail\.com

The only sequences (that I could think of\, anyway) where it was clear that

qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

Seems memory usage can be better with qsort than with mergesort (not sure about perl implementation).

http://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only *n*/2 extra spaces).

http://en.wikipedia.org/wiki/Quicksort

The disadvantage of the simple version above is that it requires O(*n*) extra storage space\, which is as bad as merge sort\<http://en.wikipedia.org/wiki/Merge_sort>. The additional memory allocations required can also drastically impact speed and cache performance in practical implementations. There is a more complex version which uses an in-place\<http://en.wikipedia.org/wiki/In-place_algorithm>partition algorithm and can achieve the complete sort using O(log *n*) space (not counting the input) on average (for the call stack\<http://en.wikipedia.org/wiki/Call_stack>). We start with a partition function:

p5pRT commented 11 years ago

From @ap

* John P. Linderman \jpl\.jpl@gmail\.com [2013-09-07 00:35]:

I often have to remind myself that memory is now darn near free\, but my time is more expensive every year

Computer’s Perspective on Moore’s Law: Humans are getting more expensive at an exponential rate. —Mark S. Miller

-- Aristotle Pagaltzis // \<http://plasmasturm.org/>

p5pRT commented 11 years ago

From victor@vsespb.ru

I was able to reproduce case when qsort do 3 times less comparison than mergesort.

However I was unable to find case when qsort faster when comparsion function is simple\, obvious this can be the case for complex comparison functions. One also can/should use http://en.wikipedia.org/wiki/Schwartzian_transform anyway if comparison function is always slow\, an this limits benefits of qsort.

== use strict; use warnings;

my @a = (1..3) x 100_000; our $i;

sub mysub { ++$i; $a \<=> $b; }

my ($q\, $m); { use sort '_quicksort'; local $i=0; my @b = sort mysub @a; $q = $i; print "quicksort: $i\n"; }; { local $i=0; my @b = sort mysub @a; $m = $i; print "mergesort: $i\n"; }

print $q/$m\, "\n"; __END_ quicksort: 599997 mergesort: 1746356 0.343570841225958

2013/9/7 John P. Linderman \jpl\.jpl@gmail\.com

All true\, all valid observations. The merge sort implementation uses N additional *pointers* (not records)\, the qsort implementation sorts in place (unless you want stability\, in which case it\, too\, uses another set of pointers\, same as the merge sort implementation). I started programming back in the 60's\, when a few kilobytes of memory was all that there was\, and you'd do all sorts of things that would now be considered inexcusable to save a few bytes (like using only the last two digits of the year\, leading to the Y2K "crisis"). I often have to remind myself that memory is now darn near free\, but my time is more expensive every year (at least until AT&T "surplussed" me). If you need to worry about double the number of pointers\, perhaps perl is not the implementation language you should be using. Qsort\, with its divide and conquer approach\, automatically ends up doing a lot of processing in the fastest (i.e. smallest) cache\, which is quite ironic\, since quicksort was invented when memory was monolithic\, and caches didn't exist. But I attempted to fix that flaw in merge sort (at the expense of some additional stack space... it took some discipline to ignore that in the environment I now program in rather than one where I learned to program :-). My attitude now (since you didn't ask :-) is that space matters a whole lot less than time\, and cache hits matter\, because they have a major impact on time. And maintainability matters a lot too\, because it relates to correctness\, which really matters more than just about everything\, which is why I wouldn't much mind qsort disappearing. I hope to live long enough to experience another major paradigm shift (like exploiting multiple processors in parallel).

On Fri\, Sep 6\, 2013 at 3:34 PM\, Victor Efimov \victor@vsespb\.ru wrote:

2013/9/6 John P. Linderman \jpl\.jpl@gmail\.com

The only sequences (that I could think of\, anyway) where it was clear

that qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

Seems memory usage can be better with qsort than with mergesort (not sure about perl implementation).

http://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only *n*/2 extra spaces).

http://en.wikipedia.org/wiki/Quicksort

The disadvantage of the simple version above is that it requires O(*n*) extra storage space\, which is as bad as merge sort\<http://en.wikipedia.org/wiki/Merge_sort>. The additional memory allocations required can also drastically impact speed and cache performance in practical implementations. There is a more complex version which uses an in-place\<http://en.wikipedia.org/wiki/In-place_algorithm>partition algorithm and can achieve the complete sort using O(log *n*) space (not counting the input) on average (for the call stack\<http://en.wikipedia.org/wiki/Call_stack>). We start with a partition function:

p5pRT commented 11 years ago

From @jplinderman

On Fri\, 06 Sep 2013 18:54:12\, Steffen Mueller \smueller@cpan\.org wrote:

+1 to see it go. I'm trying not to grandly suggest that somebody (not me\, likely not Nicholas) should try implementing timsort for us now instead of the current merge-alike. :)

I hope it's clear who copied whom. If not\, read on.

Anything indented following a "> " is text extracted from

http://svn.python.org/projects/python/trunk/Objects/listsort.txt

Tim Peters' description of "timsort". Let's clear the air right away. I'm not a fan of Tim Peters\, mostly because he acknowledges that one of the important ideas used in "timsort" originated in a paper by Peter Mcilroy:

I first learned about the galloping strategy in a related context; see:
"Adaptive Set Intersections\, Unions\, and Differences" $2000$
Erik D\. Demaine\, Alejandro López\-Ortiz\, J\. Ian Munro
and its followup(s). An earlier paper called the same strategy "exponential search":

"Optimistic Sorting and Information Theoretic Complexity" Peter McIlroy SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms)\, pp 467-474\, Austin\, Texas\, 25-27 January 1993.

and it probably dates back to an earlier paper by Bentley and Yao. The McIlroy paper in particular has good analysis of a mergesort that's probably strongly related to this one in its galloping strategy.

In http://svn.python.org/projects/python/trunk/Objects/listobject.c\, the closest thing I can find to the official source for "timsort"\, Peter's name is never mentioned. Nor does it appear in Wikipedia's entry for "timsort". That may or may not be Tim's fault.

It turns out that Perl is moving to a stable mergesort

Seems that Tim was aware of the sort code Jarkko and I were kicking about. I never attempted to hide the fact that all the good ideas came from Peter's code. Indeed\, we came within a gnat's eyelash of rejecting my implementation of Peter's algorithm because of licensing issues.

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-09/msg00027.html http://www.xray.mpe.mpg.de/cgi-bin/w3glimpse2html/perl5-porters/2000-09/msg00207.html?56#mfs

Anyway\, we have\, at the very start

Intro ----- This describes an adaptive\, stable\, natural mergesort\, modestly called timsort (hey\, I earned it \). It has supernatural performance on many kinds of partially ordered arrays (less than lg(N!) comparisons needed\, and as few as N-1)

\ doesn't cut the mustard. He borrowed some of the best parts of timsort from Peter's paper\, and\, to my taste\, doesn't adequately acknowledge that. OK\, air cleared. Let's move on\, with the understanding that I am not totally objective.

Since "Peter" can too easily be confused with "Peters"\, I'll refer to the sorts as Tim's and Peter's\, with the understanding that "Peter" refers to Peter Mcilroy.

I'm unaware of any algorithms whose performance is "supernatural". The best algorithms I know of are the product of careful analysis. Peter did that in the the Soda paper\, which\, I admit\, I mis-cited in the perl sort code\, but is accurately cited above. He further enhanced the algorithms\, and shared them with me\, and Doug Mcilroy\, at Bell Labs. I contributed only modestly to Peter's ideas (with suggestions like\, "stability is really worthwhile\, and here's how we can achieve it efficiently".) One of the improvements Peter made\, which is in the perl source code\, but not the Soda paper\, was how to detect "natural runs" without wasting a lot of comparisons. That deserves a paragraph or so of its own\, because it differs from the way that (I understand) "timsort" to work. I get my understanding of how "timsort" works from

http://code.google.com/p/timsort/source/browse/trunk/timSort.c?spec=svn17&r=17

(which I believe to have a few bugs\, but is at least comprehensible to someone fluent in C\, but not in python.) The source I cited above is too full of python-stuff to be worth my effort to grok.

So let's start with some simple combinatorics\, and let's assume that no element to be sorted is identical with something else to be sorted\, since that simplifies the analysis without doing too much damage to the truth. With a sequence of two (distinct) elements\, we have two possibilities. (Forgive me if this is dead obvious\, I'd like to make it clear to people to whom it is not dead obvious.)

1\,2 and 2\,1

in which case we either have a strictly increasing sequence or a strictly decreasing sequence (where we have chosen two integers whose relative values to stand in for the way in which two distinct objects compare to each other). So any two distinct values are either in strictly increasing or strictly decreasing order.

Add another (distinct) value\, 3\, to our two original values and we have

1\,2\,3 started increasing\, continued to be 1\,3\,2 started increasing\, then reversed 2\,1\,3 started decreasing\, then reversed 2\,3\,1 started increasing\, then reversed 3\,1\,2 started decreasing\, then reversed 3\,2\,1 started decreasing\, continued to be

In random (distinct) data\, there's only 1 chance in 3 that looking for an extension of the prevailing run will pay off. Not so good. But it gets worse. Of the sequences that have initial runs of 3

1\,2\,3\,4 started increasing\, continued to be 1\,2\,4\,3 started increasing\, then reversed 1\,3\,4\,2 started increasing\, then reversed 2\,3\,4\,1 started increasing\, then reversed 3\,2\,1\,4 started decreasing\, then reversed 4\,2\,1\,3 started decreasing\, then reversed 4\,3\,1\,2 started decreasing\, then reversed 4\,3\,2\,1 started decreasing\, continued to be

Now there's only 2 chances in 8 that the comparison will extend the run. Or\, of the N! ways N distinct elements can appear in a sequence\, only two will yield a run of N. Unless you have some reason to believe the data are "orderly"\, looking for runs longer than 2 is a sucker bet. Tim knows that

... timsort knows darned well it's doing compares that won't pay on random data ...

But\, of course\, if you want to detect runs longer than 2\, you must risk some comparisons that aren't going to pan out. That's what the "Optimistic" part of Peter's paper is about. But the devil is in the details. Timsort appears to bust large sequences into shorter ones of length between 32 and 64\, ostensibly to better balance the merge phase. Then it always tries to extend the initial run of 2\, in ascending or descending order\, to include as many additional elements as possible. If it extends beyond the fixed-length sequence\, it keeps on going\, possibly extending to the very end of the sequence to be sorted. If the run ends before the fixed-length segment is exhausted\, timsort reverses the run if it was strictly decreasing\, then uses a binary insertion sort to put the entire fixed-length sequence in sorted order. Two problems. As we observed (and Tim knows)\, runs as long as 32 to 64 are\, a priori\, spectacularly improbable. Unless there's a lot of structure to the input sequence\, the setup stage is going to end up "wasting" a comparison that doesn't extend a run\, then doing an insertion sort on what's left over. The binary insertion sort holds the number of comparisons to an acceptable level\, but the number of moves needed to insert additional elements into the sorted run can get ugly. Second\, and more important\, runs are detected only if they start at the beginning of a fixed-length segment. A nice long run might start somewhere else in the segment\, but it will be overlooked. In fact\, if you want to lay a world of hurt on timsort\, figure out a number of inputs that will lead to fixed length segments of size\, say\, 64\, take a strictly decreasing sequence of length 64\, move the last element to the front\, and fill up the input with repetitions of this sequence. Each fixed length sequence will start with an increasing sequence of length 2\, but the next (wasted) comparison will encounter a reversal\, so the decreasing sequence of length 62 will be insertion sorted. Insertion sort on decreasing elements is worst case for moves. Each element after the first one (which was\, by construction\, the minimal element) will have to moved each time a new element is added. OK\, 1800-some moves per segment isn't going to take terribly long\, but if the idea behind fixed length segments was to reduce moves in the merge phase\, you have started out in a large hole.

Peter's sort addressed both of these problems. (The run setup phase was not part of his Soda paper\, but it has been in pp_sort.c from the beginning\, because he came up with it while he was working at Bell Labs\, and sharing the source with me.) Rather than trying to extend every run of length 2\, we look at the sense of PTHRESH (defaulting to 8) pairs in a row. If they are all the same\, either increasing or strictly decreasing\, we have fairly compelling evidence of a run. Only then do we "risk" a comparison to see if adjacent runs can be "bridged" into a longer one. If all PTHRESH pairs turn out to form a run\, we have a run of 16\, a fine sign of orderly data. An attempt is made to extend the run\, one element at a time\, until the sense of a comparison is no longer compatible with extending the run (or until we have reached the end of the input). If we discover a reversal\, we keep whatever runs we got by bridging pairs\, and push ahead\, inspecting the sense of adjacent pairs. PTHRESH is a measure of "risk aversion" about "wasting" a comparison. Higher values demand more evidence of a possible long run before taking that risk\, but also overlook any runs shorter than 2*PTHRESH. The merge phase\, as described in the SODA paper (and as implemented by "galloping" in timsort) benefits from having adjacent runs in nearly sorted order\, (although they do not benefit as much as if the runs had previously been combined into a single run). So a conservative PTHRESH seems appropriate. I like to think about PTHRESH this way: in random data\, a pair is equally likely to be in increasing or strictly decreasing order. PTHRESH pairs in a row is like flipping heads (or tails) PTHRESH times in a row. That can happen with a fair coin\, and will happen about once every 2**(PTHRESH-1) times\, but even if we lose\, it cannot happen often enough to push us far from the theoretical limit on comparisons used. Note\, too\, that runs of length 2*PTHRESH or more need not start on any particular boundary. The same sequence that gives timsort heartburn presents no problems to Peter's sort. We get a run of length 2 at the start\, then discover the run of length 63 that follows it (lapping into timsort's next fixed length segment)\, then a bunch of runs of length 64 (each lapping into the next timsort segment)\, ending up with a run of length 63. We've gotten just one more sorted segment to deal with (the first one)\, and we got them with many fewer comparisons and moves. Of course\, one shouldn't use a single pathological input sequence to claim universal superiority\, but Peter's ability to find runs anywhere looks like a winner\, unless balanced merges can be shown to matter more. I await non-supernatural evidence.

I won't go into much detail about the merge phase\, which is really the essence of Peter's SODA paper. When merging two runs\, the basic idea is to determine which run has the greater element. We then find *all* the elements in the other run for which it is also greater\, and add them to the merged run. Now we have found a greater element in the opposite run\, and\, as before\, we use it as a "wedge" in the other array to find *all* the elements in the other array for which it is greater. And so on. The obvious way to "position the wedge" is to compare it to each element in turn\, and stop when we reach a greater one. If the input is random\, long runs are improbable\, much as long runs of adjacent pairs are improbable in the original setup. So if we haven't found a greater element after several comparisons\, we have evidence of order\, and we can justify skipping an element. If we find the following element is still not greater\, we know the element we skipped couldn't have been greater\, either. So we skip 2 elements\, then 4\, then 8\, and so on\, until we either find a greater element\, or we reach the end of the list. We refer to this as "ramping up" in pp_sort.c. If we encounter a greater element before reaching the end of the list\, we need to look back for the first such element. We go back half the size of the last step\, then half that size\, until we encounter an element that is not greater. We call this "ramping down". Then we turn around again and look in the other direction\, reducing the step size by half\, until we finally home in on the least element where we compared greater. Peter refers to this in the SODA paper as "insertion sort with differential search". Timsort refers to it as "galloping". As with taking a chance on extending a run in the initial setup\, this sometimes "wastes" a comparison. For example\, if we skip an element\, and find we are greater than the next element\, we have to check the element we skipped. If we are also greater than that element\, we would have found it immediately if we had just considered every element. So we don't want to start ramping up too soon. In Peter's sort\, this is parameter RTHRESH\, defaulting to 6. We check up to RTHRESH elements in a row before ramping up. That should avoid wasted compares on random data\, but go sub-linear on ordered data. This is a fixed threshold in Peter's sort\, an adjustable value in timsort. As far as I can tell\, this is a religious issue. Should previous behavior be predictive of future behavior? If you believe yes\, you should adjust the threshold up or down. If you think no\, just leave it fixed. Tim seems to agree it doesn't matter much. Pay particular attention to the last sentence.

We can't guess in advance when it's going to win\, though\, so we do one pair at a time until the evidence seems strong that galloping may pay. MIN_GALLOP is 7\, and that's pretty strong evidence. However\, if the data is random\, it simply will trigger galloping mode purely by luck every now and again\, and it's quite likely to hit one of the losing cases next. On the other hand\, in cases like ~sort\, galloping always pays\, and MIN_GALLOP is larger than it "should be" then. So the MergeState struct keeps a min_gallop variable that merge_lo and merge_hi adjust: the longer we stay in galloping mode\, the smaller min_gallop gets\, making it easier to transition back to galloping mode (if we ever leave it in the current merge\, and at the start of the next merge). But whenever the gallop loop doesn't pay\, min_gallop is increased by one\, making it harder to transition back to galloping mode (and again both within a merge and across merges). For random data\, this all but eliminates the gallop penalty: min_gallop grows large enough that we almost never get into galloping mode. And for cases like ~sort\, min_gallop can fall to as low as 1. This seems to work well\, but in all it's a minor improvement over using a fixed MIN_GALLOP value.

When using element w of one run as a "wedge" into the other run\, we cannot just use cmp(w\,b) \<= 0 as a means of determining whether w is less than element b from the other. That's ok if w comes from the earlier run\, and b from the later. But if w comes from the later run\, then\, to preserve stability over equal elements\, we must treat equal elements as though b\, the element from the earlier run\, is strictly less than w. In Peter's sort\, we do this with the simple expedient of setting integer "sense" to 0 if w comes from the earlier run\, and to -1 if it comes from the later run\, then doing all comparisons as

cmp(w\,b) \<= sense

This technique is one of a very few contributions I made as Peter and I worked on the source. I believe that timsort could be greatly simplified if it employed this technique\, rather than having separate (but closely related) routines for galloping left and galloping right or merging left and merging right. Maybe somebody should suggest it to Tim. He could call it timCmp and claim he earned it \. OK\, no more snarky comments.

My other contribution\, done without Peter watching over my shoulder to ensure I wasn't doing something stupid\, was to merge the initial set of runs using a divide-and-conquer approach. The code is extensively commented in pp_sort.c\, I won't repeat it here. It accomplishes two goals. It balances runs early\, while they are still short. And it revisits newly created runs as soon as balance allows\, so it should be more cache-friendly than the original "grand sweep" approach.

I don't pretend to know how much balanced run lengths contribute to performance. A huge run is\, in an important sense\, a celebration of the reduced number of comparisons it took to create it\, N-1 comparisons versus NlgN comparisons. If you blow a few comparisons (or\, much less importantly\, moves\, which are done in line)\, you are still probably well ahead. Maybe some computer science major can compare Peter's sort to Tim's sort. Let the best sort win! My money is on Peter's sort.

Here's a more accurate citation for Peter's sort\, prior to our working together. Thanks\, Tim! Much more useful than mine.

"Optimistic Sorting and Information Theoretic Complexity" Peter McIlroy SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms)\, pp 467-474\, Austin\, Texas\, 25-27 January 1993.

p5pRT commented 11 years ago

From @jplinderman

On Fri\, 06 Sep 2013 18:54:12\, Steffen Mueller \smueller@cpan\.org wrote:

+1 to see it go. I'm trying not to grandly suggest that somebody (not me\, likely not Nicholas) should try implementing timsort for us now instead of the current merge-alike. :)

My recent attempt to copy and paste a commentary on timsort appears to have ended up where the sun don't shine. I'll attach it\, knowing that attachments don't always show up well\, either.

Executive summary: I see no reason to believe that our sort\, based on code written (and analyzed) by Peter Mcilroy\, is not better than timsort. The (long) attachment explains how I came to that conclusion. If someone thinks otherwise\, the world at large would profit from a careful comparison. It won't be me\, either\, doing it (beyond the attached analysis).

p5pRT commented 11 years ago

From @jplinderman

timsort

p5pRT commented 11 years ago

From @nwc10

On Fri\, Sep 06\, 2013 at 02:59:44PM -0400\, John P. Linderman wrote:

My email address in pp_sort.c is shown as jpl@research.att.com. That will stop working in a year or so. Better would be jpl.jpl@gmail.com. Peter is

On Mon\, Sep 09\, 2013 at 10:55:33AM -0400\, John P. Linderman wrote:

Here's a more accurate citation for Peter's sort\, prior to our working together. Thanks\, Tim! Much more useful than mine.

"Optimistic Sorting and Information Theoretic Complexity" Peter McIlroy SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms)\, pp 467-474\, Austin\, Texas\, 25-27 January 1993.

So this is an appropriate change to make to pp_sort.c?

Inline Patch

```diff diff --git a/pp_sort.c b/pp_sort.c index b65e9eb..ab4a6fd 100644 --- a/pp_sort.c +++ b/pp_sort.c @@ -53,9 +53,15 @@ * The original code was written in conjunction with BSD Computer Software * Research Group at University of California, Berkeley. * - * See also: "Optimistic Merge Sort" (SODA '92) + * See also: "Optimistic Sorting and Information Theoretic Complexity" + * Peter McIlroy + * SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms), + * pp 467-474, Austin, Texas, 25-27 January 1993. * - * The integration to Perl is by John P. Linderman . + * The integration to Perl is by John P. Linderman . + * + * John described some more details of the implementation here: + * http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2013-09/msg00345.html * * The code can be distributed under the same terms as Perl itself. * Nicholas Clark ```

p5pRT commented 11 years ago

From @nwc10

On Fri\, Sep 06\, 2013 at 06:54:12PM +0200\, Steffen Mueller wrote:

On 09/06/2013 02:42 PM\, Nicholas Clark (via RT) wrote:

1) nothing will break if we removed _qsort.

... on CPAN.

Good point. Nothing on CPAN will break\, and probably nothing else.

but it's not exactly high maintenence code.

It's certainly not\, nut pp_sort.c as a whole is moderately complex. Reducing the amount of code there can only be good.

That seems to be the principal benefit. And the cost is either zero\, or pretty close. (This is one of those points where Google Code Search would have been useful)

Nicholas Clark

p5pRT commented 11 years ago

From @nwc10

On Fri\, Sep 06\, 2013 at 02:59:44PM -0400\, John P. Linderman wrote:

Part of the reason\, I think\, was that v5.6 and earlier's sort was a quicksort\, and hence not a stable sort.

Stability was definitely a feature\, but my best argument to Jarkko was that far too many common sequences (organ pipe\, sawtooth\, etc.) went quadratic\, and Peter Mcilroy's merge sort did a particularly nice job on sequences with pre-existing order (linear time on sorted input).

The only sequences (that I could think of\, anyway) where it was clear that qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

The bit that keeps bugging me is "how many people know that their data is usually going to be in a particular form that qsort favours"?

In that\, the amount of use cases where it's correct to specify qsort instead of mergesort are small\, the amount of effort needed to determine this is high\, and the gain itself is small?

In which case\, seems that choosing a specific sort algorithm is not the first place to look at optimising.

Nicholas Clark

p5pRT commented 11 years ago

From @jplinderman

If you don't mind\, I'd prefer to leave off that final reference to my timsort analysis. You can't possibly get "more details of the implementation" than pp_sort.c itself\, and\, particularly if I do something about making instability an option\, you might get directed to something out of date. Which might also be true if Tim changes timsort. demerphq is nudging me to create a wikipedia page about perl's sort. *If* I do that\, it would make a better reference\, since it could be kept up to date with current implementations of both.

On Thu\, Sep 12\, 2013 at 8:41 AM\, Nicholas Clark \nick@ccl4\.org wrote:

On Fri\, Sep 06\, 2013 at 02:59:44PM -0400\, John P. Linderman wrote:

My email address in pp_sort.c is shown as jpl@research.att.com. That will stop working in a year or so. Better would be jpl.jpl@gmail.com. Peter is

On Mon\, Sep 09\, 2013 at 10:55:33AM -0400\, John P. Linderman wrote:

Here's a more accurate citation for Peter's sort\, prior to our working together. Thanks\, Tim! Much more useful than mine.

"Optimistic Sorting and Information Theoretic Complexity" Peter McIlroy SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms)\, pp 467-474\, Austin\, Texas\, 25-27 January 1993.

So this is an appropriate change to make to pp_sort.c?

diff --git a/pp_sort.c b/pp_sort.c index b65e9eb..ab4a6fd 100644 --- a/pp_sort.c +++ b/pp_sort.c @@ -53\,9 +53\,15 @@ * The original code was written in conjunction with BSD Computer Software * Research Group at University of California\, Berkeley. * - * See also: "Optimistic Merge Sort" (SODA '92) + * See also: "Optimistic Sorting and Information Theoretic Complexity" + * Peter McIlroy + * SODA (Fourth Annual ACM-SIAM Symposium on Discrete Algorithms)\, + * pp 467-474\, Austin\, Texas\, 25-27 January 1993. * - * The integration to Perl is by John P. Linderman \<jpl@research.att.com

. + * The integration to Perl is by John P. Linderman \jpl\.jpl@gmail\.com. + * + * John described some more details of the implementation here: + * http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2013-09/msg00345.html * * The code can be distributed under the same terms as Perl itself. *

Nicholas Clark

p5pRT commented 11 years ago

From @nwc10

On Thu\, Sep 12\, 2013 at 09:58:28AM -0400\, John P. Linderman wrote:

If you don't mind\, I'd prefer to leave off that final reference to my timsort analysis. You can't possibly get "more details of the implementation" than pp_sort.c itself\, and\, particularly if I do something about making instability an option\, you might get directed to something out of date. Which might also be true if Tim changes timsort. demerphq is nudging me to create a wikipedia page about perl's sort. *If* I do that\, it would make a better reference\, since it could be kept up to date with current implementations of both.

I just culled both lines. If a wikipedia page appears\, we can update pp_sort.c to point to it.

[And then the wikipedia talk folks can debate whether it's notable\, and if they so choose create a 404 error. Until we update the link to point to their history :-)]

Nicholas Clark

p5pRT commented 11 years ago

From @cpansprout

On Thu Sep 12 05:54:17 2013\, nicholas wrote:

On Fri\, Sep 06\, 2013 at 02:59:44PM -0400\, John P. Linderman wrote:

Part of the reason\, I think\, was that v5.6 and earlier's sort was a quicksort\, and hence not a stable sort.

Stability was definitely a feature\, but my best argument to Jarkko was that far too many common sequences (organ pipe\, sawtooth\, etc.) went quadratic\, and Peter Mcilroy's merge sort did a particularly nice job on sequences with pre-existing order (linear time on sorted input).

The only sequences (that I could think of\, anyway) where it was clear that qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

The bit that keeps bugging me is "how many people know that their data is usually going to be in a particular form that qsort favours"?

In that\, the amount of use cases where it's correct to specify qsort instead of mergesort are small\, the amount of effort needed to determine this is high\, and the gain itself is small?

In which case\, seems that choosing a specific sort algorithm is not the first place to look at optimising.

But would it be possible for perl to detect datasets that work better with qsort? Or (most likely) would the time the detection takes outweigh the benefits?

--

Father Chrysostomos

p5pRT commented 11 years ago

From @jplinderman

I'm actually in complete agreement. I just wanted to acknowledge that perl's mergesort was not always better than perl's quicksort on a fairly extensive set of inputs. But if you know you are dealing with an input of the class I mentioned\, you'd be better off using a hash than with either sort. In fact\, I am so much in agreement about choosing algorithms being less meaningful than choosing that I gave the algorithm-choosing subpragmas funny names\, starting with an underscore\, and warned that they might not persist beyond 5.8. They had better longevity than I would have predicted. Obviously\, if qsort is deprecated\, so\, too\, must be the funny subpragmas.

A line in Peter's SODA paper has always intrigued me

Versus the fastest readily available qsort\, [a citation here to Jon Bentley's and Doug Mcilroy's "Engineering a Sort Function" paper\, which had been submitted\, but not yet accepted\, for publication] the hybrid [Peter's mergesort] is 25% slower on random integers; for random strings\, time differences are negligible.

Qsort is constrained to elements of fixed size\, and the elements themselves are moved about. As partitions get smaller and smaller\, the elements are brought closer and closer together\, "finding" faster and faster cache memory\, if it exists. Mergesort sorts pointers to elements; the elements themselves are never moved. Although the pointers will be packed together\, enjoying some of the benefits of whatever cache memory may exist\, the elements remain scattered\, probably leading to a cache miss when referenced. And the qsort comparison routine for comparing integers is very short and efficient\, so saving compares may not be so important. I figured maybe what Peter reported was a consequence of cache memory and simple comparisons. But if that were true for integers\, why wasn't it also true for strings? String comparison is more complicated than integer comparison. Saving comparisons becomes more important. Or maybe because integers are short\, and easy to move about\, and strings are (probably\, it's not stated) longer\, the extra time spent moving data counterbalances the advantages of cache memory. In any event\, the observed behavior had plausible explanations.

But now I wonder if there may not have been another contributing cause. Qsort\, unconstrained by stability\, is free to ignore the position of equal integers. Stable mergesort is not. As I mentioned elsewhere\, this can handicap mergesort when the input has many repeated items. If the "random integers" were selected from a pool significantly smaller than the total number of elements (we don't know)\, qsort would have another advantage. But if (lots of "ifs" piling up here) that factors into the performance difference Peter mentioned\, and the performance difference alluded to in pp_sort.c

The original merge sort\, in use since 5.7\, was as fast as\, or faster than\, qsort on many platforms\, but slower than qsort\, conspicuously so\, on others.

then the ability to ignore stability will take away that advantage of quicksort\, another reason why quicksort might be deprecated without much pain. Since I got lots of +1s for taking a look at ignoring stability (even though most of them came from one person :-)\, I'll do that. It looks like the changes to the merge phase will be quite simple\, but changes to the setup of initial runs will be anything but.

On Thu\, Sep 12\, 2013 at 8:53 AM\, Nicholas Clark \nick@ccl4\.org wrote:

On Fri\, Sep 06\, 2013 at 02:59:44PM -0400\, John P. Linderman wrote:

Part of the reason\, I think\, was that v5.6 and earlier's sort was a quicksort\, and hence not a stable sort.

Stability was definitely a feature\, but my best argument to Jarkko was that far too many common sequences (organ pipe\, sawtooth\, etc.) went quadratic\, and Peter Mcilroy's merge sort did a particularly nice job on sequences with pre-existing order (linear time on sorted input).

The only sequences (that I could think of\, anyway) where it was clear that qsort would outperform merge sort were those containing many repetitions of a small number (k) of keys. Qsort's "fat pivot" guaranteed that only k passes would be required\, where merge sort might need log N.

The bit that keeps bugging me is "how many people know that their data is usually going to be in a particular form that qsort favours"?

In that\, the amount of use cases where it's correct to specify qsort instead of mergesort are small\, the amount of effort needed to determine this is high\, and the gain itself is small?

In which case\, seems that choosing a specific sort algorithm is not the first place to look at optimising.

Nicholas Clark

p5pRT commented 6 years ago

From zefram@fysh.org

Discussion on this petered out\, having mostly got sidetracked into timsort. A couple of people were in favour of removing qsort and the algorithm-specific pragmata\, and no one opposed. Adding my opinion\, I too think we should remove these things. Pragmata to specify whether stability is required make sense; choosing specific algorithms does not. I also think\, given the underscore spelling and the documented status of thesse subpragmata\, that we do not need a deprecation cycle to remove them.

If no one objects\, I intend to go ahead with removal.

-zefram

p5pRT commented 6 years ago

From @jplinderman

Zefram wrote:

Date: Mon\, 13 Nov 2017 02:44:28 +0000 Subject: Re: [perl #119635] deprecate and remove qsort? Discussion on this petered out\, having mostly got sidetracked into timsort. A couple of people were in favour of removing qsort and the algorithm-specific pragmata\, and no one opposed. Adding my opinion\, I too think we should remove these things. Pragmata to specify whether stability is required make sense; choosing specific algorithms does not. I also think\, given the underscore spelling and the documented status of thesse subpragmata\, that we do not need a deprecation cycle to remove them.

If no one objects\, I intend to go ahead with removal.

I've been among those proposing that qsort may be taking more code-space than it's worth (and it was I who put the underscores in the subpragmata). I'll just point out that there are classes of inputs for which qsort may well outperform mergesort. If there are a small number of different sort keys\, like two-letter state abbreviations in the US\, in a large collection of records\, qsort will need something like log(#-of-different-keys) rather than log(#-of-records) passes. And it sorts in-place\, reducing storage requirements. Both of these advantages disappear if you demand both qsort and stability. So that combination almost surely should die. I was planning to replace the existing qsort code with the Bentley/McIlroy algorithm. \https://algs4.cs.princeton.edu/references/papers/bentley-mcilroy.pdf But I'm equally happy to scrap qsort altogether.

p5pRT commented 6 years ago

From @jplinderman

A couple days ago\, I sent a reply to Zefram's proposal to remove qsort. I haven't seen my reply in the porter's digest. Maybe I just missed it\, so I won't repeat the entire post. The important part was that there are classes of inputs for which qsort will outperform mergesort. If the number of distinct keys\, k\, is much smaller than the number of records\, r\, qsort will only need log(k) passes to mergesort's log(r). For example\, there are only 50-some states and territories in the US\, so sorting by state/territory will benefit from qsort\, 6 or 7 passes regardless of the number of records. Perhaps what is needed is a better subpragmata\, maybe something like "few_keys" rather than "_qsort"?

p5pRT commented 6 years ago

From @xsawyerx

On Sun\, 12 Nov 2017 18:44:44 -0800\, zefram@fysh.org wrote:

Discussion on this petered out\, having mostly got sidetracked into timsort. A couple of people were in favour of removing qsort and the algorithm-specific pragmata\, and no one opposed. Adding my opinion\, I too think we should remove these things. Pragmata to specify whether stability is required make sense; choosing specific algorithms does not. I also think\, given the underscore spelling and the documented status of thesse subpragmata\, that we do not need a deprecation cycle to remove them.

If no one objects\, I intend to go ahead with removal.

Agreed.

p5pRT commented 6 years ago

From zefram@fysh.org

Done in commit e2091bb6ea87111c32936c9170405a44995be338.

-zefram

p5pRT commented 6 years ago

@xsawyerx - Status changed from 'open' to 'pending release'

p5pRT commented 6 years ago

From @avar

On Thu\, Nov 16 2017\, Sawyer X. via jotted:

On Sun\, 12 Nov 2017 18:44:44 -0800\, zefram@fysh.org wrote:

Discussion on this petered out\, having mostly got sidetracked into timsort. A couple of people were in favour of removing qsort and the algorithm-specific pragmata\, and no one opposed. Adding my opinion\, I too think we should remove these things. Pragmata to specify whether stability is required make sense; choosing specific algorithms does not. I also think\, given the underscore spelling and the documented status of thesse subpragmata\, that we do not need a deprecation cycle to remove them.

If no one objects\, I intend to go ahead with removal.

Agreed.

All in all I don't mind that this got removed like this\, but I thought I'd point out that in the general case I think this sort of a removal is a bit cavalier given our current deprecation policy.

I.e.:

1. Obscure feature is discovered 2. There's grep of the CPAN discovering no uses for it 3. It's removed in one cycle\, and existing uses for it turn into an error if people upgrade Perl.

There's a big DarkPAN out there. E.g. Sawyer and I both work for a well-known perl company with a big codebase whose main money-making app won't compile with 5.28 because of this\, and that's some new code (C\<use sort '_quicksort'>) added in April of this year.

(Now\, in that particular case I have no idea why _quicksort was used\, and it can probably just be removed\, I've contacted the author).

I wonder if we should at least do this for now:

diff --git a/lib/sort.pm b/lib/sort.pm index 659f3e4f4d..c48e41cf90 100644 --- a/lib/sort.pm +++ b/lib/sort.pm @@ -24\,6 +24\,8 @@ sub import { $^H{sort} &= ~$sort::unstable_bit; } elsif ($_ eq 'defaults') { $^H{sort} = 0; + } elsif (/^_q(?:uick)?sort$/ or $_ eq '_mergesort') { + warn "sort: the $_ subpragma is a no-op since 5.28!"; } else { require Carp; Carp::croak("sort: unknown subpragma '$_'"); @@ -43\,6 +45\,8 @@ sub unimport { if ($_ eq 'stable') { $^H{sort} &= ~$sort::stable_bit; $^H{sort} |= $sort::unstable_bit; + } elsif (/^_q(?:uick)?sort$/ or $_ eq '_mergesort') { + warn "sort: the $_ subpragma is a no-op since 5.28!"; } else { require Carp; Carp::croak("sort: unknown subpragma '$_'");

Or maybe it really is better if this dies at compile-time.

FWIW I initially ran into this because I was hacking some pod docs that had to do with references to very old version (5.[68])\, hadn't rebased in a while\, and had this commit ready to push before I ran into a conflict with Zefram:

commit 1bb8507c8f Author: Ævar Arnfjörð Bjarmason \avar@cpan\.org Date: Wed Dec 6 13:26:32 2017 +0000

perlfunc: shorten overly verbose & out of date mention of sort.pm

This paragraph saying a 5.8 feature "may persist" in future versions is obviously wildly out of date as we have it now almost 16 years later. All of this is covered in the sort.pm documentation\, it's enough to just briefly mention that perl uses a stable mergesort\, and point to the other docs in case someone wants to use quicksort instead. --- pod/perlfunc.pod | 18 +++++------------- 1 file changed\, 5 insertions(+)\, 13 deletions(-)

diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod index b7e001fcc8..1d117994c1 100644 --- a/pod/perlfunc.pod +++ b/pod/perlfunc.pod @@ -7307\,19 +7307\,11 @@ L\<C\|/grep BLOCK LIST>) actually modifies the element in the original list. This is usually something to be avoided when writing clear code.

-Perl 5.6 and earlier used a quicksort algorithm to implement sort. -That algorithm was not stable and I\ go quadratic. (A I\ sort -preserves the input order of elements that compare equal. Although -quicksort's run time is O(NlogN) when averaged over all arrays of -length N\, the time can be O(N**2)\, I\ behavior\, for some -inputs.) In 5.7\, the quicksort implementation was replaced with -a stable mergesort algorithm whose worst-case behavior is O(NlogN). -But benchmarks indicated that for some inputs\, on some platforms\, -the original quicksort was faster. 5.8 has a L\ pragma for -limited control of the sort. Its rather blunt control of the -underlying algorithm may not persist into future Perls\, but the -ability to characterize the input or output in implementation -independent ways quite probably will. +Perl uses a I\ mergesort algorithm whose worst-case behavior is +O(NlogN). Benchmarks indicate that for some inputs\, on some platforms\, +quicksort can be faster\, although it can have much worse performance +for pathological inputs. See the L\ module for how the sort +algorithm can be tweaked with L\<a pragma|perlpragma>.

I.e. my reading of these docs was that the 5.8 reference was simply some now-obsolete way of saying "we have this new feature now"\, not that the docs had serious about deprecating this\, after RT #119635 to remove it didn't even get filed until 5.18 had been released!

p5pRT commented 6 years ago

From zefram@fysh.org

Avar Arnfjorth Bjarmason wrote:

I wonder if we should at least do this for now: ... + } elsif (/^_q(?:uick)?sort$/ or $_ eq '_mergesort') { + warn "sort: the $_ subpragma is a no-op since 5.28!";

There's no real need for that: the error that it'll produce leads one straight to the problem\, and the change required to get it running again is very straightforward. But if we were to make these subpragmata legal again\, it shouldn't just be in the open-ended way that you suggest. We don't want to have that baggage around forever. We should deprecate them\, with a defined deletion date\, and on the deletion date they should become illegal (as they are now in blead). The immediate implication is that\, rather than an unconditional warning of the type you show\, they should elicit a categorised deprecation warning that advertises the deletion date.

-zefram

p5pRT commented 6 years ago

From @jplinderman

I'll admit that I was a bit disappointed that my response to Zefram's proposal to pull the plug on qsort somehow never appeared in the digest\, and that subsequent comments concerning qsort appeared in the digest\, but the full text of those comments didn't appear in my version of the digest\, so I couldn't reply to them either. It doesn't matter much now\, but I wonder if other porters see similar problems.

I spoke with Jon Bentley last week. He confirmed my observation that if N records have only K distinct keys\, qsort can be made to behave like NlogK rather than NlogN. If you are sorting on something like month (K==12) or states in the US (K==50)\, that can be a significant improvement when N is large. After a bit of thought\, I realized that qsort can be stabilized by allocating another copy of the array pointers *without* treating all keys as being distinct (as the current stabilized implementation\, of which I am guilty\, does\, losing the logK vs logN advantage).

It would be a shame to deny users the performance gains that qsort makes possible in certain circumstances. But the subpragmata for achieving those gains (guilty again) were butt-ugly. Users shouldn't be mentioning algorithms\, the details of which are subject to change anyway. On the other hand\, it is entirely reasonable that users be able to express behavior\, like stability\, which may or may not be important to them. We already have subpragmata for stability. Users should be able to describe properties of the data of which they may be aware (and which the sort code cannot determine)\, like estimates of the number of distinct keys. Something logarithmic like 10KEYS\, 100KEYS\, 1000KEYS etc could express estimates. If we want to do the best we can for the users\, the discussion should be about good\, subpragmatic\, ways to express qualities of the data\, and let the sort code figure out the best available algorithms.

On Wed\, Dec 6\, 2017 at 9:23 AM\, Ævar Arnfjörð Bjarmason \avarab@gmail\.com wrote:

On Thu\, Nov 16 2017\, Sawyer X. via jotted:

On Sun\, 12 Nov 2017 18:44:44 -0800\, zefram@fysh.org wrote:

Discussion on this petered out\, having mostly got sidetracked into timsort. A couple of people were in favour of removing qsort and the algorithm-specific pragmata\, and no one opposed. Adding my opinion\, I too think we should remove these things. Pragmata to specify whether stability is required make sense; choosing specific algorithms does not. I also think\, given the underscore spelling and the documented status of thesse subpragmata\, that we do not need a deprecation cycle to remove them.

If no one objects\, I intend to go ahead with removal.

Agreed.

All in all I don't mind that this got removed like this\, but I thought I'd point out that in the general case I think this sort of a removal is a bit cavalier given our current deprecation policy.

I.e.:

1. Obscure feature is discovered 2. There's grep of the CPAN discovering no uses for it 3. It's removed in one cycle\, and existing uses for it turn into an error if people upgrade Perl.

There's a big DarkPAN out there. E.g. Sawyer and I both work for a well-known perl company with a big codebase whose main money-making app won't compile with 5.28 because of this\, and that's some new code (C\<use sort '_quicksort'>) added in April of this year.

(Now\, in that particular case I have no idea why _quicksort was used\, and it can probably just be removed\, I've contacted the author).

I wonder if we should at least do this for now:
diff \-\-git a/lib/sort\.pm b/lib/sort\.pm
index 659f3e4f4d\.\.c48e41cf90 100644
\-\-\- a/lib/sort\.pm
\+\+\+ b/lib/sort\.pm
@&#8203;@&#8203; \-24\,6 \+24\,8 @&#8203;@&#8203; sub import \{
            $^H\{sort\} &= ~$sort&#8203;::unstable\_bit;
        \} elsif $$\_ eq 'defaults'$ \{
            $^H\{sort\} =   0;
\+       \} elsif $/^\_q\(?&#8203;:uick$?sort$/ or $\_ eq '\_mergesort'\) \{
\+           warn "sort&#8203;: the $\_ subpragma is a no\-op since 5\.28\!";
        \} else \{
            require Carp;
            Carp&#8203;::croak$"sort&#8203;: unknown subpragma '$\_'"$;
@&#8203;@&#8203; \-43\,6 \+45\,8 @&#8203;@&#8203; sub unimport \{
        if $$\_ eq 'stable'$ \{
            $^H\{sort\} &= ~$sort&#8203;::stable\_bit;
            $^H\{sort\} |=  $sort&#8203;::unstable\_bit;
\+       \} elsif $/^\_q\(?&#8203;:uick$?sort$/ or $\_ eq '\_mergesort'\) \{
\+           warn "sort&#8203;: the $\_ subpragma is a no\-op since 5\.28\!";
        \} else \{
            require Carp;
            Carp&#8203;::croak$"sort&#8203;: unknown subpragma '$\_'"$;
Or maybe it really is better if this dies at compile-time.

FWIW I initially ran into this because I was hacking some pod docs that had to do with references to very old version (5.[68])\, hadn't rebased in a while\, and had this commit ready to push before I ran into a conflict with Zefram:
commit 1bb8507c8f
Author&#8203;: Ævar Arnfjörð Bjarmason \<avar@&#8203;cpan\.org>
Date&#8203;:   Wed Dec 6 13&#8203;:26&#8203;:32 2017 \+0000

    perlfunc&#8203;: shorten overly verbose & out of date mention of sort\.pm

    This paragraph saying a 5\.8 feature "may persist" in future
versions is obviously wildly out of date as we have it now almost 16 years later. All of this is covered in the sort.pm documentation\, it's enough to just briefly mention that perl uses a stable mergesort\, and point to the other docs in case someone wants to use quicksort instead. --- pod/perlfunc.pod | 18 +++++------------- 1 file changed\, 5 insertions(+)\, 13 deletions(-)
diff \-\-git a/pod/perlfunc\.pod b/pod/perlfunc\.pod
index b7e001fcc8\.\.1d117994c1 100644
\-\-\- a/pod/perlfunc\.pod
\+\+\+ b/pod/perlfunc\.pod
@&#8203;@&#8203; \-7307\,19 \+7307\,11 @&#8203;@&#8203; L\<C\<grep>|/grep BLOCK LIST>\)
 actually modifies the element in the original list\.  This is usually
 something to be avoided when writing clear code\.

\-Perl 5\.6 and earlier used a quicksort algorithm to implement sort\.
\-That algorithm was not stable and I\<could> go quadratic\.  \(A
I\ sort -preserves the input order of elements that compare equal. Although -quicksort's run time is O(NlogN) when averaged over all arrays of -length N\, the time can be O(N**2)\, I\ behavior\, for some -inputs.) In 5.7\, the quicksort implementation was replaced with -a stable mergesort algorithm whose worst-case behavior is O(NlogN). -But benchmarks indicated that for some inputs\, on some platforms\, -the original quicksort was faster. 5.8 has a L\ pragma for -limited control of the sort. Its rather blunt control of the -underlying algorithm may not persist into future Perls\, but the -ability to characterize the input or output in implementation -independent ways quite probably will. +Perl uses a I\ mergesort algorithm whose worst-case behavior is +O(NlogN). Benchmarks indicate that for some inputs\, on some platforms\, +quicksort can be faster\, although it can have much worse performance +for pathological inputs. See the L\ module for how the sort +algorithm can be tweaked with L\<a pragma|perlpragma>.

I.e. my reading of these docs was that the 5.8 reference was simply some now-obsolete way of saying "we have this new feature now"\, not that the docs had serious about deprecating this\, after RT #119635 to remove it didn't even get filed until 5.18 had been released!

p5pRT commented 6 years ago

From @jplinderman

ARRGH! My response again disappeared from the digest I just received. Here's how my copy ends\, omitting my reply\, whether of not I click on the View entire message link. Massively annoying!

I.e. my reading of these docs was that the 5.8 reference was simply some now-obsolete way of saying "we have this new feature now"\, not that the docs had serious about deprecating this\, after RT #119635 to remove it didn't even get filed until 5.18 had been released!

...

[Message clipped] View entire message \https://mail.google.com/mail/u/0/?ui=2&ik=96a8b2a39a&view=lg&msg=1602c7aad545e316

On Wed\, Dec 6\, 2017 at 10:37 AM\, John P. Linderman \jpl\.jpl@gmail\.com wrote:

I'll admit that I was a bit disappointed that my response to Zefram's proposal to pull the plug on qsort somehow never appeared in the digest\, and that subsequent comments concerning qsort appeared in the digest\, but the full text of those comments didn't appear in my version of the digest\, so I couldn't reply to them either. It doesn't matter much now\, but I wonder if other porters see similar problems.

I spoke with Jon Bentley last week. He confirmed my observation that if N records have only K distinct keys\, qsort can be made to behave like NlogK rather than NlogN. If you are sorting on something like month (K==12) or states in the US (K==50)\, that can be a significant improvement when N is large. After a bit of thought\, I realized that qsort can be stabilized by allocating another copy of the array pointers *without* treating all keys as being distinct (as the current stabilized implementation\, of which I am guilty\, does\, losing the logK vs logN advantage).

It would be a shame to deny users the performance gains that qsort makes possible in certain circumstances. But the subpragmata for achieving those gains (guilty again) were butt-ugly. Users shouldn't be mentioning algorithms\, the details of which are subject to change anyway. On the other hand\, it is entirely reasonable that users be able to express behavior\, like stability\, which may or may not be important to them. We already have subpragmata for stability. Users should be able to describe properties of the data of which they may be aware (and which the sort code cannot determine)\, like estimates of the number of distinct keys. Something logarithmic like 10KEYS\, 100KEYS\, 1000KEYS etc could express estimates. If we want to do the best we can for the users\, the discussion should be about good\, subpragmatic\, ways to express qualities of the data\, and let the sort code figure out the best available algorithms.

On Wed\, Dec 6\, 2017 at 9:23 AM\, Ævar Arnfjörð Bjarmason \avarab@gmail\.com wrote:
On Thu\, Nov 16 2017\, Sawyer X. via jotted:

On Sun\, 12 Nov 2017 18:44:44 -0800\, zefram@fysh.org wrote:

Discussion on this petered out\, having mostly got sidetracked into timsort. A couple of people were in favour of removing qsort and the algorithm-specific pragmata\, and no one opposed. Adding my opinion\, I too think we should remove these things. Pragmata to specify whether stability is required make sense; choosing specific algorithms does not. I also think\, given the underscore spelling and the documented status of thesse subpragmata\, that we do not need a deprecation cycle to remove them.

If no one objects\, I intend to go ahead with removal.

Agreed.

All in all I don't mind that this got removed like this\, but I thought I'd point out that in the general case I think this sort of a removal is a bit cavalier given our current deprecation policy.

I.e.:

1. Obscure feature is discovered 2. There's grep of the CPAN discovering no uses for it 3. It's removed in one cycle\, and existing uses for it turn into an error if people upgrade Perl.

There's a big DarkPAN out there. E.g. Sawyer and I both work for a well-known perl company with a big codebase whose main money-making app won't compile with 5.28 because of this\, and that's some new code (C\<use sort '_quicksort'>) added in April of this year.

(Now\, in that particular case I have no idea why _quicksort was used\, and it can probably just be removed\, I've contacted the author).

I wonder if we should at least do this for now:
diff \-\-git a/lib/sort\.pm b/lib/sort\.pm
index 659f3e4f4d\.\.c48e41cf90 100644
\-\-\- a/lib/sort\.pm
\+\+\+ b/lib/sort\.pm
@&#8203;@&#8203; \-24\,6 \+24\,8 @&#8203;@&#8203; sub import \{
            $^H\{sort\} &= ~$sort&#8203;::unstable\_bit;
        \} elsif $$\_ eq 'defaults'$ \{
            $^H\{sort\} =   0;
\+       \} elsif $/^\_q\(?&#8203;:uick$?sort$/ or $\_ eq '\_mergesort'\) \{
\+           warn "sort&#8203;: the $\_ subpragma is a no\-op since 5\.28\!";
        \} else \{
            require Carp;
            Carp&#8203;::croak$"sort&#8203;: unknown subpragma '$\_'"$;
@&#8203;@&#8203; \-43\,6 \+45\,8 @&#8203;@&#8203; sub unimport \{
        if $$\_ eq 'stable'$ \{
            $^H\{sort\} &= ~$sort&#8203;::stable\_bit;
            $^H\{sort\} |=  $sort&#8203;::unstable\_bit;
\+       \} elsif $/^\_q\(?&#8203;:uick$?sort$/ or $\_ eq '\_mergesort'\) \{
\+           warn "sort&#8203;: the $\_ subpragma is a no\-op since 5\.28\!";
        \} else \{
            require Carp;
            Carp&#8203;::croak$"sort&#8203;: unknown subpragma '$\_'"$;
Or maybe it really is better if this dies at compile-time.

FWIW I initially ran into this because I was hacking some pod docs that had to do with references to very old version (5.[68])\, hadn't rebased in a while\, and had this commit ready to push before I ran into a conflict with Zefram:
commit 1bb8507c8f
Author&#8203;: Ævar Arnfjörð Bjarmason \<avar@&#8203;cpan\.org>
Date&#8203;:   Wed Dec 6 13&#8203;:26&#8203;:32 2017 \+0000

    perlfunc&#8203;: shorten overly verbose & out of date mention of sort\.pm

    This paragraph saying a 5\.8 feature "may persist" in future
versions is obviously wildly out of date as we have it now almost 16 years later. All of this is covered in the sort.pm documentation\, it's enough to just briefly mention that perl uses a stable mergesort\, and point to the other docs in case someone wants to use quicksort instead. --- pod/perlfunc.pod | 18 +++++------------- 1 file changed\, 5 insertions(+)\, 13 deletions(-)
diff \-\-git a/pod/perlfunc\.pod b/pod/perlfunc\.pod
index b7e001fcc8\.\.1d117994c1 100644
\-\-\- a/pod/perlfunc\.pod
\+\+\+ b/pod/perlfunc\.pod
@&#8203;@&#8203; \-7307\,19 \+7307\,11 @&#8203;@&#8203; L\<C\<grep>|/grep BLOCK LIST>\)
 actually modifies the element in the original list\.  This is usually
 something to be avoided when writing clear code\.

\-Perl 5\.6 and earlier used a quicksort algorithm to implement sort\.
\-That algorithm was not stable and I\<could> go quadratic\.  \(A
I\ sort -preserves the input order of elements that compare equal. Although -quicksort's run time is O(NlogN) when averaged over all arrays of -length N\, the time can be O(N**2)\, I\ behavior\, for some -inputs.) In 5.7\, the quicksort implementation was replaced with -a stable mergesort algorithm whose worst-case behavior is O(NlogN). -But benchmarks indicated that for some inputs\, on some platforms\, -the original quicksort was faster. 5.8 has a L\ pragma for -limited control of the sort. Its rather blunt control of the -underlying algorithm may not persist into future Perls\, but the -ability to characterize the input or output in implementation -independent ways quite probably will. +Perl uses a I\ mergesort algorithm whose worst-case behavior is +O(NlogN). Benchmarks indicate that for some inputs\, on some platforms\, +quicksort can be faster\, although it can have much worse performance +for pathological inputs. See the L\ module for how the sort +algorithm can be tweaked with L\<a pragma|perlpragma>.

I.e. my reading of these docs was that the 5.8 reference was simply some now-obsolete way of saying "we have this new feature now"\, not that the docs had serious about deprecating this\, after RT #119635 to remove it didn't even get filed until 5.18 had been released!

p5pRT commented 6 years ago

From @xsawyerx

On 12/06/2017 04:51 PM\, Zefram wrote:

Avar Arnfjorth Bjarmason wrote:

I wonder if we should at least do this for now: ... + } elsif (/^_q(?:uick)?sort$/ or $_ eq '_mergesort') { + warn "sort: the $_ subpragma is a no-op since 5.28!"; There's no real need for that: the error that it'll produce leads one straight to the problem\, and the change required to get it running again is very straightforward. But if we were to make these subpragmata legal again\, it shouldn't just be in the open-ended way that you suggest. We don't want to have that baggage around forever. We should deprecate them\, with a defined deletion date\, and on the deletion date they should become illegal (as they are now in blead). The immediate implication is that\, rather than an unconditional warning of the type you show\, they should elicit a categorised deprecation warning that advertises the deletion date.

Is this a price too high to pay for deprecating this over time?

In other words\, why avoid deprecating this cleanly as we usually do?

p5pRT commented 6 years ago

From zefram@fysh.org

Sawyer X wrote:

In other words\, why avoid deprecating this cleanly as we usually do?

Because it's never been a fully supported feature. It has never incurred the compatibility obligations that stable features do. With that constraint not applying\, we get a maintainability win from being rid of the code quickly.

-zefram

p5pRT commented 6 years ago

From @xsawyerx

On 12/18/2017 01:18 PM\, Zefram wrote:

Sawyer X wrote:

In other words\, why avoid deprecating this cleanly as we usually do? Because it's never been a fully supported feature. It has never incurred the compatibility obligations that stable features do.

Good reason.

p5pRT commented 6 years ago

From @jplinderman

> On 12/18/2017 01:18 PM\, Zefram wrote: >

Sawyer X wrote: >

In other words\, why avoid deprecating this cleanly as we usually do? > Because it's never been a fully supported feature. It has never incurred the compatibility obligations that stable features do.

Long term\, the _qsort subpragma surely deserves to die. It asks the users to understand the innards of pp_sort.c\, which they almost certainly won't\, and which are\, in any event\, subject to change. But I have long been puzzled by the observation in perldoc -f sort that

... benchmarks indicated that for some inputs\, on some platforms\, the original quicksort was faster.

I'm now convinced that this had much less to do with platform than with the inputs\, and that

...the ability to characterize the input or output in implementation independent ways

is a worthy goal for the sort pragma. Many of us would consider a 10% speedup in something as common as sorting as worth some maintainability loss\, and I believe there will be inputs for which quicksort will be multiple times faster than mergesort. I'm trying to instrument a version of pp_sort.c where quicksort is still present to replace "I believe" with "I can show". In a similar vein\, "I believe" that the Bentley/McIlroy quicksort

https://algs4.cs.princeton.edu/references/papers/bentley-mcilroy.pdf

with its sophisticated selection of pivot elements\, will do an even better job on such inputs. My current goal is to instrument a version of pp_sort.c with *both* quicksort implementations\, and with mergesort implementations with and without stability enforced\, and then beat the daylights out all options to prove correctness and get an objective measure of performance. I'll then strip out the losing implementations\, and try to integrate it with the evolving pp_sort.c. If it's decided to discard whichever quicksort algorithm wins out\, that should be easy. If it looks like it's worth keeping\, we can discuss how "characterize the input" in ways that allow pp_sort.c to select the appropriate algorithm.

p5pRT commented 6 years ago

From @khwilliamson

Thank you for filing this report. You have helped make Perl better.

With the release yesterday of Perl 5.28.0\, this and 185 other issues have been resolved.

Perl 5.28.0 may be downloaded via: https://metacpan.org/release/XSAWYERX/perl-5.28.0

If you find that the problem persists\, feel free to reopen this ticket.

p5pRT commented 6 years ago

@khwilliamson - Status changed from 'pending release' to 'resolved'

Perl / perl5

deprecate and remove qsort? #13234

From @nwc10

From @shlomif

From @tsee

From @jplinderman

From victor@vsespb.ru

http​://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only *n*/2 extra spaces).

http​://en.wikipedia.org/wiki/Quicksort

From @jplinderman

http​://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only *n*/2 extra spaces).

http​://en.wikipedia.org/wiki/Quicksort

From @ap

From victor@vsespb.ru

print $q/$m\, "\n"; __END_ quicksort​: 599997 mergesort​: 1746356 0.343570841225958

http​://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only *n*/2 extra spaces).

http​://en.wikipedia.org/wiki/Quicksort

From @jplinderman

From @jplinderman

From @jplinderman

From @nwc10

From @nwc10

From @nwc10

From @jplinderman

From @nwc10

From @cpansprout

From @jplinderman

From zefram@fysh.org

From @jplinderman

From @jplinderman

From @xsawyerx

From zefram@fysh.org

From @avar

From zefram@fysh.org

From @jplinderman

From @jplinderman

From @xsawyerx

From zefram@fysh.org

From @xsawyerx

From @jplinderman

From @khwilliamson

http://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only n/2 extra spaces).

http://en.wikipedia.org/wiki/Quicksort

http://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only n/2 extra spaces).

http://en.wikipedia.org/wiki/Quicksort

print $q/$m\, "\n"; __END_ quicksort: 599997 mergesort: 1746356 0.343570841225958

http://en.wikipedia.org/wiki/Merge_sort

Merge sort's most common implementation does not sort in place; therefore\, the memory size of the input must be allocated for the sorted output to be stored in (see below for versions that need only n/2 extra spaces).

http://en.wikipedia.org/wiki/Quicksort