rust-lang / rfcs

RFCs for changes to Rust
https://rust-lang.github.io/rfcs/
Apache License 2.0
5.93k stars 1.57k forks source link

Add non-allocating sort function to libcore #790

Closed steveklabnik closed 7 years ago

steveklabnik commented 9 years ago

Issue by mahkoh Saturday Nov 22, 2014 at 22:45 GMT

For earlier discussion, see https://github.com/rust-lang/rust/issues/19221

This issue was labelled with: A-collections, A-libs, I-enhancement in the Rust repository


Morwenn commented 8 years ago

I have no experience with Rust, but I am currently maintaining a C++ sorting library. I couldn't help but notice this issue about sorting. My library includes adapted versions of the GrailSort and WikiSort mentioned in the previous issue, which are the only O(n log n) in-place stable algorithms I know that can work without allocating memory (even though allocating a fixed buffer is better for both of them).

You know how it is: benchmarks are full of surprises, and I couldn't reproduce the patterns for which GrailSort is better. In all my test cases, WikiSort is always better than GrailSort (it might come from my C++ port of GrailSort though), even when bigger buffers are given to GrailSort for the merge operations.

Now, as mentioned in the previous issue, WikiSort is complex. However, it can be made a bit simpler: I managed to replace some of the operations with some algorithms from the C++ standard library (I don't know whether Rust has such algorithms), the embedded sort for 3 values can be replaced by a call to a vanilla insertion sort and IME sorting networks are only worth it with integers and when carefully implemented. Those in WikiSort are even more complex to maintain the stability. I didn't try it yet, but I believe that all the sorting networks part can be replaced by an insertion sort too. Moreover, if the plan is to use no additional memory at all, even a fixed-sized buffer, the algorithm can even be simplified a bit.

That said, even with those modifications, I think that it's still around ~600 LOC and it really loses a bit of performance when it doesn't have even a fixed-size buffer. If you ever decide that you don't care about stability, or choose to provide separate stable and unstable sorting algorithms, then the pattern-defeating quicksort is the best in-place sorting algorithm I know of that does not use any additional memory, and it's around ~300 LOC. It's generally faster than introsort and may make its way into libc++ and libstdc++.

Update: a new version of pattern-defeating quicksort beats branch prediction and can subsequently be ~twice as fast as introsort, but the algorithm is more complex (borrowing ideas from BlockQuicksort: How Branch Mispredictions don't affect Quicksort).

Hope that will help you to choose the appropriate sorting algorithm :)

pczarn commented 8 years ago

The algorithm should make sure destructors run once if the comparator panics. The current algorithm in libcollections accounts for this.

ahicks92 commented 7 years ago

I independently came to the conclusion that it would be nice to be able to do this. Is this something that is still open for discussion, and does anyone know why the current algorithm was chosen?

sfackler commented 7 years ago

IIRC we wanted something stable and without an n^2 worst case.

keeperofdakeys commented 7 years ago

The grailsort and wikisort algorithms talked about in the previous issue seem to match those criteria.

pczarn commented 7 years ago

The pattern-defeating quicksort ~matches those criteria as well~. I have a mostly complete implementation, but it's unpublished.

Morwenn commented 7 years ago

It isn't stable though.

burdges commented 7 years ago

CS.SE Q&A here with good background.

ahicks92 commented 7 years ago

If we can't do it as sort_xxx, we could do it as sort_unstable_xxx or sort_noalloc_xxx.

Stability is obviously important, in both the sorting and backwards compatibility senses. But there's lots of things for which you don't care about stability, and our current implementations will grab a lot of ram for large arrays (at least, per the documentation). You could not feasibly sort a 1 GB vector with stdlib, unless you're operating on an atypically powerful machine.

If you do and it fails, it panics; you can't even catch the error. If we can't add an allocating one, we should at least add one which returns Result or something, so that you can do something appropriate besides completely crash.

Perhaps allocating is not actually as big a deal as I think, but adding these methods to stdlib seems to have very little long-term costs. Obviously, the best approach is sort, etc, not allocating.

ghost commented 7 years ago

To put some numbers into perspective, I did my best at implementing both stable and unstable sorts in Rust.

The stable sort is a variant of timsort, which recently got merged into libstd. The unstable sort is pattern-defeating quicksort, which I published as a standalone crate.

The differences in performance are pretty interesting.

ghost commented 7 years ago

Non-allocating slice::sort_unstable is now in libcore and will be stable in Rust 1.20.

Tracking issue