rust-ndarray / ndarray-stats

Statistical routines for ndarray
https://docs.rs/ndarray-stats
Apache License 2.0
201 stars 25 forks source link

Implement Yaroslavskiy-Bentley-Bloch Quicksort. #80

Closed n3vu0r closed 1 year ago

n3vu0r commented 3 years ago

This is a dual pivot 3-way quick sort which is less likely to run into worst cases which can end in stack overflows for large arrays.

The previous single pivot 2-way quick sort used by the FreedmanDiaconis strategy overflows its stack with this NPY array.

n3vu0r commented 3 years ago

I will test also for same elements and refactor this PR as soon as I have time.

n3vu0r commented 3 years ago

Since we keep the single-pivot partitioning method, I've implemented a variant of Sesquickselect. It suggests the optimal pivot selection from fixed size samples wrt to the relative sought rank index / length and also switches from dual-pivot to single-pivot partitioning for extreme sought ranks (page 17, figure 3). The benches show speed up with adaptive pivot sampling and with smaller recursion cutoff thresholds, at least on my machine. For the bulk version, I kept the recommended skewed pivots for Quicksort in my assumption that multiple indexes change the characteristics from Quickselect towards Quicksort (and there is no single sought rank).

It works well with equal element arrays and with sorted and reversely sorted arrays. Tested up to 1_000_000 elements.

I would suggest to make sample_mut() generic over the sample size via const_generics if an MSRV of 1.51 is fine?

The sampling does not have to be equally spaced, could also be randomized. I have no favorite yet.

n3vu0r commented 3 years ago

I used adaptive pivot sampling for the bulk version as well but only for branches with a single index remaining. I dunno how the benches are configured but would like to add larger input arrays.

n3vu0r commented 1 year ago

Fixes #86 by letting the stack grow on heap whenever necessary. Using dual pivoting should already be superior compared with just splitting => pivot into pivot == and > pivot or do I miss something?

n3vu0r commented 1 year ago

This dynamic stack grow on heap should in general be used for every recursive implementation whose depth is depending on input data. But we should still try to avoid worst case complexities. I found that the bulk version's single-pivoting branch is the problem. I will try to split => pivot here too like its done in the non-bulk version. If this doesn't work out, the single-pivoting branch of the bulk version should be removed.

Simply removing that branch, reduces runtime of the test for large arrays of equal elements from 700 ms to 50 ms.

n3vu0r commented 1 year ago

Reusing non-bulk version for single-index invocation of bulk version does the job even better and reduces code complexity. I've increased the array length by factor 10 in the test and execution time was unchanged.

n3vu0r commented 1 year ago

Ready for review/merge.

n3vu0r commented 1 year ago

Closing in favor of #92.