Open vigna opened 1 year ago
Hello, sadly, it is not possible, because of the orphan rule
I could implement the sort for (usize, usize)
, but it would mean I couldn't sort (i32, f32)
for example, and the orphan rule
prevents me from implementing the sort for a generic (T, U)
tuple.
What I recommend is to sort Struct { a: usize, b: usize }
or u128
, however, I haven't done a lot of (performance) tests with u128
. Since it is a logical type (it is a primitive type in Rust but not for the machine) it is pretty slow, but still faster than other sorts.
I don't know if you need a single thread or a multithreaded sort, if it is for single thread, I can add a "dedicated sort" in the folder with the same name. For multithreaded sort, it is a bit more complicated.
MMhh actually, I should add a couple of dedicated single thread sort for tuple which cover most of the cases.
Oh we're totally ok with a struct. It doesn't have to be a tuple. It's just that the RadixKey trait is not really documented so we're a bit confused as what we should implement. If you can shows us an example with a struct containing two usize that would be great!
Mmmmmh I just realized it is a bit more tricky than what I was thinking.
More tricky or impossible? :)
Hello sorry for being so late, I completely forgot to do it.
At first I wanted to do the generic case to sort (for example) an array of primitive type vec<vec<usize>>
but it is more tricky than what I was expecting
So I just did your need:
You can find the code here:
https://github.com/lakwet/voracious_sort/blob/c46c6b1d1df404a30b93a6895880415b5cfe0080/src/types/custom.rs#L343
(you can just copy paste the code from line 343 to the end of the file)
And the benchmark with correctness test (there is a horizontal scroll):
running 1 test
Number of iterations: 3
Number of threads: 4
With check: true
=== Test struct{usize, usize} === Trait Vora Uns Trait Vora MT Rust Uns Rayon pll uns
Array size: 1000000
-- Unif :29544us 18644ns (29.54ns) 7118us 4118ns (7.12ns) 100826us 58221ns (100.83ns) 21880us 12634ns (21.88ns)
Array size: 10000000
-- Unif :346935us 63342ns (34.69ns) 40382us 7373ns (4.04ns) 1155803us 211045ns (115.58ns) 240198us 43888ns (24.02ns)
Array size: 100000000
-- Unif :2423071us 140773ns (24.23ns) 503343us 29067ns (5.03ns) 13256390us 179951ns (132.56ns) 2611723us 150974ns (26.12ns)
this evening I have run tests on the mac provided by my work which has only 16Go ram, if you need, I can run benchmark on my computer with 96Go ram
I just realized that the mac has 8 cores/threads new results:
Number of iterations: 3
Number of threads: 8
With check: true
=== Test struct{usize, usize} === Trait Vora Uns or stable Trait Vora MT Rust Uns Rayon pll uns
Array size: 1000000
-- Unif :29356us 18401ns (29.36ns) 6850us 3965ns (6.85ns) 98000us 56584ns (98.00ns) 20468us 11848ns (20.47ns)
Array size: 10000000
-- Unif :345722us 63121ns (34.57ns) 30626us 5595ns (3.06ns) 1140727us 208268ns (114.07ns) 235722us 43205ns (23.57ns)
Array size: 100000000
-- Unif :2304004us 133033ns (23.04ns) 355460us 20526ns (3.55ns) 13017920us 107250ns (130.18ns) 2418123us 139652ns (24.18ns)
Array size: 200000000
-- Unif :5182922us 213317ns (25.91ns) 760724us 31068ns (3.80ns) 27351699us 375056ns (136.76ns) 5539118us 226221ns (27.70ns)
test tests::speed_sort::speed_test_structusizeusize ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 93 filtered out; finished in 219.35s```
Mmhhh you're smashing the two usize into a u128 and in my experience those are pretty slow but I'll try and report.
I have implemented the partialOrd differently, and results for the rust sorts are better but still slower than the voracious sort https://github.com/lakwet/voracious_sort/blob/88255818bcad69958f8f3f3a34004c0ef8ed63be/src/types/custom.rs#L346
running 1 test
Number of iterations: 3
Number of threads: 8
With check: true
=== Test struct{usize, usize} === Trait Vora Uns or stable Trait Vora MT Rust Uns Rayon pll uns
Array size: 1000000
-- Unif :30398us 19047ns (30.40ns) 6750us 3934ns (6.75ns) 51448us 29705ns (51.45ns) 11079us 6405ns (11.08ns)
Array size: 10000000
-- Unif :346347us 63236ns (34.63ns) 32367us 5958ns (3.24ns) 583788us 106585ns (58.38ns) 113973us 20813ns (11.40ns)
Array size: 100000000
-- Unif :2388913us 137932ns (23.89ns) 372664us 21517ns (3.73ns) 6462832us 373143ns (64.63ns) 1228916us 71006ns (12.29ns)
Array size: 200000000
-- Unif 5329876us 218871ns (26.65ns) 798115us 32612ns (3.99ns) 14090880us 390340ns (70.45ns) 2582302us 105441ns (12.91ns)
test tests::speed_sort::speed_test_structusizeusize ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 93 filtered out; finished in 138.40s```
Mmhhh you're smashing the two usize into a u128 and in my experience those are pretty slow but I'll try and report.
yes indeed, since it is the key for the radix, it is done only "locally" I have to find time to do the generic case without casting into another type it was something that was already asked for a "long" time
OK, what about implementing voracious sort for arrays? That should be pretty easy—unlike pairs, you can just index the elements. I would really like to use this in the upcoming webgraph crate as we have to sort massively large sets of arcs from graphs but we're stuck with the fact that we have pairs of usize.
Hello I don't have a lot of time, I will see what I can do. It is not pretty easy, because the code is optimized everywhere, and it is not just playing with level and index. It is not very complicated though. It just requires time.
Well that would immensely extend the usability of the crate. Or a Radixable trait as rdst does...
I suppose you want that for the multithreaded version? I am looking into it right now, and it is not as easy as I thought. I will try to do it on a new branch, but not publish it for now, you will be able to use it with the branch.
And I am doing that only for usize? (it will help me to have only 1 type to handle for now)
Ok, full disclosure, we're using rdst and we're pretty happy with it, so if it's a hassle don't do it...
Is it possible to use voracious sort to sort pairs (usize, usize)? I see the trait implemented for u128, so I guess it's possible. Am I right?