LilithHafner / InterLanguageSortingComparisons

6 stars 2 forks source link

Add Glidesort #8

Closed LilithHafner closed 1 year ago

LilithHafner commented 1 year ago

glidesort underperforms rust's default sorting algorithm in this benchmark. @orlp, would you like to chime in before I merge this?

plot showing glidesort consistently taking longer than rust's default sorting algorithm

orlp commented 1 year ago

@orlp, would you like to chime in before I merge this?

@LilithHafner Well, you compared against Rust's unstable sorting algorithm (as well as unstable sorts in other languages). I don't claim glidesort is faster than that on all platforms (although it is on Apple M1). Glidesort is a stable sorting algorithm, and thus competes against [T]::sort rather than [T]::sort_unstable, std::stable_sort in C++, etc.

marcospb19 commented 1 year ago

To avoid misunderstandings, I think the graph should compare glidesort only with stable algorithms, if you really want to compare with others tho, an alternative is having all stable algorithms be a line, and all unstable be a dotted line, and have a caption explaining it :+1:.

LilithHafner commented 1 year ago

Thanks for your feedback! I want to reserve the dotted/solid distinction for default/primary sorting algorithms vs. non-default/alternate, but I added a note mentioning which algorithms are stable. I think it is acceptable to present this information below the graph rather than on it because

marcospb19 commented 1 year ago

Can you change the Rust benchmark to call the binary instead of cargo, like you do with the c benchmark?

EDIT: I don't know go, but I think it's compiled so same applies, probably.

LilithHafner commented 1 year ago

I'm plotting differential execution time. Invocation time is irrelevant. Please read the explanation in the README and/or the implementation. Each language is responsible for timing itself. In the case of rust (and all other languages), I make a single invocation of rust which then loops over various list sizes.

marcospb19 commented 1 year ago

You're right, thanks for your response, if you don't mind I deleted some comments of mine with wrong information about this :smile:.

Sorry for misunderstanding the code, I got confused by what the f function does.

exrok commented 1 year ago

I think the reason that glidesort performs worse than expected here is the possible panicking/extra checks, in the comparison function. In the std library unstable_sort

x.sort_unstable_by(|a, b| a.partial_cmp(b).unwrap());

and

x.sort_unstable_by(|a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));

only have about a 20% difference in runtime (30% difference in instruction count), on my machine a Ryzen 9 5900X desktop.

However, with glidesort the difference is much more. The panic-able one is 50% slower.

glidesort::sort_by(&mut x, |a, b| a.partial_cmp(b).unwrap());
       //       32.97 msec task-clock:u 
       // 149,158,550      cycles:u
       // 491,913,508      instructions:u
       //  35,433,567      branches:u

Then if we assume NaN's are equal for the sake of sorting, similarly but not exactly to what Julia does.

glidesort::sort_by(&mut x, |a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));
       //       21.65 msec task-clock:u  
       //  99,184,881      cycles:u  
       // 305,739,539      instructions:u  
       //  13,097,776      branches:u      

When we don't use a panicking unwrap glidesort ends up faster than the std unstable sort, atleast on my machine.

LilithHafner commented 1 year ago

That proposal gives incorrect answers in the presence of NaN.

use std::cmp::Ordering;

fn main() {
    let mut v: [f64; 4] = [4.0, 0.0/0.0, 2.0, 1.0];
    println!("Input: {:?}", v);
    glidesort::sort_by(&mut v, |a, b| a.partial_cmp(b).unwrap_or(Ordering::Equal));
    println!("Sorted: {:?}", v);
}
Input: [4.0, NaN, 2.0, 1.0]
Sorted: [1.0, 4.0, NaN, 2.0]

See a similar proposal and additional discussion at #6.

The function being benchmarked should be correct for all inputs, even though I don't and can't test all inputs.

orlp commented 1 year ago

There is f64::total_cmp for ordering floats with NaNs.

Alternatively there's also https://docs.rs/ordered-float/latest/ordered_float/struct.NotNan.html that allows you to check whether all floats are not NaN during a conversion step after which you it uses regular comparison for the floats. This is a lot faster than unwraping on every comparison.

LilithHafner commented 1 year ago

a.total_cmp(b) and a.partial_cmp(b).unwrap() have similar performance on this benchmark, at least for the default rust sorting algorithm. Perhaps most of the time is spent elsewhere.

Here is the output of the benchmarking code with the two versions: ``` shell> cargo run -q --release x.sort_unstable_by(|a, b| a.partial_cmp(b).unwrap()); 1 16.3 (78%) 0.05551847594094819 2 13.0 (74%) 0.07417316576996724 3 35.7 (87%) 0.08027258530101522 4 36.9 (76%) 0.08224016169942966 5 60.7 (93%) 0.08296157166153631 6 31.3 (86%) 0.08322850879748311 7 56.1 (92%) 0.0833036172473729 8 43.4 (90%) 0.08330484801845664 9 48.2 (91%) 0.08337267440633764 10 44.9 (91%) 0.08333937470816669 11 52.4 (91%) 0.08336282206174409 12 51.8 (92%) 0.08332514670620222 13 56.3 (91%) 0.08350351306258078 14 59.6 (92%) 0.08335795248493971 15 68.5 (92%) 0.0832653987167391 shell> cargo run -q --release x.sort_unstable_by(|a, b| a.total_cmp(b)); 1 7.1 (58%) 0.05555080135323748 2 14.0 (75%) 0.074098612213434 3 21.4 (81%) 0.0802705686865759 4 17.8 (54%) 0.08229273807072943 5 35.9 (84%) 0.08305543722843758 6 34.8 (86%) 0.08325603696770698 7 49.1 (90%) 0.08335280809292911 8 48.2 (91%) 0.0833391267511785 9 52.0 (91%) 0.08334891233385663 10 49.0 (91%) 0.08333424534952608 11 55.6 (91%) 0.0834305041907687 12 57.3 (91%) 0.08330698053451618 13 61.5 (91%) 0.08331396287464021 14 66.8 (91%) 0.08341090235643989 15 75.4 (90%) 0.0832241402649414 ```

Equally importantly, as I mentioned in #6, this repo is for measuring default sorting algorithms and sometimes non-default sorting algorithms with default settings. Requiring the user to preprocess their input and use a different comparison function does not count as a "default algorithm".

orlp commented 1 year ago

@LilithHafner I think that's fair enough that you want to compare 'the default', but I don't think you're being entirely fair here as to what's allowed as 'the default'.

That proposal gives incorrect answers in the presence of NaN.

It might give incorrect answers in the presence of NaN, but so does your C++ implementation, which doesn't handle NaN at all. In fact, your C++ implementation is worse, it is undefined behavior for inputs with NaNs, and might segfault.

LilithHafner commented 1 year ago

Good catch, thanks! That is definitely an issue, though it is orthogonal to this PR.

LilithHafner commented 1 year ago

If there are no unresolved objections, I'll merge in a few days