Closed jfcg closed 4 years ago
I also added
int samples = 2;
and the output is
| Name | Items | Type | Best | Average | Comparisons | Distribution |
| --------- | -------- | ---- | -------- | -------- | ----------- | ---------------- |
| quadsort | 67108864 | i64 | 7.294079 | 7.340462 | | random order |
|stablesort | 67108864 | i64 | 7.693744 | 7.820621 | | random order |
| wolfsort | 67108864 | i64 | 9.194093 | 9.203110 | | random order |
| quadsort | 67108864 | i32 | 6.934483 | 7.032585 | | random order |
|stablesort | 67108864 | i32 | 7.314444 | 7.478966 | | random order |
| wolfsort | 67108864 | i32 | 4.174401 | 4.180588 | | random order |
with much faster results for partially sorted input. sorty attains 3.19 seconds with two goroutines.
Considering g++ -O3
vs go -B
, g++ is considerably better at optimizing code but go toolchain is not bad either.
I think if you write a concurrent version of wolfsort, that should be even faster. Cheers
Keep in mind that wolfsort performs optimal at 1,000 - 100,000 elements.
A concurrent version should be quite a bit faster. For the best overall results one would need to make quadsort concurrent as well. I don't think I'll attempt this myself anytime soon however.
Keep in mind that wolfsort performs optimal at 1,000 - 100,000 elements.
interesting that performance peeks in an interval. is it by design?
I guess you could call it by design. Quadsort is significantly faster than quicksort as long as the L1 cache isn't exhausted. This is primarily due to the swapping optimization which gives a 50% performance boost.
So the primarily goal of wolfsort is to create buckets that contain right around 250 elements. This is easiest in the 32K - 128K range, which might explain why wolfsort is (in some of my tests) faster at sorting 100K elements than 10K elements.
In order to compare with sorty I make the following change in bench.c:
and compile with:
It is been running for over 5 minutes with no output. an issue with the benchmark code?