segmentio / asm

Go library providing algorithms optimized to leverage the characteristics of modern CPUs
MIT No Attribution
869 stars 36 forks source link

qsort: add benchmarks from the standard library #77

Closed kevinburkesegment closed 2 years ago

kevinburkesegment commented 2 years ago

These will let us be more confident that the qsort makes sense to use, especially when e.g. the Go standard library introduces a new sorting algorithm, we can compare the two.

On my Mac, using the pdqsort from Go 1.19, the relevant timings are:

Sort8/100000-10             4.91ms ± 0%
Sort8/1000000-10            60.1ms ± 0%
StdlibSort8/100000-10       8.16ms ± 0%
StdlibSort8/1000000-10       103ms ± 0%

Sort8/100000-10            163MB/s ± 0%
Sort8/1000000-10           133MB/s ± 0%
StdlibSort8/100000-10     98.0MB/s ± 0%
StdlibSort8/1000000-10    77.5MB/s ± 0%

Compared to the Go 1.18 sort algorithm:

Sort8/100000-10             4.90ms ± 0%
Sort8/1000000-10            59.0ms ± 0%
StdlibSort8/100000-10       8.53ms ± 0%
StdlibSort8/1000000-10       105ms ± 0%

Sort8/100000-10            163MB/s ± 0%
Sort8/1000000-10           136MB/s ± 0%
StdlibSort8/100000-10     93.8MB/s ± 0%
StdlibSort8/1000000-10    76.0MB/s ± 0%

So the Go standard library sort performance is improving but we are still about 40% faster than the benchmark.

achille-roussel commented 2 years ago

For completeness, adding a snapshot for a run on a x86:

goos: linux
goarch: amd64
pkg: github.com/segmentio/asm/qsort
cpu: Intel(R) Xeon(R) CPU @ 2.60GHz

BenchmarkSort8/100000
BenchmarkSort8/100000            625       1916996 ns/op     417.32 MB/s
BenchmarkSort8/1000000
BenchmarkSort8/1000000            49      24289400 ns/op     329.36 MB/s

BenchmarkStdlibSort8/100000
BenchmarkStdlibSort8/100000                   86      11709887 ns/op      68.32 MB/s
BenchmarkStdlibSort8/1000000
BenchmarkStdlibSort8/1000000                   7     144533331 ns/op      55.35 MB/s