scandum / fluxsort

A fast branchless stable quicksort / mergesort hybrid that is highly adaptive.
The Unlicense
678 stars 21 forks source link

Improve benchmark #2

Open GravisZro opened 3 years ago

GravisZro commented 3 years ago

Time is a lousy benchmark, especially with a low numbers of elements running on an OS. There a lot of way of profiling but I would suggest using an LLVM based profiling solution like this one because it measures the number of intermediate language instructions executed. This avoids compiler/microcode optimizations and gives you the most representative output for all platforms.

scandum commented 2 years ago

Wouldn't that fail to measure things like cache misses and branch mispredictions? It would be a useful metric to keep track of though.

What I have currently is within 1% accurate by measuring the best run out of 100. For low numbers of elements the number of repetitions can be increased to reduce the relative overhead of gettimeofday().

GravisZro commented 2 years ago

The idea is to measure the speed of the algorithm without regard to machine it runs on, specifically because not all machines are created equal. It's fine to have "real world" results as well (though I would include more platforms) but nothing about them are empirical.