Benchmark binaryGCD and coprime against Prelude alternatives

jgertm commented 6 years ago

I took a stab at #80.

So far, I have only benchmarked cases 1 and 2. These are the preliminary results on my box:

I'd appreciate any feedback, aspecially regarding the benchmark cases.

Bodigrim commented 6 years ago

Awesome! I suspected that our not-invented-here binaryGCD is not that good, but your results show that it is tremendously worse.

Bodigrim commented 6 years ago

@jgertm which version of GHC are you using?

I run your benchmark on my box with GHC 8.4.1 x64:

GCD/large coprimes/Prelude.gcd           mean 67.30 ns  ( +- 1.517 ns  )
GCD/large coprimes/binaryGCD             mean 120.1 ns  ( +- 2.085 ns  )
GCD/large coprimes/Prelude.coprime       mean 68.65 ns  ( +- 4.798 ns  )
GCD/large coprimes/coprime               mean 123.7 ns  ( +- 7.513 ns  )

GCD/powers of 2/Prelude.gcd              mean 48.41 ns  ( +- 1.175 ns  )
GCD/powers of 2/binaryGCD                mean 30.72 ns  ( +- 781.1 ps  )
GCD/powers of 2/Prelude.coprime          mean 47.16 ns  ( +- 1.209 ns  )
GCD/powers of 2/coprime                  mean 23.58 ns  ( +- 437.1 ps  )

GCD/power of 23/Prelude.gcd              mean 39.66 ns  ( +- 4.347 ns  )
GCD/power of 23/binaryGCD                mean 65.26 ns  ( +- 10.23 ns  )
GCD/power of 23/Prelude.coprime          mean 37.00 ns  ( +- 2.705 ns  )
GCD/power of 23/coprime                  mean 66.35 ns  ( +- 13.97 ns  )

It is interesting that on my box binaryGCD is 30% faster on powers of two, but on your box it is actually slower.

Bodigrim commented 6 years ago

binaryGCD heavily relies on Math.NumberTheory.Utils.trailZeros#. Newer GHCs (>= 7.10) has a built-in ctz# function, which can be used instead. Putting

trailZeros# :: Word# -> Int#
trailZeros# w = word2Int# (ctz# w)

into Math.NumberTheory.Utils results in following benchmarks:

GCD/large coprimes/Prelude.gcd           mean 67.86 ns  ( +- 964.1 ps  )
GCD/large coprimes/binaryGCD             mean 57.80 ns  ( +- 2.449 ns  )
GCD/large coprimes/Prelude.coprime       mean 67.86 ns  ( +- 2.095 ns  )
GCD/large coprimes/coprime               mean 58.43 ns  ( +- 10.66 ns  )

GCD/powers of 2/Prelude.gcd              mean 49.17 ns  ( +- 4.313 ns  )
GCD/powers of 2/binaryGCD                mean 26.42 ns  ( +- 2.844 ns  )
GCD/powers of 2/Prelude.coprime          mean 47.54 ns  ( +- 1.062 ns  )
GCD/powers of 2/coprime                  mean 23.61 ns  ( +- 260.5 ps  )

GCD/power of 23/Prelude.gcd              mean 38.32 ns  ( +- 1.663 ns  )
GCD/power of 23/binaryGCD                mean 40.04 ns  ( +- 1.119 ns  )
GCD/power of 23/Prelude.coprime          mean 36.91 ns  ( +- 1.700 ns  )
GCD/power of 23/coprime                  mean 37.02 ns  ( +- 816.6 ps  )

Here binaryGCD is either as fast as Prelude.gcd, or up to 2x faster. @jgertm could you please try to reproduce this on your box?

Bodigrim commented 6 years ago

I'll merge it for now due to upcoming migration from criterion to gauge, thanks for contribution!

Bodigrim / arithmoi

Benchmark binaryGCD and coprime against Prelude alternatives #92