Bodigrim / arithmoi

Number theory: primes, arithmetic functions, modular computations, special sequences
http://hackage.haskell.org/package/arithmoi
MIT License

Support GHC 8.4 #93

Bodigrim closed 6 years ago

Bodigrim commented 6 years ago

Just for history: our test suite triggered two bugs in GHC 8.4 alphas, filed as #14754 and #14768.

I've also replaced criterion with gauge, because I got tired of the dependency footprint of the former.

cartazio commented 6 years ago

Does gauge provide any improvements? (Is it just a copy-paste of the deps criterion needs into the same package, or what's different?)

(In the presence of new-build, does the first-time build overhead of a benchmark dep matter?) (And/or is it otherwise more actively maintained?)

On Thu, Mar 8, 2018 at 3:38 AM, Bodigrim notifications@github.com wrote:

You can view, comment on, or merge this pull request online at:

https://github.com/cartazio/arithmoi/pull/93

Commit Summary

  • Support GHC 8.4
  • Replace criterion with gauge to reduce number of dependencies
  • Add test case for moebius sieve
  • Update .travis.yml

Bodigrim commented 6 years ago

No, gauge is not a copy-paste of all dependencies of criterion into a single source tree. It is basically a fork of criterion, stripped of redundant, weird or rarely used features with the aim of minimising dependencies.

Needing 63 packages just to build the criterion framework is a huge cost in terms of package maintenance. I remember having issues with them from time to time, e.g., 53c716b. It has also been a pain in the course of testing with ghc-8.4, because some of these dependencies were not ready (even with head.hackage). So I switched to gauge and it worked like a charm.

There is also a killer feature: the --small option. The original criterion produces very extensive, multiline reports per benchmark. They are difficult to read if you'd like to compare them. I used to have a special utility just to parse criterion output.

benchmarked Powers/sqrt2300/new
time                 9.923 μs   (8.787 μs .. 11.66 μs)
                     0.901 R²   (0.851 R² .. 0.950 R²)
mean                 9.166 μs   (8.715 μs .. 9.718 μs)
std dev              1.590 μs   (1.279 μs .. 1.986 μs)
variance introduced by outliers: 84% (severely inflated)

benchmarked Powers/sqrt2300/old
time                 12.98 μs   (9.992 μs .. 16.57 μs)
                     0.784 R²   (0.699 R² .. 0.924 R²)
mean                 11.95 μs   (11.26 μs .. 12.82 μs)
std dev              2.709 μs   (2.066 μs .. 4.685 μs)
variance introduced by outliers: 91% (severely inflated)

Now, gauge with --small reports them as:

Powers/sqrt2300/new                      mean 7.852 μs  ( +- 799.9 ns  )
Powers/sqrt2300/old                      mean 8.552 μs  ( +- 566.7 ns  )

This is so much more tractable and can be parsed and analysed with bare awk.
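To illustrate how little it takes to consume the one-line format, here is a minimal sketch in Python rather than awk. It is not part of this PR and not something gauge ships. The line shape it assumes (`<name> mean <value> <unit> ( +- <value> <unit> )`) is inferred only from the two sample lines above, and the names `parse_small_report`, `LINE_RE` and `UNIT_TO_SECONDS` are hypothetical.

```python
import re

# Assumed line shape, inferred from the two sample lines in this thread:
#   <name> mean <value> <unit> ( +- <value> <unit> )
LINE_RE = re.compile(
    r"^(?P<name>.+?)\s+mean\s+(?P<mean>[\d.]+)\s+(?P<unit>\S+)\s+"
    r"\(\s*\+-\s*(?P<dev>[\d.]+)\s+(?P<dev_unit>\S+)\s*\)$"
)

# Conversion factors to seconds. Both the Greek mu and the micro sign are
# accepted, since it is an assumption which one the tool emits.
UNIT_TO_SECONDS = {
    "s": 1.0,
    "ms": 1e-3,
    "μs": 1e-6,
    "µs": 1e-6,
    "us": 1e-6,
    "ns": 1e-9,
    "ps": 1e-12,
}


def parse_small_report(text):
    """Map benchmark name -> (mean, deviation), both in seconds.

    Lines that do not match the assumed shape are skipped.
    """
    results = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        mean = float(m["mean"]) * UNIT_TO_SECONDS[m["unit"]]
        dev = float(m["dev"]) * UNIT_TO_SECONDS[m["dev_unit"]]
        results[m["name"]] = (mean, dev)
    return results


SAMPLE = """\
Powers/sqrt2300/new                      mean 7.852 μs  ( +- 799.9 ns  )
Powers/sqrt2300/old                      mean 8.552 μs  ( +- 566.7 ns  )
"""

report = parse_small_report(SAMPLE)
new_mean, _ = report["Powers/sqrt2300/new"]
old_mean, _ = report["Powers/sqrt2300/old"]
print(f"old/new ratio: {old_mean / new_mean:.3f}")
```

Normalising everything to seconds makes a mean in μs directly comparable with a deviation in ns. A unit missing from the table raises a `KeyError`, so an unexpected unit is noticed instead of being silently misread.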

cartazio commented 6 years ago

valid :)

cartazio commented 6 years ago

Overall the change set LGTM. I've read the diffs but haven't tested it. (And since this dep doesn't impact user builds, my concerns about its maintainer style probably don't matter and won't impact anyone but us :) )