I think it is an interesting idea to try at least building the benchmarks in Stackage. Of course it will add to the maintenance burden and build times. I don't know whether it could be started off as a separate effort before integrating it into Stackage itself - it would be good at least to do some initial testing first to get a better idea of how good/bad things currently are.
I tried to build a subset of the benchmarks around Christmas; many depend on an old version of criterion. So even notifying maintainers about restrictive bounds would be nice.
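For context, the restrictive bounds in question usually look something like this in a package's benchmark stanza (a hypothetical example, not taken from any particular package):

```
-- Hypothetical benchmark stanza. The upper bound on criterion below blocks
-- newer criterion releases, which is the kind of restrictive bound meant here.
benchmark bench
  type:             exitcode-stdio-1.0
  hs-source-dirs:   bench
  main-is:          Main.hs
  build-depends:    base      >= 4   && < 5
                  , criterion >= 0.8 && < 1.0
  default-language: Haskell2010
```

Relaxing the upper bound after testing against a current criterion release is usually all that is needed.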
@juhp We will gather data on how many of them already build, and report that here. I hope that just building benchmarks does not add too much to the build time because Stackage already builds benchmark dependencies.
There may be a substantial one-time transitioning cost -- getting people to fix currently-broken packages. But I hope that, long term, building benchmarks won't add too much friction, because it doesn't increase the "surface area" of maintainers and packages. Rather, it's just one more correctness check packages must pass to be included.
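For anyone who wants to check their own package ahead of time, something along these lines builds the benchmark suites without running them (a sketch; the exact invocation Stackage's tooling uses may differ):

```sh
# With stack: compile the benchmark suites but don't execute them.
stack build --bench --no-run-benchmarks

# Or with plain cabal:
cabal configure --enable-benchmarks
cabal build
```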
> So even notifying maintainers about restrictive bounds would be nice.
As I understand it, we do notify maintainers about restrictive bounds on benchmarks, just the same as with their regular dependencies. But once they get put on the "skipped benchmarks" list, we stop notifying them.
I have no problem with turning on benchmark building. We'll just need to populate a field for expected benchmark failures pretty quickly. I'll make the tweak. @juhp, I can wait to activate this until I'm back on curator duty if you'd like.
@DanBurton oh, I didn't know! That's very nice.
@snoyberg awesome! Well we can try it now I guess and see how it goes. I see benchmarks are building now actually - I can try to report the initial results here when that finishes.
Yeah, sorry, I moved ahead with it right away to test out my code. If it causes you trouble, let me know and I'll roll it back.
The following benchmarks appear to be failing:
Not pinging maintainers on this (though I'll look into a few of those myself).
Thanks, that's great!
Can we close this now?
Yes, I think so
Just FYI we are now successfully grabbing and running >66 benchmark suites and recording their results. We've got various build errors that don't 100% match up with the list above, and we can gradually help poke maintainers or send PRs.
Many of them are simple fixes, like a missing `other-modules` field.
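For illustration, the typical fix is just listing the benchmark's own helper modules in the stanza (a hypothetical example; the module names are made up):

```
benchmark bench
  type:             exitcode-stdio-1.0
  hs-source-dirs:   bench
  main-is:          Main.hs
  -- Helper modules under bench/ must be listed here, otherwise they are
  -- omitted from the sdist tarball and the benchmark fails to compile.
  other-modules:    Bench.Common
                    Bench.Cases
  build-depends:    base, criterion
  default-language: Haskell2010
```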
@rrnewton Thank you. If you do want to create an issue to track progress you can do that; otherwise I guess the current Stackage status can be checked in the "Expected benchmark failures" section in build-constraints.yaml, at least.
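For reference, that section of build-constraints.yaml looks roughly like this (an illustrative sketch; the exact key name and entries may differ):

```yaml
# Packages whose benchmark suites are known not to build are listed here
# so they don't block the snapshot; the entries below are hypothetical.
expected-benchmark-failures:
    - some-package      # depends on an old criterion
    - another-package   # missing other-modules in its benchmark stanza
```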
I'm currently hunting down Stackage benchmark failures separately in this issue. As these packages get fixed and uploaded to Hackage, I'll submit PRs to this repo to add their benchmarks back into the fold.
There has been discussion on ghc-devs about revitalizing the Haskell benchmarking suites. Stackage benchmark suites would appear to be one useful source of data. Indeed, there are almost 200 benchmark suites among the ~1,700 current LTS packages.
However, running Stackage benchmarks is currently hard to achieve, for several reasons:

- The tarballs generated by `cabal sdist` are often wrong (see `pipes` in lts-5.13, which is missing a file); a quick way to reproduce this is sketched below.
- Some packages, such as `lens-4.13`, will error when you try to build their benchmarks.

While I understand that benchmarks were left out of the original Stackage focus, is it perhaps time to start tightening the screws and weeding out these invalid tarballs?
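As a rough way to reproduce the sdist problem mentioned above (a sketch using plain cabal commands; the package version is just an example):

```sh
# Unpack the released tarball exactly as Stackage consumes it, then try to
# build its benchmarks. If a file was left out of the sdist, this fails even
# though building from a git checkout of the repository succeeds.
cabal get pipes-4.1.8     # example version; any affected release will do
cd pipes-4.1.8
cabal configure --enable-benchmarks
cabal build
```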
Perhaps a first step would be a warning mechanism. The maintainers' agreement has some hard deadlines, but perhaps failing benchmark suite builds could be only a soft warning for some period of time, and then later become a hard requirement along with the rest.
(CC @RyanGlScott @vollmerm @osa1)