commercialhaskell / stackage

Stable Haskell package sets: vetted consistent packages from Hackage
https://www.stackage.org/
MIT License

Start building benchmarks, gradually #1372

Closed rrnewton closed 8 years ago

rrnewton commented 8 years ago

There has been discussion on ghc-devs about revitalizing the Haskell benchmarking suites. Stackage benchmark suites would appear to be one useful source of data. Indeed, there are almost 200 benchmark suites among the ~1,700 current LTS packages.

However, running Stackage benchmarks is currently hard to achieve, for several reasons:

While I understand that benchmarks were left out of the original stackage focus, is it perhaps time to start tightening the screws and weeding out these invalid tarballs?

Perhaps a first step would be a warning mechanism. The maintainers' agreement has some hard deadlines, but perhaps failing benchmark suite builds could be only a soft warning for some period of time, and then later become a hard requirement along with the rest.

(CC @RyanGlScott @vollmerm @osa1)

juhp commented 8 years ago

I think it is an interesting idea to try at least building the benchmarks in Stackage. Of course it will add to the maintenance burden and build times. I dunno if it could be started off as a separate effort before integrating it into Stackage itself - it would be good at least to do some initial testing first to get a better idea of how good/bad things are currently.

phadej commented 8 years ago

I tried to build some subset of benchmarks around Christmas; many depend on an old criterion. So even notifying maintainers about restrictive bounds would be nice.

rrnewton commented 8 years ago

@juhp We will gather data on how many of them already build, and report that here. I hope that just building benchmarks does not add too much to the build time because Stackage already builds benchmark dependencies.

There may be a substantial one-time transitioning cost -- getting people to fix currently-broken packages. But I hope that, long term, building benchmarks won't add too much friction, because it doesn't increase the "surface area" of maintainers and packages. Rather, it's just one more correctness check packages must pass to be included.

DanBurton commented 8 years ago

> So even notifying maintainers about restrictive bounds would be nice

As I understand it, we do notify maintainers about restrictive bounds on benchmarks, just the same as with their regular dependencies. But once they get put on the "skipped benchmarks" list, we stop notifying them.

snoyberg commented 8 years ago

I have no problem with turning on benchmark building. We'll just need to populate a field for expected benchmark failures pretty quickly. I'll make the tweak. @juhp I can wait to activate this until I'm back on curator duty if you'd like.

phadej commented 8 years ago

@DanBurton oh, I didn't know! That's very nice.

juhp commented 8 years ago

@snoyberg awesome! Well we can try it now I guess and see how it goes. I see benchmarks are building now actually - I can try to report the initial results here when that finishes.

snoyberg commented 8 years ago

Yeah, sorry, I moved ahead with it right away to test out my code. If it causes you trouble, let me know and I'll roll it back.

On Thu, Apr 21, 2016 at 11:08 AM, Jens Petersen wrote:

> @snoyberg awesome! Well we can try it now I guess and see how it goes. I see benchmarks are building now actually - I can try to report the initial results here when that finishes.


snoyberg commented 8 years ago

The following benchmarks appear to be failing:

Not pinging maintainers on this (though I'll look into a few of those myself).

juhp commented 8 years ago

thanks that's great

juhp commented 8 years ago

Can we close this now?

snoyberg commented 8 years ago

Yes, I think so

rrnewton commented 8 years ago

Just FYI we are now successfully grabbing and running >66 benchmark suites and recording their results. We've got various build errors that don't 100% match up with the list above, and we can gradually help poke maintainers or send PRs.

Many of them are simple fixes like missing other-modules.
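
To illustrate the kind of fix meant here, below is a minimal sketch of a benchmark stanza in a `.cabal` file. The package layout, the stanza name `bench`, and the module name `Bench.Utils` are hypothetical and not taken from any package discussed in this thread. The point is that a helper module imported by the benchmark's `Main.hs` but not listed under `other-modules` is left out of the sdist tarball, so the benchmark builds in the author's working tree but fails when built from the Hackage tarball.

```cabal
-- Hypothetical benchmark stanza; all names are illustrative only.
benchmark bench
  type:             exitcode-stdio-1.0
  hs-source-dirs:   bench
  main-is:          Main.hs
  -- Every helper module imported by Main.hs must be listed here.
  -- If it is omitted, the module is not shipped in the sdist tarball
  -- and the benchmark cannot be built from the uploaded package.
  other-modules:    Bench.Utils
  build-depends:    base, criterion
  default-language: Haskell2010
```

Building from a freshly generated source tarball, rather than from the repository checkout, is one way to catch this class of error before uploading.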

juhp commented 8 years ago

@rrnewton Thank you. If you do want to create an issue to track progress you can do that, otherwise I guess the current Stackage status can be checked in the "Expected benchmark failures" section in build-constraints.yaml at least.

RyanGlScott commented 8 years ago

I'm currently hunting down Stackage benchmark failures separately in this issue. As these packages get fixed and uploaded to Hackage, I'll submit PRs to this repo to add their benchmarks back into the fold.