haskellfoundation / stability

Issues and proposals related to the HF Stability Working Group

Maintainership standards #7

Open goldfirere opened 2 years ago

goldfirere commented 2 years ago

I was cruising by and remembered some conversations I had had with @tomjaguarpaw about best practices in package maintainership. I wonder whether this group would be a good place to flesh out such standards and work to get them into part of the general Haskell community workflow. I'm opening this as an Issue, not a PR, because it's not all fully-formed enough in my head to write up a formal proposal. Yet the idea is well formed enough that I think writing something is better than nothing. Do feel free to just close this if this is too far from your immediate agenda and is unhelpful.


Goal: Have the community recognize some subset of packages as well maintained. These packages might be highlighted with a badge on Hackage, for example. The packages are then understood to be more likely to be reliable into the future, and more worthy of being the basis of, say, a commercial enterprise.

Non-goal: Getting every Haskell contributor to follow these standards or to mandate anything, anywhere. This is all opt-in.

Bronze-Standard Maintenance:

Silver-Standard Maintenance:

Gold-Standard Maintenance:

Platinum-Standard Maintenance:

Certifying the standard to which a package adheres clearly takes some human intervention, but most of the tests above can be automated. A package would have to request getting the badge in order for a human to assess it. This kind of assessment might be something the HF could take on. I do think having these maintenance levels would be very useful to people deciding on what packages to use in their mission-critical project!

bergmark commented 2 years ago

I think this is relevant to the working group, it covers a lot of points that I had in mind for my yet-to-leave-my-brain proposal around maintenance. The majority of the listed requirements make sense to me.

I would not be surprised if we get a lot of bike-shedding around what the exact requirements should be. Do you need to meet every requirement? What if you are one checkbox away from leveling up to silver, but you meet several of the requirements for gold? Is it OK to disregard a requirement if you have a good reason? How do we evolve the requirements?

Some other requirements that I would consider:

It seems like we could get something rolling with a reasonable amount of up-front effort.

I would like to have documentation on suggested ways to meet these requirements to make it easier for people to join. What's an easy way to set up CI? Is there a HF approved CoC that I can copy into my project?

Some notes on the proposed requirements:

The package lists a maintainer email address. Emails to this address are answered within 2 weeks. (Exceptions allowed around holidays.)

I prefer issue trackers, they are public and allow people who are not the maintainer to assist. Having either is fine in my book.

The package includes a license.

A FLOSS license, or can it be all rights reserved?

The package includes a testsuite

Does this test suite need to do anything? Is the intention to make it easy to add tests later on or should it cover some basic functionality?

The package is included in a Stackage LTS.

Here is one instance where I think we should accept a good excuse.

For the past three releases of GHC (including minor releases), the package was up-to-date (the testsuite passed) within 1 month of GHC release.

I think this would cause a lot of churn. As things stand now, GHC adoption takes a lot longer. And what if you depend on another silver package that does not update until a month has passed? I like the gist of this, but I'm not sure how it should be formalized into a requirement.

The testsuite is analyzed for code coverage and covers at least 95% of the code in the package.

This is contentious for me, personally. We could get into the debate on usefulness of strict code coverage requirements. I haven't used it much in Haskell, is the tooling good enough where having e.g. doctests, regular tests, and system tests (running a separate application) can be combined into one coverage report (that's where I ran into trouble when trying to measure code coverage in Rust)?

I'm 👍 on the initiative

What is the next step? Wait a while to see if we get more feedback here? Discuss it at the WG meeting? Formalize the proposal?

goldfirere commented 2 years ago

Do you need to meet every requirement? What if you are one checkbox away from leveling up to silver, but you meet several of the requirements for gold? Is it OK to disregard a requirement if you have a good reason? How do we evolve the requirements?

Good questions. Maybe we have a points system? Or no "standards", just lots of badges? I don't have good answers here, but that needn't stop us from proceeding. Maybe one way forward is to allow for a "maintainership statement" describing the particulars of an individual package. (For example, singletons meets most "Gold" requirements above, but we never aim to support more than one GHC release, for good reasons that we could document. So maybe we could apply for "Gold*", where the * means that we have an exception to one or more requirements, as documented in a statement.)
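To make the "statement with exceptions" idea concrete, here is a minimal sketch of how a level could be computed. It is only an illustration: the requirement names, their assignment to levels, the `Package` type, and the asterisk label are all hypothetical, since nothing in this thread fixes them.

```python
from dataclasses import dataclass, field

# The level names come from this thread; the requirement names and their
# assignment to levels are hypothetical placeholders.
TIERS = ["bronze", "silver", "gold"]
REQUIREMENTS = {
    "bronze": {"public_bug_tracker", "license"},
    "silver": {"testsuite", "ci"},
    "gold": {"backup_maintainer", "multiple_ghc_releases"},
}


@dataclass
class Package:
    name: str
    met: set
    # Maintainership statement: requirement -> documented reason for an exception.
    exceptions: dict = field(default_factory=dict)


def highest_tier(pkg):
    """Return (label, waived). The label is the highest level whose requirements,
    and those of every lower level, are met or covered by a documented exception."""
    achieved, waived = None, []
    for tier in TIERS:
        missing = REQUIREMENTS[tier] - pkg.met
        # An unmet requirement without a documented reason stops the climb.
        if any(not pkg.exceptions.get(req) for req in missing):
            break
        waived.extend(sorted(missing))
        achieved = tier
    # Illustrative convention: mark a level reached via exceptions with "*".
    label = f"{achieved}*" if achieved and waived else achieved
    return label, waived


example = Package(
    name="example-pkg",  # hypothetical package
    met={"public_bug_tracker", "license", "testsuite", "ci", "backup_maintainer"},
    exceptions={"multiple_ghc_releases": "tracks only the latest GHC release"},
)
print(highest_tier(example))
```

A points system would replace the strict per-level check with a weighted sum. Either way, the documented reasons would still need a human to judge whether they are acceptable.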

I would like to have documentation on suggested ways to meet these requirements to make it easier for people to join. What's an easy way to set up CI?

haskell-ci

Is there a HF approved CoC that I can copy into my project?

No, but there should be!

I prefer issue trackers, they are public and allow people who are not the maintainer to assist. Having either is fine in my book.

Agreed.

The package includes a license.

A FLOSS license, or can it be all rights reserved?

I'm personally fine with either. The goal in this initiative (for me) is to make it easier for folks to make decisions around adoption, and I don't see the status of the license (in this regard) to be a determining factor.

The package includes a testsuite

Does this test suite need to do anything? Is the intention to make it easy to add tests later on or should it cover some basic functionality?

Good points. This should be added onto.

The testsuite is analyzed for code coverage and covers at least 95% of the code in the package.

This is contentious for me, personally.

I just threw that in. I've never checked for code coverage, nor do I have any particular knowledge about whether code coverage checking is useful in practice. I was just thinking of ways of raising the bar. :)

I'm 👍 on the initiative

Great. :)

gbaz commented 2 years ago

Note that hackage already requires a FLOSS license for upload.

Vis-à-vis "The package's testsuite includes performance tests" -- this is one area where technical work could be of use. Behavior tests are well supported between cabal and the wide ecosystem of testing packages, and CI knows how to run them.

There are libs that make performance testing straightforward. But there's no standard way to declare a performance test stanza and run it in a way that makes sense for performance tests, with a library that integrates well with that spec, such that CI can "run performance tests" or the like. (This is of course complicated by the fact that perf tests are necessarily unstable between different build and run environments, will vary widely on CI systems with shared load, etc.) If we want to encourage people to include perf tests with their packages, we will need some sensible way of making this easy to do in a semi-standard way.

hasufell commented 2 years ago

We were just talking today about legal implications for open-source software and I proposed (jokingly) to add a liability clause to hackage.

Of course, that's not going to happen. But what would indeed be interesting is to have a "stability program" on hackage that signals end-users that this maintainer has agreed to e.g. respond to security bug reports quickly. That can be pretty important for decision making about which library to use in industry settings.

I'd even go so far as to say that hackage should make it easy to fund those maintainers (e.g. via opencollective link or whatnot, or a HF maintained system). That could be an incentive for maintainers to join the programme.


Wrt some details:

The package lists a place to post bugs

A public place. I had an argument with the maintainer of filepath-bytestring, because the only way to report bugs is their private email. I think this is insufficient.

The package includes a testsuite and CI infrastructure.

I disagree with CI infrastructure. It's totally fine for a maintainer to execute the testing manually. CI is about automation and reducing workload. If a maintainer chooses to do this work themselves, they should be free to do so.

The package is included in a Stackage LTS.

I disagree with that requirement. Getting your package into stackage is an entirely different matter and doesn't really signal commitment or quality. You could as well require people to get their software into debian/nixpkgs or similar. I don't think we should.

The package includes a backup maintainer (with email address), who has full technical credentials to take over the project.

I'm a little on the fence with this one. Even if you manage to find a backup maintainer, it's possible they don't actually have the time to take over the project when it becomes necessary. Instead we could require maintainers to allow faster temporary package take-overs through the pre-existing hackage trustees workflow (e.g. if they don't respond within 2-4 weeks), unless they have put a backup maintainer in place.

The testsuite is analyzed for code coverage and covers at least 95% of the code in the package.

Uff 😅

geekosaur commented 2 years ago

I'm afraid to ask how that code coverage requirement applies to something like xmonad-contrib. (Most of which is not even testable without coming up with an xvfb-based test harness and injection and analysis of X events.)

I agree with requiring a public place to post bugs, but a private one should also be provided. You do not want to publicly advertise the 0day you just discovered.

tomjaguarpaw commented 2 years ago

I would caution people against taking Richard's suggestions of certification requirements too literally. I believe they were only meant as examples to start the discussions, not examples with which he wants to move forward on this proposal.

That said, if we treat this as a brainstorming session it could be useful to collect agreements and disagreements with these examples. To that end, I agree with most of @hasufell's disagreements, but I disagree with this disagreement:

The package includes a testsuite and CI infrastructure.

I disagree with CI infrastructure. It's totally fine for a maintainer to execute the testing manually. CI is about automation and reducing workload. If a maintainer chooses to do this work themselves, they should be free to do so.

To me the benefit of public CI is like the benefit of a public bug tracker. If the pass/fail status of the test suite is public it's much easier for everyone to know the status of the library.

hasufell commented 2 years ago

To me the benefit of public CI is like the benefit of a public bug tracker. If the pass/fail status of the test suite is public it's much easier for everyone to know the status of the library.

I'd argue that public CI can and should be handled on hackage. Currently the status is this: https://github.com/haskell/hackage-server/issues/997

geekosaur commented 2 years ago

I'd argue that's asking a lot of hackage, to be honest. CI should be part of the project, and provisioned however that project wishes to do it.

hasufell commented 2 years ago

I think a compromise would be for a maintainer to publish test logs (including the commit/version it was run against) if they don't wish to install CI. I think that is just as good.

goldfirere commented 2 years ago

I would caution people against taking Richard's suggestions of certification requirements too literally. I believe they were only meant as examples to start the discussions, not examples with which he wants to move forward on this proposal.

Yes! The code coverage comment, in particular, was meant to be provocative, encouraging some thought outside of our usual. I have no experience or relevant training to suggest that this, in particular, is a good idea.

About CI in particular: I would want reproducible, public evidence that the testsuite passes. I think CI is the best way of producing and maintaining this evidence. But, as @hasufell suggests, some kind of manually updated public log would, I suppose, meet the requirement. From a community administration point-of-view, I think it would be easier to require CI and note that exceptions can be granted. Having wide standards with lots of ways of satisfying them can be confusing -- having a specific request is easier to understand and for the community to administer. (It may be harder for an individual contributor in a particular circumstance, which is why there should be a route for asking for exceptions.)

hasufell commented 2 years ago

How would we launch this? I feel waiting for proper hackage support may be an indefinite blocker. HF could maybe maintain such a list internally and publish it somewhere on HF site?

goldfirere commented 2 years ago

I think a v0 launch as just a page on our site is reasonable. There is certainly room for improvement, but such a page would already bring value. Someone would still have to come up with the set of standards and then design a process for packages to apply and then be vetted. Once that starts happening, there will be more incentive for Hackage to catch up.

telser commented 2 years ago

I would caution people against taking Richard's suggestions of certification requirements too literally. I believe they were only meant as examples to start the discussions, not examples with which he wants to move forward on this proposal.

Yes! The code coverage comment, in particular, was meant to be provocative, encouraging some thought outside of our usual. I have no experience or relevant training to suggest that this, in particular, is a good idea.

There are certainly challenges here and the xmonad example is a great highlight. I can imagine there being value in being able to say a particular package has a "good" test suite, but the definition of "good" seems highly variable.

About CI in particular: I would want reproducible, public evidence that the testsuite passes. I think CI is the best way of producing and maintaining this evidence. But, as @hasufell suggests, some kind of manually updated public log would, I suppose, meet the requirement. From a community administration point-of-view, I think it would be easier to require CI and note that exceptions can be granted. Having wide standards with lots of ways of satisfying them can be confusing -- having a specific request is easier to understand and for the community to administer. (It may be harder for an individual contributor in a particular circumstance, which is why there should be a route for asking for exceptions.)

I would go even further, personally, and say multiplatform CI as the default so we avoid some of the odd breakages that happen from time to time on different systems. Also, I'd be strongly in favor of the ability to get an exception.

@goldfirere We brought this up at the last stability meeting and concluded that bringing the discussion to the wider community would be valuable. Would you mind posting this to Discourse for further discussion?

hasufell commented 2 years ago

I would go even further, personally, and say multiplatform CI as the default so we avoid some of the odd breakages that happen from time to time on different systems.

For such demands, the HF should also have technical documentation on how to achieve that. Since I've done a lot of those myself wrt ghcup, HLS, cabal... I can safely say it isn't trivial (e.g. testing aarch64 M1).

Would you mind posting this to Discourse for further discussion?

I'm a bit worried it will end up in bikeshedding, as we partly have here (constructively). I'd propose instead to flesh out the details of the tiers beforehand.

telser commented 2 years ago

I would go even further, personally, and say multiplatform CI as the default so we avoid some of the odd breakages that happen from time to time on different systems.

For such demands, the HF should also have technical documentation on how to achieve that. Since I've done a lot of those myself wrt ghcup, HLS, cabal... I can safely say it isn't trivial (e.g. testing aarch64 M1).

I agree, and I would love to see the hackage CI be able to offer that service to the community. My thought here is if some of the initial pain can be abstracted away we could get more consistent testing and hopefully better end user experiences.

Would you mind posting this to Discourse for further discussion?

I'm a bit worried it will end up in bikeshedding, as we partly have here. I'd propose instead to flesh out the details of the tiers beforehand.

To play devil's advocate a bit, might we not get a similar kind of pushback with the tiers anyway?

goldfirere commented 2 years ago

Would you mind posting this to Discourse for further discussion?

Sadly, I would prefer to not do this. I am already stretched thin in my Haskell leadership responsibilities, and my hope in posting this was to present an idea for the stability committee to consider. If you think it's a good idea, I encourage you to run with it (no attribution needed!) in the way you all see fit, but I'm not currently up for creating and managing this wide-ranging discussion.

(I am explicitly agnostic on the best way forward here. Maybe Discourse is a good idea; maybe it isn't. Please don't read my comments above as suggesting that this shouldn't be posted there. They're just saying I don't wish to post there, due to lack of time.)

telser commented 2 years ago

Would you mind posting this to Discourse for further discussion?

Sadly, I would prefer to not do this. I am already stretched thin in my Haskell leadership responsibilities, and my hope in posting this was to present an idea for the stability committee to consider. If you think it's a good idea, I encourage you to run with it (no attribution needed!) in the way you all see fit, but I'm not currently up for creating and managing this wide-ranging discussion.

Absolutely understand being time limited. I'll bring it back to the committee to see how the group wishes to proceed, which includes taking the feedback from @hasufell on fleshing it out here more first.

(I am explicitly agnostic on the best way forward here. Maybe Discourse is a good idea; maybe it isn't. Please don't read my comments above as suggesting that this shouldn't be posted there. They're just saying I don't wish to post there, due to lack of time.)

No matter the path forward, I think the discussion is healthy and very much appreciate these kinds of ideas being brought forward. So thank you @goldfirere

telser commented 2 years ago

@hasufell I apologize for not responding directly here sooner. We came to the conclusion that we are bandwidth limited for this right now. I think the conversation here has been great, and for anyone interested we should try to continue it here. Or if someone is available to really drive this forward, that would be fantastic!

chshersh commented 2 years ago

As a person who maintained 40+ packages over the last 4 years and learned tons of maintainership best-practices, I can offer some feedback on this subject and I hope it can help to move this issue forward.

Specifically, the most important question here is:

What is the purpose of establishing maintainership standards?

I see the proposed list and the discussion around it as a set of requirements for maintainers. But Haskell libraries are maintained by volunteers in their free time. You can't ask them to do more unless you provide some incentive for following established standards, for example in the form of funding.

4 years ago I was annoyed with people not writing migration guides in their Haskell packages. 4 years later I'm now "Well, duh, those people do this in their free time. Thank them for sacrificing their time and doing at least something. They don't have to do this at all!"

So, any initiative that asks people to do more without providing a clear motivation (and ideally a good motivation so people would want to do this extra work) is doomed. Volunteers can do a lot using their enthusiasm. But enthusiasm tends to end pretty quickly, especially with no encouragement or monetary support. We don't even have an official, comprehensible step-by-step guide with screenshots on how to publish a package to Hackage. What maintainership standards are we even talking about? 😒


On a more technical subject, I suggest coming up with a list of things that can be verified automatically, without involving people to evaluate packages. If we don't have money or strong reasons to encourage people to maintain libraries, we won't have ways to motivate experts to evaluate packages.

Most of the good things we have in Haskell nowadays were created by passionate people who contributed good stuff before they burnt out. Only a few people are working professionally on improving the Haskell ecosystem, and not on every aspect of it.

As a starting point, I could recommend writing instructions on how to improve the maintainability of packages to people who are willing to spend more time. This could be a checklist of things people can do to increase the overall maintainability and quality of their packages with instructions on how to do it.

From my experience, I can come up with the following list (keeping in mind that these things can be checked automatically):

Later, tooling can be developed to help maintainers. Previously, we had an idea for developing the Cabal linter (that could help with checking some of the fields above):

hasufell commented 2 years ago

So, any initiative that asks people to do more without providing a clear motivation (and ideally a good motivation so people would want to do this extra work) is doomed.

I think for now the only motivation that can be provided is the free advertisement. I'd argue that for a lot of packages, it might not even be that much extra work.

To me, the most important part of the maintainership standard is to steer maintainers into getting backup maintainers on board. The rest should be light formal requirements that can be easily achieved.

Having backup maintainers helps them and everyone else, regardless of HF endorsement. So it's kind of nudging people into thinking about their project state past their own time and commitment capacities.

For backup maintainers, the motivation would then be to be a listed maintainer for an endorsed library. Kinda cool, isn't it?

I mean yeah... these are all social motivations. But IMO, network effects can be quite powerful. Of course, they won't write a new compiler for you. But we're not expecting that. I doubt that throwing money at this problem is going to help. It's not a research problem, nor a difficult engineering task. It's just that extra mile.

However, the HF is also working on programs to fund ecosystem efforts afair, but I don't know the state of these matters. My guess would be that libraries with high maintainer standards may do better in such a funding selection process, too.

gbaz commented 2 years ago

I think that cabal would welcome a PR adding linting -- perhaps by adding another "level" to the existing "cabal check" beyond warn and error for linting notices, and a flag to turn them on.

(and hackage could certainly add linting notices somewhere in its package display, off to the side behind a link perhaps)

noughtmare commented 2 years ago

I think the most important purpose of these maintainership standards would be as a quality mark so that users can quickly tell what to expect when considering using a package. I don't think that per se requires additional motivation for package authors.

goldfirere commented 2 years ago

@hasufell:

To me, the most important part of the maintainership standard is to steer maintainers into getting backup maintainers on board. The rest should be light formal requirements that can be easily achieved.

@noughtmare:

I think the most important purpose of these maintainership standards would be as a quality mark so that users can quickly tell what to expect when considering using a package.

These capture my thoughts exactly. This is about making agreed-upon, optional standards authors could choose to subscribe to, with the ability for potential users to see whether a package has met the standards.

bergmark commented 2 years ago

On a more technical subject, I suggest coming up with a list of things that can be verified automatically, without involving people to evaluate packages.

I think this could be a great way to drive this forward. Some of my concerns with bootstrapping this were "what do we check for?", "how do we check for it?", and "who will do it?". Perhaps we can pick a few things that are easy to automate, and I imagine it would be considerably less work to get something shipped. E.g. "There will be a star next to your package in hackage listings if you have two listed maintainers" seems achievable with limited effort.
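As a rough idea of how small such an automated check could be, here is a sketch that counts the entries in a package's `maintainer:` field. The function names and the comma-separated convention are assumptions for illustration. Cabal treats `maintainer` as free-form text, and this sketch ignores continuation lines.

```python
import re


def listed_maintainers(cabal_text):
    """Return the comma-separated entries of the `maintainer:` field.

    Assumes the field fits on one line and that entries are separated by
    commas; neither is guaranteed by the .cabal format.
    """
    match = re.search(
        r"^maintainer:[ \t]*(.*)$",
        cabal_text,
        flags=re.IGNORECASE | re.MULTILINE,
    )
    if not match:
        return []
    return [entry.strip() for entry in match.group(1).split(",") if entry.strip()]


def earns_star(cabal_text, minimum=2):
    """Hypothetical rule: a star requires at least `minimum` listed maintainers."""
    return len(listed_maintainers(cabal_text)) >= minimum


sample = "name: demo\nmaintainer: alice@example.org, bob@example.org\n"
print(listed_maintainers(sample), earns_star(sample))
```

Under this rule, a package that lists a single shared address would not earn the star, so the rule itself would need care even though the check is easy.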

chshersh commented 2 years ago

if you have two listed maintainers

This can be impossible to implement because some packages might not be maintained by specific people but by the entire organization instead. Therefore, they have only one listed maintainer — the org.

But I agree with the rest. We can pick up some set of things that are (relatively) easy to check automatically and provide a badge for packages that adopted "maintainership standards". Not making adherence to the maintainership standards a requirement that gatekeeps people from something, but instead providing motivation in the form of a badge, will go a long way 👍🏻

I guess, at this point, the open questions are: