NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License
17.58k stars 13.74k forks source link

How do we avoid issue & PR rot? (was: [Nix-dev] Too many open issues) #17407

Closed obadz closed 7 years ago

obadz commented 8 years ago

We've had a good discussion on nix-dev on how to make sure issues and PRs don't rot, but no definite resolution came out of it, and now the thread seems to have gone quiet. (It can be found at http://lists.science.uu.nl/pipermail/nix-dev/2016-July/thread.html in the thread called "Too many open issues".)

While it's clear that @edolstra has to make the final call on any process that gets put in place, it can't hurt to lay out the options and take a quick contributor poll.

In that spirit, I propose that everyone who has a proposal to make lays it out as a comment of this issue and all contributors should feel free to express their view with a :+1: or :-1: on each of the proposals (and also write comments of course). Proposals aren't necessarily mutually exclusive.

I will also update this top message with an index all the proposals in case they get diluted in comments:

cc @wmertens @ocharles @CMCDragonkai @domenkozar @zimbatm @DamienCassou @spinus @kevincox @rasendubi @grahamc @matthiasbeyer @layus @Azulinho @bennofs @Profpatsch @kampfschlaefer @deepfire @vcunat @sjmackenzie

sjmackenzie commented 8 years ago

Proposal: Adopt C4 completely, no differentiation between core and community: http://rfc.zeromq.org/spec:42/C4/

obadz commented 8 years ago

Proposal: Use the assignee as a triaging tool and to achieve de-duplication of effort

Assignment in this context would mean: "I think you know more about this than I do, feel free to pass it on to someone else if aren't the right person or can't handle this with the appropriate urgency", and not "I am asking you to do more work".

Triaging would then become relatively easy: go through issues & PRs that do not have an assignee and use @mentionbot's output, as well as git log/blame and personal knowledge to come up with a best guess assignee. Assignees will certainly be better placed to further refine that guess…

Occasionally we would have to go through issues & PRs that were assigned but where the assignee has become unresponsive and re-triage those.

Another advantage is that looking through issues & PRs becomes very easy: either I'm triaging and querying for no:assignee or I'm looking at PRs & issues where I'm the assignee.

One disadvantage is that experienced contributors will end up with a disproportionately large share of PRs/issues. That is however mitigated by the fact that they can assign back to a less experienced contributors, while offering mentorship on how to resolve if the issue at hand is truly tricky.

For instance, if I assign a PR to @vcunat because I'm uncomfortable tackling it and he assigns it back to me, I should then feel much less timid about reviewing & merging it because a senior contributor has opined that I'm reasonably unlikely to damage things.

bennofs commented 8 years ago

Repeating what I've said on the mailing list, I don't like @obadz proposal because it makes a core function of assigning unusable: I can no longer search for un-assigned issues to find issues that need somebody to work on, because many issues will be assigned even when the one assigned to the issue is not actively working on it.

I'm also not entirely happy with the idea of pushing somebody to work on a particular issue (assigning them), instead of the usual voluntary way of somebody taking that issue on their own.

Profpatsch commented 8 years ago

About C4: It requires that each patch/proposal needs to open an issue before implementing. I think that could even worsen our current issue number situation (and add a lot of stale ones that are never followed through).

sjmackenzie commented 8 years ago

@Profpatsch Not so sir

If the Platform implements pull requests as issues, a Contributor MAY directly send a pull request without logging a separate issue. Point number 10 of Development Process (self plug, that point was my sole contribution to the C4 process :-) )

obadz commented 8 years ago

@sjmackenzie, so it would worsen the PRs but not necessarily the issues. Still quite onerous for a project like nixpkgs.

sjmackenzie commented 8 years ago

How so? PRs are merged quickly.

obadz commented 8 years ago

PR rot is part of what we're discussing here. Many commits now go straight into the repo without going through a PR. If you cut that flow and force 1 PR per commit, it's going to make that flow even harder to tame…

sjmackenzie commented 8 years ago

@obadz * Maintainers SHALL NOT make value judgments on correct patches. * Maintainers SHALL merge correct patches from other Contributors rapidly. * Maintainers MAY merge incorrect patches from other Contributors with the goals of (a) ending fruitless discussions, (b) capturing toxic patches in the historical record, (c) engaging with the Contributor on improving their patch quality.

No it's not necessarily 1 commit per PR.

obadz commented 8 years ago

@bennofs, I hear you, it's definitely not a silver bullet.

I don't think the issue you described would get much worse with the proposed change than in the status quo though: currently only 8% of open issues have an assignee. And you would still be able to swoop in on an assigned issue, assign it to yourself and work on it just like you do now…

obadz commented 8 years ago

@sjmackenzie still the assumption here is that there is no shortage of maintainer time to assess correctness and merge all those PRs.

sjmackenzie commented 8 years ago

@obadz There won't be a shortage:
* A new Contributor who makes correct patches, who clearly understands the project goals, and the process SHOULD be invited to become a Maintainer.

Also, patch correctness is dictated by the 2.3 Patch Requirements section. Value judgements and code correctness is not patch correctness. Maintainers only look for patch correctness. As such, incorrect code is corrected with a new patch. The code base becomes a kind of wiki, a concept which has already been proven by Wikipedia.

In other words, a maintainer can at a glance determine whether a patch correctness is satisfied and can hit the merge button within seconds. Hence PR staleness comes down to a fraction of what it currently is.

Profpatsch commented 8 years ago

Maintainers only look for patch correctness. As such, incorrect code is corrected with a new patch. The code base becomes a kind of wiki, a concept which has already been proven by Wikipedia.

This also means we need a second process for ensuring code and functionality correctness for releases.

sjmackenzie commented 8 years ago

This model tends towards master being release worthy. In essense after the community become accustomed to this process one could make NixOS a rolling release, similar to what Arch Linux is. CI would also play a role in this, whether it means making changes to hydra or using travis... or implementing incremental compilation.

7c6f434c commented 8 years ago

@sjmackenzie

There will be a shortage of maintainers anyway, as I have advised some people to ask for maintainer rights and they refused to; I have no reason to expect this will diminish.

Patch correctness is not trivial to judge, because for a glibc upgrade checking 2.3.6 takes a lot of time.

Given the rebuild time/space requirements, there are and there will be value judgements in applying patches to master or staging/glibc-upgrade/whatever.

C4:2.3.7 is a net negative for NixPkgs in the MUST form (because we have some package upgrades with minor fixes)

In general. I think C4 implicitly assumes a project with a build/test time under an hour.

sjmackenzie commented 8 years ago

Yes, I recall our long private email thread on the subject. Things have only gotten worse since then. So I will copy and paste the patch requirements here so everyone is clear what a correct patch means:

2.3.1) Maintainers and Contributors MUST have a Platform account and SHOULD use their real names or a well-known alias.
2.3.2) A patch SHOULD be a minimal and accurate answer to exactly one identified and agreed problem.
2.3.3) A patch MUST adhere to the code style guidelines of the project if these are defined.
2.3.4) A patch MUST adhere to the "Evolution of Public Contracts" guidelines defined below.
2.3.5) A patch SHALL NOT include non-trivial code from other projects unless the Contributor is the original author of that code.
2.3.6) A patch MUST compile cleanly and pass project self-tests on at least the principle target platform.
2.3.7) A patch commit message MUST consist of a single short (less than 50 characters) line stating the problem ("Problem: ...") being solved, followed by a blank line and then the proposed solution ("Solution: ...").
2.3.8) A "Correct Patch" is one that satisfies the above requirements.

We change the building process so that we can get it under an hour for most cases. Incremental Compilation is a possible solution. Zipf's law must be factored into this, we cannot just discard a more efficient process just because a change to glibc will blow the build time. Yes there will be scenarios where low level changes have huge knockon effects. People will learn and facepalm and get better over time.

2.3.7) a minor fix comes from a problem... which is still a problem so I don't see a net negative. Besides, it's super helpful to see what the problem is than read some message about what changed, by far I'd like to see why it needed to change.

I don't think there would be a shortage of maintainers if C4 was adopted, as their responsibilities are to check the above points, which is trivial compared to today's process. The current process grinds maintainers down, makes them a bottle neck and burns them out. Who wants to be a maintainer in such a system?

lucabrunox commented 8 years ago

I don't like any of the two. Too complicated and we are not arch in terms of people time. Also most of PR and issues out there are either old or hard to apply, or the proposer didn't push enough for attention.

The real issue is lack of testing coverage. If all combinations were automatically tested, we wouldn't need so much time per PR. A process that works for an open source software doesn't necessarily work for an OS.

obadz commented 8 years ago

@lethalman,

issues out there are either old or hard to apply

I'd say rather than either old or hard, it's more likely that they are old and hard.

I don't like any of the two

Do you have a different idea in mind? Or simply favor the status quo?

obadz commented 8 years ago

The real issue is lack of testing coverage. If all combinations were automatically tested, we wouldn't need so much time per PR.

It would be amazing if nix contributors could invoke Hydra inside a PR by comment something like @Hydra, build this + release tests + testXYZ1 + testXYZ2 with Hydra responding with a URL containing the queued builds, the closure of what needs to get rebuilt, and the test results.

sjmackenzie commented 8 years ago

@lethalman poppycock

sappo commented 8 years ago

Hey Guys, I'm one of the maintainers of zeromq and I have been using C4 with great success since almost 3 years. Here are some comments that did come to my mind.

@Profpatsch:

This also means we need a second process for ensuring code and functionality correctness for releases.

First I'd like to say that C4 those not create any kind of overhead or a new process indeed it's self-governing and will eliminate those processes.

@obadz:

@sjmackenzie still the assumption here is that there is no shortage of maintainer time to assess correctness and merge all those PRs.

C4 ensures that each patch defines one clear and agreed problem, and one clear, minimal, plausible solution. So commits are usually rather minimal in term of changes. This also means that a PR can consist of a couple or more commits. If someone comes with a big clumsy patch though we will deny it as it is not possible to asses correctness or rather we will ask him nicely to follow C4.

@7c6f434c of course there are exceptions i.e. upgrading glibc which is a big clumsy patche and needs to be assessed carefully but you don't upgrade glibc any other day.

@7c6f434c :

There will be a shortage of maintainers anyway, as I have advised some people to ask for maintainer rights and they refused to; I have no reason to expect this will diminish.

C4 will allow you to promote people to become maintainers therefore removing the barrier of asking and risking the humiliation by getting publicly turned down. In fact that's the reason why I did become a maintainer in the first place. Of course most people you promote wont become full time maintainers. They will perhaps only comment and merge PRs they care about or vanish for a couple of month before doing any maintaining activities. But still in general you will end up with more maintainers. Those maintainers you promote and who'll never become active you can safely remove after some time. Though we never have done this.

@7c6f434c:

In general. I think C4 implicitly assumes a project with a build/test time under an hour.

CI is an important part of C4 as it allows us to identify wrong patches. There are two ways to deal with correct patches that wont pass CI. First ask contributer to fix. Second if contributer does not respond revert which is like two clicks with github. Out of experience I can say that the second option hardly ever occurs.

In terms of build/test time, sure the faster the better but if it takes longer than an hour it shouldn't be a problem it will however delay your responsiveness. But that is probably no different right now.

Hopefully those comments are useful to you. I'm also happy to answer any further of your questions or concerns!

Cheers, Kevin

7c6f434c commented 8 years ago

@7c6f434c of course there are exceptions i.e. upgrading glibc which is a big clumsy patche and needs to be assessed carefully but you don't upgrade glibc any other day.

The problem with glibc updates is that you can upgrade it with a nice small patch, and many things will still work. Some will randomly break, though.

There are many libraries and tools with a similar problem but with a smaller area of impact.

@7c6f434c :

There will be a shortage of maintainers anyway, as I have advised some people to ask for maintainer rights and they refused to; I have no reason to expect this will diminish.

C4 will allow you to promote people to become maintainers therefore removing the barrier of asking and risking the humiliation by getting publicly turned down.

Erm. You do not need to ask publically, and the people who I have asked to apply for maintainership were always quite a bit above of the cutoff for granting maintainership. Which could be easily verified by the recent history of such requests (I did mention that).

You cannot give someone maintainership without the person accepting it…

In general. I think C4 implicitly assumes a project with a build/test time under an hour.

CI is an important part of C4 as it allows us to identify wrong patches. There are two ways to deal with correct patches that wont pass CI. First ask contributer to fix. Second if contributer does not respond revert which is like two clicks with github. Out of experience I can say that the second option hardly ever occurs.

There is also no correct notion of «correct» patch in NixPkgs: a library upgrade can fix one things and break other ones and this often a value tradeoff. Keeping both versions sometimes causes other problems.

In terms of build/test time, sure the faster the better but if it takes longer than an hour it shouldn't be a problem it will however delay your responsiveness. But that is probably no different right now.

As far as I understand, right there are not enough resources to build all the PRs with the rate they come in without judgement-based aggregation (even if we consider only the throughput).

sjmackenzie commented 8 years ago

So can you add or modify C4 to factor in what a correct nixpkg patch is? i.e. using point form following a similar style to the C4 language.

deepfire commented 8 years ago

In terms of build/test time, sure the faster the better but if it takes longer than an hour it shouldn't be a problem it will however delay your responsiveness. But that is probably no different right now.

As far as I understand, right there are not enough resources to build all the PRs with the rate they come in without judgement-based aggregation (even if we consider only the throughput).

Maybe we should focus on this technical problem separately?

Like making PR testing builds distributed, for example?

zimbatm commented 8 years ago

We could certainly try C4 for a while and then fix what's not working for us. Aside from the specific bullet points I can see a huge value in having a clearly and well defined contribution process. If this gets faster reaction on issues and PRs it's going to snowball into having more contributors.

Another thing that needs to be addressed is the amount of work involved in triaging issues. Certainly the submitter should be able to point to a package or module name. Why is it our job to do that? We have meta.maintainers but they're useless because there is nothing to connect issues to the maintainers. We need a bot that can extract that info from the issue template and @mention the right maintainer.

Profpatsch commented 8 years ago

There will be a shortage of maintainers anyway, as I have advised some people to ask for maintainer rights and they refused to; I have no reason to expect this will diminish.

fyi there are a few people actively asking to be maintainers right now. https://github.com/NixOS/nixpkgs/issues/17379 :P

Profpatsch commented 8 years ago

The real issue is lack of testing coverage. If all combinations were automatically tested, we wouldn't need so much time per PR.

It would be amazing if nix contributors could invoke Hydra inside a PR by comment something like @Hydra, build this + release tests + testXYZ1 + testXYZ2 with Hydra responding with a URL containing the queued builds, the closure of what needs to get rebuilt, and the test results.

I agree that better tooling (shouldn’t hydra integration into Github have been there months ago) would go a long way in making PRs easier. But until that I can’t see how we could do C4 without horribly breaking master a lot of times.

CI is an important part of C4 as it allows us to identify wrong patches. There are two ways to deal with correct patches that wont pass CI. First ask contributer to fix. Second if contributer does not respond revert which is like two clicks with github. Out of experience I can say that the second option hardly ever occurs.

Since we don’t have any CI for PRs at the moment (Travis doesn’t work, nope), this won’t work. But of course it is possible.

wmertens commented 8 years ago

I vote for simply trying it for 3 months and seeing how it went. It could be terrible, but I have good hopes that it will work well, and that our tooling will improve because of the increased need for it.

zimbatm commented 8 years ago

I suspect nothing is going to happen until @edolstra chimes in.

ericsagnes commented 8 years ago

The number of PR increased recently, mainly due to the very close release of 16.09. Of course a very high number of PR get merged, but the total number of PR tends to increase and its very easy for non-trivial PRs to get lost in the flow and rot.

Methods are indeed nice, but maybe there is maybe something to do on the human level to improve the situation until a consensus for the method is found. It is such a waste to see interesting PRs sinking in the abyss of the PR list, and it is also probably quite a motivation loss for the contributor that opened the PRs.

https://github.com/NixOS/nix/pull/329#issuecomment-158507518 is an example of what we should try to avoid at all costs.

Here a few propositions that I hope are not controversial:

To make it short, there could be actions to increase the number of PR reviewers/mergers so it can fits the number of contributions.

zimbatm commented 8 years ago

Add documents explaining how to review PRs and how to become a PR merger.

Good idea. Right now it happens by osmosis and is probably not uniform. Documenting that out would go a long way to improve the consistency of reports and user expectation.

7c6f434c commented 8 years ago

@zimbatm From time to time I run vanity.sh, select people with tens and tens of accepted commits in master, diff this with the NixOS/NixPkgs team list at GitHub and spam people with invitations… This is aimed at improving the uniformity («judging from the project's current traditions, if you ask for commit rights now, you will probably get them»).

But it takes some time to review PR merge history of tens of people (I filter by commit count, then look at PR history). @zimbatm do you want to join and also do the mass mailing from time to time?

kevincox commented 8 years ago

On 26/09/16 17:39, Michael Raskin wrote:

@zimbatm https://github.com/zimbatm From time to time I run |vanity.sh|, select people with tens and tens of accepted commits in master, diff this with the NixOS/NixPkgs team list at GitHub and spam people with invitations… This is aimed at improving the uniformity («judging from the project's current traditions, if you ask for commit rights now, you will probably get them»).

But it takes some time to review PR merge history of tens of people (I filter by commit count, then look at PR history). @zimbatm https://github.com/zimbatm do you want to join and also do the mass mailing from time to time?

Would it be beneficial to have some sort of formal application process? Discovery might be hard but I think it would make reviewing the possible candidates easier. For example I can imagine something like this (numbers can be debate upon):

---------8<---------------------

To become a nixpkgs reviewer you must have the following:

Your expectations as a contributor:

If you fit these three requirements and agree to the responsibilities please file a ticket in the nixos/nixpkgs repository with the title "[reviewer-application] $your_name" and tag @kevincox and [insert two other "approvers" here].

---------8<---------------------

Put it in a file in the nixpkgs root so that it can be modified in a trackable way (and the changes be debated on). And then hopefully you can focus on reviewing the applicants which should have a higher payoff rate then whoever is picked up by the script.

Assuming that enough people are interested in applying themselves this is probably easier then trolling through all contributors to see who would accept the privilege.

Cheers, Kevin

7c6f434c commented 8 years ago

Put it in a file in the nixpkgs root so that it can be modified in a trackable way (and the changes be debated on). And then hopefully you can focus on reviewing the applicants which should have a higher payoff rate then whoever is picked up by the script.

It's complicated. The script picks up almost correctly, and a committer having obviously spent some effort looking through the list can work creates a small expectation of people replying to the email. Which improves conversion rate…

Also, I would say that the threshold to my reachout email is significantly higher than the threshold of getting commit rights by contributor's own initiative. I do mention this fact in the emails.

Assuming that enough people are interested in applying themselves this

Oh well, I do not think that enough people are interested enough to get around to making an application, even if the rules get codified.

Of course it is just my estimate.

groxxda commented 8 years ago

The lack of an application process is the reason I haven't applied yet. Also note that code changes are only one part of contributing. Mailing-list and GitHub activity should also be considered - while the former does not need GitHub privileges, the latter profits from privileges (for example adding labels)

kevincox commented 8 years ago

@7c6f434c

It's complicated. The script picks up almost correctly, and a committer having obviously spent some effort looking through the list can work creates a small expectation of people replying to the email. Which improves conversion rate…

Although the person who accepts due to pity/opportunity is probably less likely to actually do useful work. Where as an application process would theoretically be people who are more apt to actually do the reviewing.

Oh well, I do not think that enough people are interested enough to get around to making an application, even if the rules get codified.

We don't necessarily need "enough" people. Just get a couple of key sign-offs and no major objections. It doesn't have to be perfect as it can always be amended.

TBH I would also like a guide of how I am expected to behave as a NixOS member 😄

@groxxda

The lack of an application process is the reason I haven't applied yet.

Thanks for the feedback.

Also note that code changes are only one part of contributing. Mailing-list and GitHub activity should also be considered.

I think this is true but also harder to define. I would also be very open to recommendations. I see comments and reviews of PRs to be especially useful. In fact they should probably be part of the base "coder" requirements.

Other notes.

I'm not a physiologist and don't know what the best way to get reviewers is but I think it would be very valuable to raise the count. This could honestly be as simple as a "code word" that is commented on a PR when a "reviewer" things that it is ready to be merged. I would be glad to perform that search every couple of days and merge those PRs.

7c6f434c commented 8 years ago

Although the person who accepts due to pity/opportunity is probably less likely to actually do useful work. >Where as an application process would theoretically be people who are more apt to actually do the reviewing.

Well, it still goes to people who actually contribute anyway, so the point is that they can commit more productively. They usually do.

It's not about pity, it is about a single non-reactive action that can cause many reactions, and about some effort from a committer like me convincing people that their chances are high.

re: GitHub labels: there are some projects to get a bot to label issues according to suggestions from a wide set of reviewers.

@domenkozar @rbvermaa I am not ready to do a full list of recommendations for commit access right now (I have a technical one, but I need to review it at some point), but giving @groxxda commit access is a very good idea in my opinion and he has just expressed the wish for that. Could you please grant him committer access?

kevincox commented 8 years ago

@groxxda commit access is a very good idea in my opinion and he has just expressed the wish for that. Could you please grant him committer access?

+:100:

zimbatm commented 8 years ago

@7c6f434c sure but I don't have enough rights to add people if they accept.

@ericsagnes has started a PR to document some of the reviewing practices which is great. The goal is to reflect what we are doing today, not necessarily what's ideal (so we won't get into arguments (for now ;-))).

7c6f434c commented 8 years ago

@7c6f434c sure but I don't have enough rights to add people if they accept.

Neither do I — I just encourage people to answer to me and Rob Vermaas (since recently Domen Kožar can also give access) if they want commit access.

When you have a list of people who have contributed significantly and have expressed the wish, convincing the project admins is the easier part.

zimbatm commented 8 years ago

thanks, it's good to have a list of the admins. shall we take this discussion somewhere else? feel free to write to zimbatm@zimbatm.com so we can coordinate and not send duplicate invitations.

7c6f434c commented 8 years ago

@domenkozar @rbvermaa Would you mind to give @groxxda commit access?

grahamc commented 7 years ago

(triage) what do we think?

We have 1238 open issues and 286 open PRs

We have grown by ~170 issues in the last 6 months We hang around at 286 PRs regularly still (not growing much)