Closed FighterNan closed 1 year ago
The technical-oversight-forum has talked about this issue at length, so if the request is "please talk about this" I think we can say that we already have, and certain proposals have already been tried to make this incrementally better for folks, and if we look at the contributor data, we've made progress on making this better in the last year. We have specifically discussed phosphor-user-manager, and we agree, there's some action that needs to be taken there.
I'm struggling a little bit, because the request above seems like mostly commentary, and isn't proposing something that you would like a decision made on. If you'd like the TOF to make the statement that "things could be better", I think the whole TOF is likely in agreement that someone could certainly improve things, but it's difficult to have the vote you're proposing without a concrete proposal being made. If you have policy changes that you think will make maintainers more likely to respond, please propose them, ideally starting with a review, then if the changes you're proposing are contentious, the TOF is happy to help up/down vote to remove the contention, but as written, I don't see anything requesting an action other than discussion, which anyone is free to start.
Some offhand ideas that I've personally thought about before to try to make this better:
Do you think any of the above would help?
I think the first step is to get consensus on what we think is the maximum latency for feedback across the project. Once we have that in place, it's a small step to accept that (regularly) failing to meet these latency requirements means that there's now a failure of consensus as to who the maintainers are for the repository at hand. One of the roles of the TOF is to handle failed consensus, and I think we can leverage that to define a new process for handling transfer of maintainership.
I think this would be helped by defining different maintenance states for repositories. Linux, for instance, uses the following set of states:
S: *Status*, one of the following:
Supported: Someone is actually paid to look after this.
Maintained: Someone actually looks after it.
Odd Fixes: It has a maintainer but they don't have time to do
much other than throw the odd patch in. See below..
Orphan: No current maintainer [but maybe you could take the
role as you write your new code].
Obsolete: Old code. Something tagged obsolete generally means
it has been replaced by a better system and you
should be using that.
Maybe we don't need all of these. Addressing the issue at hand, a subset as small as { Maintained, Orphan, Obsolete }
is probably enough. With the TOF in the picture, (enough, N
) feedback deadline misses trigger the transfer of maintainership from the current set of maintainers to the TOF. The TOF does not necessarily maintain the code directly, but exists to accept a new set of maintainers for the repository.
stateDiagram-v2
[*] --> Maintained: Create project repository
Maintained --> Orphan: N feedback deadline misses
Maintained --> Obsolete: Project superseded
Orphan --> Maintained: TOF selects new project maintainer(s)
Orphan --> Obsolete: TOF finds no suitable candidates for project maintenance
Obsolete --> [*]
I don't think it's reasonable that one deadline miss triggers this process, life happens. We'd probably want to come up with some number (N
) that people agree substantiates maintainers not performing their role. I also think it's reasonable that if provided with notice of absence (and possibly temporary handover of duties) that the process could be made with more judgement applied than rules. If maintainers are communicating that well then they're unlikely to run into problems anyway. Having more than one maintainer for a project mitigates this problem, for the most part.
Further I suggest this proposal extends to the complete set of maintainers for a repository. Feedback deadlines are only missed if no maintainer for the project provides feedback inside the feedback period. All maintainers are removed in the event of N
deadline misses as it must be the case that they are all unresponsive. This feels intuitive, and provides a way maintainers can continue to sort out who's active and who's not among themselves unless they all become inactive, and therefore are a limiting factor for the project by their combined unresponsiveness.
I expect that the TOF would elect people actively pushing patches to become the new maintainers for the project at hand.
We'd probably want to come up with some number (N) that people agree substantiates maintainers not performing their role.
We may want some ratio and/or misses per time-unit. I know even some of the highly active maintainers miss things or have a misunderstanding on next-steps.
Is this something we'd attempt to pull from Gerrit data or via some sort of reporting system?
All maintainers are removed in the event of N deadline misses as it must be the case that they are all unresponsive. This feels intuitive, and provides a way maintainers can continue to sort out who's active and who's not among themselves unless they all become inactive, and therefore are a limiting factor for the project by their combined unresponsiveness.
A minor contrarian side here is that I think some maintainers effectively have their names on things in emeritus status because remaining maintainer wants to deal with the people-side of pulling their name off.
Would we want to view these deadline misses / maintainer removal per-person? As in, if a person becomes inactive they are removed from maintainership everywhere? I say this because my feeling is If they don't intend to support some code they should be more proactive at indicating this, and my experience says that someone who is unresponsive on 1 repository isn't likely to be responsive on a different one (except in a few cases when the responded-to one is related specifically to something only their company cares about).
We may want some ratio and/or misses per time-unit. I know even some of the highly active maintainers miss things or have a misunderstanding on next-steps.
Yeah, definitely. I tend to practice complaint-driven review :sweat_smile: which would probably fall afoul of all this. I'm happy for people to hassle me in DMs or by @-ing me on Discord on reasonable timeframes (a week or two of no feedback).
Is this something we'd attempt to pull from Gerrit data or via some sort of reporting system?
Just to clarify, I think we should have a documented process such as what I suggested, but that it is really a process of last resort. We should work to exhaust all the usual social/communication avenues first and make sure that maintainers at risk cannot claim that it was a surprise. Contributors should try to work with them in a proactive way to step up and take on maintenance responsibility if they feel there's not enough progress, before we get the point of needing to invoke something like this.
I feel like mining gerrit data should only be resorted to if we want to substantiate a (soft) claim that repositories are being neglected. It should be complaint-driven, effectively, not some cron job that acts as a maintenance cop.
As in, if a person becomes inactive they are removed from maintainership everywhere? I say this because my feeling is If they don't intend to support some code they should be more proactive at indicating this, and my experience says that someone who is unresponsive on 1 repository isn't likely to be responsive on a different one (except in a few cases when the responded-to one is related specifically to something only their company cares about).
I think this is probably a bit agressive. If maintainers are being broadly absent this will fall out in the numbers anyway. Let's not get ahead of ourselves :)
Great discussion! Right, in this issue, I didn't propose any ideas to solve this. I created this TOR issue to make sure that we not only discuss any solution, but also get it implemented at the end.
From the discussion above, I want to highlight things that I believe that will help.
I think this is probably a bit agressive. If maintainers are being broadly absent this will fall out in the numbers anyway. Let's not get ahead of ourselves :)
👍
Any updates or decisions made in the TOF meeting?
Any updates or decisions made in the TOF meeting?
There wasn’t anything actionable yet. I think someone took an action to refine the proposal from @amboar.
Right, I'll push up a doc, but I haven't had time to write it just yet.
I've pushed a proposal here: https://gerrit.openbmc.org/c/openbmc/docs/+/58653
Having thought through the problem, spoken to some people and considered the trade-offs, what I've proposed doesn't take the path of removing unresponsive maintainers like I outlined above. Instead, it just describes a process for on-boarding new maintainers.
I think this can be closed with the action taken being the merge of @amboar's propsal?
First of all, this issue that I created is not to blame any specific maintainers. The examples changes and Discord links that I used are just for reference to start the discussion.
The high level issue that I found is some maintainers don't respond to code review or/and design doc quickly enough due to some reasons: busy with own work, on vacation, etc. I am not clear what the code review SLO is and, whether the maintainers will get warnings or not if they don't serve their responsibilities
If existing maintainers don't respond, it's very difficult for new features to get in. It then also becomes a challenge for people who want to become a new maintainer to convince the forum that they are qualified without enough technical contribution.
The examples that I have are in the phosphor-user-manager repo,
Should the technical-oversight-forum discuss this general issue and come up with a solution or guideline for the OpenBMC community?