Update process for complex features

zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.

https://docs.zephyrproject.org

Apache License 2.0

10.88k stars 6.63k forks source link

Update process for complex features #50836

Closed yperess closed 1 year ago

yperess commented 2 years ago

Introduction

The current workflow of having fully flushed out PRs for new features doesn't scale well. See for example PRs for:

44098 new sensor API
45370 new sensor subsystem
45601 USBC subsystem

I'd like us to revisit a more iterative approach that will still guarantee the same high quality PRs into the main branch.

Problem description

Some PRs (especially for new subsystems) get too big to review in order to meet the quality requested by Zephyr. This includes:

documentation
API headers
Implementation
Tests
Samples

By the time a PR includes all of these, it usually involves so many changes, that the review becomes a multi-discussion-threaded monster that drags on for months and ends up with poor velocity for the contributors. One of the concerns is that until the new logic is developer read with the above requirements, it is difficult to communicate to developers that it is still a WIP and will likely change.

Proposed change

Experiments will be done in a feature/topic branch
Add a new entry to the west.yaml manifest which will allow projects to enable experiments. When enabled, west update will cherry-pick the corresponding branch on-top of Zephyr
New experiments will take the following life-cycle:
1. The feature will be proposed to and approved by the TSC
2. A new feature branch will be created (call it feature-X for this example)
3. Developers can now enable the experimental feature by adding the following to their west manifest:
```
experiments:
  - name: X
    revision: HEAD/STABLE/SHA
```
4. Making sure that the experimental branch is rebased on main is the responsibility of the experiment developers
5. Small (possibly broken) commits can be added to the topic branch via easy to review PRs
6. The topic branch should contain the label "STABLE" if the developers so choose. This can be used to signal to other developers that going past this commit is breaking (doesn't build or functionally doesn't work). This label will allow HEAD to be in any state if needed during experimentation.
7. Once the topic branch is ready (meets all the criteria in the section above). One PR can be made to pull the topic branch into main.
8. When approved, this PR should be merged with a squash merge to avoid polluting the commit history with potentially breaking commits that cannot be built.
9. When merged, the topic branch should be locks to preserve the commit history.

yperess commented 2 years ago

@sambhurst @keith-zephyr

stephanosio commented 2 years ago

cc @mbolivar-nordic @carlescufi @nashif @MaureenHelm @galak

aaronemassey commented 2 years ago

Yes please. This would help considerably in doing incremental development on newer APIs, such as a battery API, versus having to make a monolithic PR that has tests, a driver, the API, a sample board, etc.

@teburd What are your thoughts on this?

teburd commented 2 years ago

I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This makes reviewing significantly easier honestly and gives an author a chance to present how its useful. That's important if you ever want to have other people contributing back.

carlescufi commented 2 years ago

@yperess maybe you could bring this up in the TSC?

yperess commented 2 years ago

I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This makes reviewing significantly easier honestly and gives an author a chance to present how its useful. That's important if you ever want to have other people contributing back.

This will still be a hard requirement for meeting the feature branch into main. The idea here is mostly to increase velocity between the start of the idea and finish. Instead of developing it locally inside one organization with little public visibility and dropping one large PR on the Zephyr community, companies can publicly develop these features with an easier potential for collaboration.

@carlescufi I'll be there at the TSC next week

keith-zephyr commented 2 years ago

@yperess maybe you could bring this up in the TSC?

I think the Process WG is a better place to start the discussion on this topic than the full TSC.

mbolivar-nordic commented 2 years ago

@yperess if you can "champion" the issue within the process WG, I am happy to add it to next week's agenda.

mbolivar-nordic commented 2 years ago

Initial thoughts:

Jumping into the weeds, I don't follow the semantics of the proposed experiments key for west.yml files. Where does this go (next to projects and self? Inside self? etc.). And then the west update semantics look weird too. It's also not clear how you want west to cherry pick branches on top of zephyr itself for feature branches in the zephyr repository. By design, west avoids modifications of the manifest repository itself.

Zooming back out given that, I think we should try to settle on the workflow you want to enable before figuring out if we want new west features and what they are. I'm not clear on what having two experiments enabled at once looks like, for example, especially if they both touch zephyr and zephyr is the manifest repository.

teburd commented 2 years ago

I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This makes reviewing significantly easier honestly and gives an author a chance to present how its useful. That's important if you ever want to have other people contributing back.

This will still be a hard requirement for meeting the feature branch into main. The idea here is mostly to increase velocity between the start of the idea and finish. Instead of developing it locally inside one organization with little public visibility and dropping one large PR on the Zephyr community, companies can publicly develop these features with an easier potential for collaboration.

Can the project really manage feature branches though in one repository like this? It's been tried before in another form and didn't really work out.

I guess my question here would be if the goal is to build a collaborative place to work on code, e.g. a new sensor API, perhaps the linux model of a fork where collaborators can review and merge each others code, then polish it up on a branch and PR that to the upstream project.

Any sort of rebasing/cherry-picking/merging work can be maintained by the authors then with normal git usages. I'm not sure how west doing some cherry picking would work out. This seems like something the authors of a feature should be dealing with, rebasing as they go.

yperess commented 2 years ago

Blank diagram (2)

Here's a rough diagram of the flow I'm expecting. Having the experiments in the west yaml will trigger the following when calling west update:

git fetch
git checkout origin/main
git pull --ff origin/sensors_v2
git pull --ff origin/sensor_subsys

This enables the team working on this to be able to keep the work in upstream Zephyr (Zephyr owns the individual commits still). This way history is not lost. When we make the PR to merge the feature branch into Zephyr, that merge can be done as a squash merge. This makes sure that nobody can checkout a bad commit in the feature branch through main.

@mbolivar-nordic I blocked off the time for the process WG. I'll be there.

pdunaj commented 2 years ago

The current workflow of having fully flushed out PRs for new features doesn't scale well. See for example PRs for: ... I'd like us to revisit a more iterative approach that will still guarantee the same high quality PRs into the main branch.

I am not sure this is the only problem. You can have small PR that drags for months or complete solution that is blocked by one person because of "I don't like it". It's fine if somebody does not have better things to do, but if you develop product and have deadlines you will not wait months for code to get in. And after you have code in private repo you lose incentive to share it here, especially when you expect a long and tiering fight.

Personally, I would:

Allow multiple similar solutions to coexist in Zephyr, hence allow user to choose what's best for them
Create staging tree for things that are not "complete", but people are willing to iteratively develop them

So, if something is widely accepted it goes to mainline. If something is questioned, but is not garbage, it goes into staging to allow it to become mature.

I personally avoid pushing anything here because it takes ridiculous amount of effort to be done :(

yperess commented 2 years ago

For the meeting today I want to highlight the focal points that this is trying to solve (for us):

The review process takes too long and it's difficult to show velocity.
Once the review started, developers generally want to keep working. Long reviews means that if I add more commits ontop of my work, I'll be adding them to the same PR (since GitHub does per branch review and not per commit). This just complicates things further (Gerrit solves this by doing a per commit review).
PRs don't retain good history. If I get actionable feedback, I'll amend a commit. Meaning I have to force push. At this point, a lot of comments are no longer relevant or can't be mapped to a specific line of code. If we want to keep that history, we would need to add commits to the PR to address the comments, but that also means that it's likely that some commits will be "broken" and the PR should be merged with a squash.
After a force push, I often have to re-review the whole change instead of just the delta because of the issue above

While I just flat out dislike GitHub, I don't see us moving away from it. I believe that feature branches with experiments will at least alleviate the above pain points.

mbolivar-nordic commented 2 years ago

Process WG:

@yperess Idea is we start by getting a feature branch authorized by some group, e.g. the TSC. Then we collaborate on that branch, which may be in an unstable state. Then when that feature stabilizes, the people involved will create a pull request to main. At that point, we have the same requirements as today (fully tested, documented, with samples, etc.). Idea is to increase velocity and visibility into the process. Then the final branch would be squash-merged into main.
@yperess the west idea is not a big requirement for me, just a nice-to-have. Idea would be to combine multiple branches.
@stephanosio we already have topic branches; how is this different?
@yperess when I spoke with Tom about getting a topic branch for sensor work, I was told we didn't do that anymore.
@nashif we did use them in the past, but not recently.
@carlescufi we used them for usb, BTLE LLCPv2, a few other things
@yperess was it OK to submit partial pull requests to those?
@carlescufi yeah
@nashif my problem with topic branches is that we should still establish that the merge proposal to main will be a fully independent pull request that can still be reviewed and rejected
@keith-zephyr what was the process in past topic branches?
@carlescufi there was no full process
@galak but I agree with @nashif that the merge to main should take the form of a 'real' PR
@teburd we've tried this before. At least for sensors it didn't really garner any more support or publicity. Having a fork still allows people to collaborate. I'm not clear on what the benefits of a fork versus a topic branch are.
@nashif collaboration -- working on a small team, moving fast, etc. Logistical questions about who merges, CI, etc. can be worked out. Someone can volunteer to be the merger/owner of the branch. We could figure out how much CI we want, but it is beneficial to have CI working as well. That's the benefit as opposed to having a random fork.
@yperess the main benefit for me is the ability to keep the dirty commit history. If I worked for months on something iteratively, it's a nightmare to keep rebasing and squashing. You also lose a lot of information by doing that. The main thing is to have the topic branch in zephyr, with CI, incremental commits preserved. When the final pull request is created, full scrutiny. If changes are requested, we add them to the topic branch's dirty history, squash commit the entire branch into main, and keep the entire review history.
@stephanosio why does it have to be a squash? you can always clean up the history and submit it as a separate PR?
@yperess that's way too much work, and e.g. if someone has an objection, you can point back the original dirty history where the same question was resolved and addressed.
@stephanosio if you periodically sync up into upstream PRs, it won't be an issue
@yperess and if you look at my examples, there real submissions where the PR got blocked until the entire feature was fully flushed out. this is fine with me if we have a topic branch
@galak when you say a squash merge, what are you describing?
@yperess you take the entire diff between main and the branch and squash it into a single commit that gets committed onto main. The resulting commit will point back at the pull request.
@galak so history is there in the sense you can go to the PR, not in git log?
@yperess you can look at the commit in main, find the pull request, from there look at what branch created it, and there you can see the git log of the commits that led to it in the topic branch?
@stephanosio and if we move to gitlab or something else?
@yperess if you clone with branches, you'd still have it
@dkalowsk as an aside, if you move to gitlab, you'd get revisions of the pull request
@galak there are a few things that concern me. One is that we haven't had the process to maintain topic branches forever. We'd need to resolve that. The other is that to @teburd 's idea about doing this somewhere else, you can keep the branch in the zephyr project organization, where it won't get deleted. But I don't see how a squash merge gives you logical commits in the main repo. You'll end up with one commit for a huge change, as opposed to maybe what should be a set of logical commits. To me, trying to understand history, I don't want to go through the dirty history, I want to see logical changes, not all the development back-and-forth. If I am trying to debug and I see a massive commit that changes 4000 files or something, this is a problem.
@yperess can I pose a question here? Currently when we post a pull request, we don't have any guarantee the individual commits are bisectable.
@gregshue we seem to be separating the 'why' from the 'how' of how we got to some set of changes. It seems to me we need to figure out the feature, the end goal, and create maybe a PR that evolves in a straightforward way.
@yperess this is the workflow we have been following. You get to the end solution, you break it up, and now someone says "did you try this, i don't like it" and you can't point someone to the commit where you figured out why it didn't work and left a log of why.
@dleach02 you're highlighting a separate problem with github's tooling though.
@nashif I see a major issue here. This really has nothing to do with the traditional topic branches we had. This would need to be applied to each pull request.
@yperess no, just for sufficiently complicated features. First step would be to get approval that there is a sufficiently complicated task to create a topic branch.
@nashif let me give you my perspective -- I have different versions in my local branches where I develop. I don't think it makes sense to maintain this just to be able to prove to someone that something was worth doing.
@yperess for me, it's difficult to show progress internally without something moving forward on a topic branch. We all have our own OKRs and being unable to show progress is a challenge.
@carlescufi I completely agree with the premise that long-form collaboration is not supported in zephyr unless you set up a fork. I gather that bringing back topic branches is a possible solution to that. It is also true that there is some value to having it in the main repo. There is some authoritative power to that, and for that we need a set of rules. The most important decision to take is whether the commits you end up putting in the topic ranch is what we have in the main branch. I guess everybody agrees the answer is 'no' and we want people to be able to rewrite history when merging into main. All in all, what I want to say is that beyond deciding what we have in these branches, I do agree with the fundamental idea we need to support this use case better. If we all agree that's the way to go -- for temporary topic branches, not the arm branch we used to have -- we should write down the rules on how they work and allow people to create them, designate maintainers, etc.
[consensus on the notion that topic branches for collaborating and sharing CI resources is worthwhile, need to continue discussion on logistics of merging to main, etc.]

yperess commented 2 years ago

Update

I would like to propose that we don't try to get this perfect on the first go. Lets come up with some guidelines that we think might work and have a test run at the process with a retrospective and some checkpoints along the way. I'd be happy to use the sensors v2 work as that case study. My current thinking is:

Get a topic branch created
Merge commits that pass CI but may break features
When the topic branch contains a fully implemented feature, at least 1 sample, tests, and documentation create a PR into main
As comments come in on the PR add commits to the topic branch (no force pushing will be allowed)
When the PR is ready, we can discuss how it will be merged. My gut feeling is that it'll be too difficult to refactor in a reasonable time frame and we'd end up squash merging the PR then locking the topic branch (but this can be hashed out later).

I believe that what we'll see is that it's much easier to re-review the PR without force pushing since you can just look at the new commits since your comment and see how things changed since you asked for changes. I believe that this will out-weigh the downsides of the squash merge (but again, we can figure that out when the PR is ready to be merged).

mbolivar-nordic commented 2 years ago

Process WG: no consensus, continue discussion. Strong opposition to squash merges was voiced from multiple participants.

@MaureenHelm: should move towards incremental development of breaking large changes into smaller PRs

fabiobaltieri commented 2 years ago

Is there a real difference between a proper upstream topic branch and how PRs are implemented in GitHub? If anything having a branch that hosts "dirty" changes and have known broken intermediate states feels like it may make reviewing the whole change even harder and increase the potential of submitting bad code. Imagine this situation: you do the work broken into coherent patches to represent development history and development phases, then you keep doing incremental modifications and change everything in small bits for the sake of faster iterations. In the meantime you still have to periodically sync up against main, otherwise you can't pull back. The resulting branch is now impossible to review to see if the change is coherent, the only thing one could do is review a diff against main, which is not really something you can leave comment on anymore.

I think that this should be tackled differently: the problem is that some PRs are too large, they include a mix of features, small changes, refactoring, documentation samples etc... How about this: when we work on a PR like that, the author and reviewers work together to split it into multiple PRs and merge it a bit at a time -- this would be the kernel equivalent of sending a series on a mailing list, and have the maintainer accepting some of the patches and asking for changes in other. That way we would have a proper progression on the change and reduce the size of the problem a bit at a time, which should make the development cycle faster.

gmarull commented 2 years ago

Another idea here:

Nowadays, one can develop everything in a module: custom boards, APIs, bindings, tests, etc. An experimental feature could start as a module (flagged as experimental, not pulled by default) and once mature enough, be integrated in the main branch. Or even stay as a module if preferred. I think that this would allow making progress much faster, even have 2 competing implementations people can play with, etc. We could even have an entire platform support in a module: soc/drivers/dts, etc.

keith-zephyr commented 2 years ago

Process WG: no consensus, continue discussion. Strong opposition to squash merges was voiced from multiple participants.

@MaureenHelm: should move towards incremental development of breaking large changes into smaller PRs

I'm in favor of pivoting the discussion in this direction. Regardless of whether we employ topic branches, smaller more incremental PRs will make reviewing easier.

Some thoughts on this:

What's the minimum requirement for a new API? API lifecycle requires documentation, one implementation, and one sample. Tests are not mentioned.
Can we expand use of experimental tags to apply to all aspects of a feature/driver/subsystem - APIs, Kconfig, and devicetrees? @galak specifically mentioned devicetrees lack the experimental tag.
We should encourage MVP solutions for initial PRs. Reviewers often make requests that expand scope causing the PR size to grow.

mbolivar-nordic commented 2 years ago

Process WG:

@MaureenHelm we're on the same page here. Maybe we should look into what our existing docs say
@keith-zephyr existing API lifecycle docs don't mention exit criteria for each stage, but they do mention the requirements. A new API is the closest thing to what these large PRs are (new subsystem, new feature, etc.).
@nashif one misconception about APIs is that they apply to peripherals, but it also applies to everything - subsystems, kernel, etc. We need the same rules I think. Documentation, code, testing -- if not mentioned, that needs to be there. I think this is clear and I would say "obvious", but looking at these items mentioned, there's always creeping requests. We need to define besides docs/tests, what is a minimum viable implementation.
@mbolivar-nordic can we lean on maintainers to provide MVP requirements?
@nashif yes, but then for new subsystems, we need some more general rules. E.g. for I3C or the battery API. For these we don't have a maintainer. We as a group as the architecture forum need to discuss them. Need general guidelines that we can point to for new areas for the project.
@MaureenHelm I agree, but I think that's going off on a tangent. In the original issue, @yperess is adding new sensor API, it's not that we don't have a maintainer. I don't think the problem in this case is a lack of maintainership.
@fabiobaltieri I wanted to say the same; the issue under discussion was getting faster iteration on the sensors v2 API. The pull request was getting pretty big and there was discussion of splitting it up; I think that's the right thing to do. Going back to USBC, I think this should have been split up into multiple PRs. One of the things @yperess mentioned was a lack of sense of progress. -@galak I think there should be guidance that if there are commits that are acceptable, to pull them out and submit them. Whether that's in the case of cleanup changes for USBC, those should have gone in separately so other things that are being discussed don't get murky.
@mbolivar-nordic is there anything we need to do as process WG?
@MaureenHelm unless there's some additional documentation
@keith-zephyr or other reviewers who expand scope
@mbolivar-nordic given what we have in documentation, what are we missing? We should rely on maintainers for clarity on what is overkill and what isn't.
@keith-zephyr I don't think the discussion is about tooling. The tooling makes some things worse, but we need to define a process that encourages smaller PRs.
@stephanosio could use PR templates that encourages splitting large PRs up into smaller oens
@galak you talk about in the case of something being new (no maintainer yet), and something existing (maintainer exists). The maintainer exists case is not that difficult. The other case (usbc) is harder. I know I held it up, and then thought about maybe letting things in as experimental, etc. I think about separate concerns reviewing API interfaces and implementations. Don't think we need to hold up internals if the API is good. I know we should probably also be more cognizant about "experimentalness".
@gregshue one thing I haven't been hearing is where we've been tracking the larger direction of architecture that will result of a collection of small PRs. I don't think we can deal with smaller PRs without having that larger picture of where things are going. Where does that planning and alignment on the planning and architecture of a subsystem going on?
@stephanosio there is usually an RFC
@nashif RFCs, the architecture WG, PRs, etc
@carlescufi I know where you're coming from, but we cannot mimic the structure of closed source within open source
@MaureenHelm RFCs should cover big pictures. I've seen many cases where people submit PRs and reviewers -1 if there isn't a reference to an RFC with a larger scope.
@keith-zephyr I'm looking for where our recommendations are
@mbolivar-nordic so are you asking for a "submitting large changes" section in the contribution guide? There's not a lot of consistency and PRs keep getting larger to accommodate reviewers coming in.
@keith-zephyr we all want small changes so in practice that's not what we're getting
@galak to @keith-zephyr 's point, probably adding to these sections to expand on what we want is a good idea. One is having reviewers/maintainers ask for changes that are in a PR that are orthogonal to be split up.
@nashif let's talk about new subsystems. We agree we won't just accept a header file and wait for other PRs with documentation and tests. That's probably not going to happen. This is about PRs which have orthogonal pieces. But I think what we need to get and what will solve that problem is thinking about when something is new, what is the list of requirements you have to provide (header, implementation, documentation, tests). And then further, for things like I/O interfaces, how much do I need for an initial implementation? That still requires docs and tests but can be built on top of. Another thing we need to do is learn how to say "no". Be able to say "no", this is not something we want to do as part of the project, but we are not going to do it, or do it this way.
@mbolivar-nordic I wasn't aware of the "proposals and RFCs" section (https://docs.zephyrproject.org/latest/project/proposals.html#proposals-and-rfcs). I wasn't aware of this, maybe we need to move it into the contribution guidelines. Maybe we should take some time to review this and the existing contribution guidelines, and try to come up with what we would like to do.
@galak that and the API documentation pages

galak commented 2 years ago

A few comments:

Document that we should ask PRs that are associated with a RFC always put a reference to that RFC when being submitted
Document that if a PR is taking a while to merge, that reviewers/maintainers request submitter to pull out commits that are concerned acceptable into a separate PR to reduce amount of re-review, making progress, etc.
For new APIs we should request that the API is limited to what is needed and can be demonstrated in a sample or test.

fabiobaltieri commented 2 years ago

For new subsystem or features, should we have a process to break down the feature into areas and manually assign a set of approvers to decide when the initial implementation is "good enough" for merging? For the usbc case for example it could have made sense to explicitly nominate someone to approve for usb, devicetree, device implementation (stm32 in that case), that way it would be clear who is driving the review. One active case is the zbus PR (https://github.com/zephyrproject-rtos/zephyr/pull/48509), few people contributed a review and approved at some stage, but there's no clear set of reviewers that would have to approve for the initial implementation to go, so everyone is waiting for the next round of review to be applied, for the latest reviewer to approve, then someone does another pass and the cycles continue.

Maybe a process could be that as part of the RFC we identify the affected areas for the change, and then when the RFC gets discussed in the Architecture WG we assign someone to every area (a mainainer or collaborator) that commits to track the PR, drive the review and approve it, trying to keep iterations as quick as possible.

keith-zephyr commented 2 years ago

Looking at the documentation again, there appears to be 3 areas of the documentation that intersect with "large PRs"

The Contribution Guidelines is massive - 4500 words and needs some cleanup. The Other Commit Expectations has some of the clearest guidance, but needs some expansion:

Small PRs (this is mentioned under the workflow section currently).
Minimize rebases until the patch is approved, or perform rebases separately from making changes to the PR.
We could define a soft-cap for the number of lines in a PR as a signal for when to create an RFC (and then break up the PR if possible).
As @galak mentioned - encourage authors to pull independent commits that have been approved into a separate PR.
Add test coverage requirements for all PRs? It's currently mentioned only for new APIs.

The Commit Expectations could possibly get promoted up a level so it's not lost among the large Contribution Guidelines. Something like the Chromium Firmware Guidelines could be a model. Google's Small CLs doc also has good content.

The Proposal and RFCs document should encourage the author to define a minimum viable implementation. Once the minimum implementation merges, it makes collaboration easier. I like the proposal from @fabiobaltieri of using the RFC to designate assignees for various blocks.

The API lifecycle is the only place I see that talks about providing documentation, one implementation, and one sample (but omits mentioning tests). We could also add these requirements directly to the Commit Expectations, and just point the API and RFC docs back to the Commit Expectations.

mbolivar-nordic commented 1 year ago

Process WG:

Summary: general consensus on @keith-zephyr and @fabiobaltieri 's proposals, @keith-zephyr to start work on updating the docs

@keith-zephyr there are some details in the comment, but the main thing I'm advocating for is creating a separate document for our contribution guidelines and best practices. The contribution guidelines document right now is really big, and it's hard to find the information on what a PR should have and what the best practices are. My proposal would be to break out the PR related areas into something separate, and we can link back to it from other areas such as "API lifecycle" and other docs
@fabiobaltieri my comment is that in many cases it's unclear who is expected to follow up on PRs. I was reviewing the zbus PR and the battery one. Maybe as part of the process, we should identify the main areas impacted and identify the person who is going to stay on top of the PR and merge it. We also have a clear signal to the author who is the main point of contact. Maybe it would help with drive-by reviews. What we should avoid, and what happened with zbus, is that I think many people were tracking the PR waiting for it to get quiet before coming in and reviewing. Every time a new person joined in, there were weeks of silence.
@mbolivar-nordic I want to call out maintainer burnout -- that's a good idea for release roadmap items, but I think it's not practical for every single PR
@keith-zephyr part of what we can get out of "commit expectations" is "what qualifies something to be large enough to require an RFC". Requiring an RFC could trigger @fabiobaltieri 's proposal
@mbolivar-nordic so I think there's no disagreement on @keith-zephyr 's proposal for a documentation change especially for large PRs, and @fabiobaltieri 's proposal for picking the 'main' reviewers for PRs for roadmap items
@keith-zephyr I volunteer to start work on that
@galak it feels like there's a process gap with "how to do something new and what is the path to go through"
@keith-zephyr in terms of something new, I think we want to push the RFC as the process we want people to go through, and our language should move towards "try to land smaller PRs", "try to define your minimum deliverable", but the RFC should provide that roadmap
@galak that makes sense to me, but I just want to make sure there is an "RFC lifecycle" that's addressed as part of that. One example, e.g. driver model changes, have started to make progress, but I don't know that we have clarity on how we make decisions.
@mbolivar-nordic I see the target milestones as being the signal for this.
@galak I want to make sure we are covering the "we haven't reached consensus" path better, though
@keith-zephyr there are 2 buckets: 1. enhancement for something that exists and has maintainer. Maintainer should resolve and drive consensus. 2. no maintainer: this has to go through arch WG to get consensus.
@galak that all makes sense to me; formalizing the arch WG's responsibility is what has to happen though. It doesn't seem right to have a process where "the 2 people who were at arch WG agreed" is a good one.
@nashif I agree; we need formal minutes and decision tracking with attendee lists for the record and people who cannot attend meetings
@galak Going back to @keith-zephyr 's point about buckets, maybe there's more discussion to be had about bucket 2, with examples like "i3c", things that have no maintainer because it's something new, but we could form a group of people that make the architecture body for evaluating new things in specific areas, like new device APIs, etc.

marc-hb commented 1 year ago

PRs don't retain good history. If I get actionable feedback, I'll amend a commit. Meaning I have to force push. At this point, a lot of comments are no longer relevant or can't be mapped to a specific line of code. If we want to keep that history, we would need to add commits to the PR to address the comments, but that also means that it's likely that some commits will be "broken" and the PR should be merged with a squash. After a force push, I often have to re-review the whole change instead of just the delta because of the issue above While I just flat out dislike GitHub, I don't see us moving away from it.

For the record Github's review model and limitations were discussed one year ago in

39194

we've decided not to replace GitHub, but there is still an open question of how to improve the review process in general.

mbolivar-nordic commented 1 year ago

Process WG: work is ongoing, revisit next week

mbolivar-nordic commented 1 year ago

Process WG:

@keith-zephyr Hope to have a draft PR uploaded later today

mbolivar-nordic commented 1 year ago

Process WG:

Consensus on testing requirements: all API functions should have test cases, and there should be tests for behavior contracts, along with examples for where it's been done well. Leave it to reviewer discretion for now, knowing that a better solution is desirable, but this is our starting point.

@keith-zephyr: 2 main topics to discuss today are review timeline expectations, test coverage. Will be updating PR to cover the existing noncontroversial points.
@keith-zephyr start with test coverage. The existing codecov tool is kind of poor. Coverage tool shows 95% coverage, but in the details, it's only measuring 39 lines of code out of the header, which is the lines that the i2c emulator happens to call.
@nashif this goes back to the fact that we are using native_posix with codecov. If a feature is not testable with that board, you won't get coverage results. QEMUs are supported as well; I think currently x86 and arm; others should be able to be supported as well. But the way you implement APIs there will be totally different for a real application. Several PRs and issues opened over this.
@keith-zephyr in terms of I2C API on native_posix as an example, coverage is determined by capabilities of the emulator. So we could increase the scope of the emulator and get better coverage numbers.
@nashif why are we getting such low coverage?
@keith-zephyr in this case, the emulator only implements a subset of the driver APIs
@nashif expectation without enforcement or CI is that code has tests. Obviously some people catch this in review and some don't. That obviously won't work -- no matter how much time you spend, you won't be better than a tool. So we need a way to enforce this. But not everything is testable in CI using simulations, as some features require special hardware to be exercised. In terms of the expectation, maybe we say it's preferable if we have testing in simulation/emulation. I know you at Google are doing a great job of this. This is something we need to make people aware of and start pushing for, to ensure coverage is meeting a standard.
@galak what's written is in conflict to some degree with our statements about API levels. E.g. experimental APIs don't need testing
@keith-zephyr lifecycle docs say even experimental APIs need tests
@keith-zephyr we have tooling issues, but what do we want to communicate in terms of expectations?
@fabiobaltieri it doesn't make sense to have a testing requirement if the tooling is not there, right?
@dleach02 it doesn't feel wrong to have the requirement if we have a way of reporting gaps and actively work towards closing them
@galak looking at https://docs.zephyrproject.org/3.2.0/develop/api/api_lifecycle.html, I only see 100% for stable APIs. Statement to date has been that tests aren't needed for new experimental APIs
@nashif I think that's something we need to fix
@keith-zephyr I would strongly agree
@nashif I think @keith-zephyr 's suggested guideline of 80% for less-than-stable APIs is what we need as a goal
@galak can we phrase this more as a test for every API as opposed to a coverage metric? Might not have everything covered, but at least have something invoking every API.
@mbolivar-nordic how can we make sure this is done? E.g. devicetree.h has testing for every API, but what about all the other 'real hardware' cases? How can we track this and move towards resolution?
@galak if we start with API coverage, we can start with an inspection/code-review effort before getting into questions about coverage and automation. We can start enforcing this in review
@stephanosio I think there's a lot of ambiguity there. I think what we really need is some kind of mocking layer that can act as a device that can be tested.
@galak but even before that, just as a first order developer, e.g. @dcpleung for I3C. We had to do API and hardware support for that, so there's things he couldn't test on that hardware, but I'd like to see tests he can run on that platform and also at least implement tests for other API expectations, so someone who can run those on that platform can run those tests that weren't testable beforehand. I agree we should be aiming for that, but that requires a high level of engagement, and it doesn't make sense to require that of anybody as a project unless someone is going to do that.
@nashif peripherals are usually more complicated because you need devices before you can create a mock, but as @galak says -- we need tests and we need drivers. In terms of measuring coverage, that's going to be device specific. We need to find a way to emulate or simulate the hardware, and if that's not possible, it's going to be by inspection.
@stephanosio we are trying to make some sort of requirement here. If so, we should make a difference between logical APIs like zbus and driver APIs. Probably should define different standards for those 2.
@nashif even saying 80% is not meaningful since there are many types of 'coverage'. I want to see that a person proposing a new API has made an effort to test it. We can go from there with better metrics as a stretch goal. But the idea is that if there are 5K SLOC and 50 SLOC of test code, something is wrong here. So I should, as a reviewer, be able to tell someone "I don't see enough tests here, we have these guidelines; show me you are doing more to test this".
@teburd to add to that -- the example here is I2C. Most of the I2C header functions are just wrapper functions. It's hard to understand the value of testing that, versus testing the driver itself / behavior expectations. Doesn't verify that I2C driver behavior is uniform across the tree, which to me seems like a better goal.
@keith-zephyr one thing we would have is: we do have an I2C controller emulator. If it implements the full API set, then it can serve as a reference
@nashif that's a good point -- terminology wise, you can't test an API on its own; you are testing the implementation
@galak I think the first order of test is defining the API contract. E.g. for I2C you have i2c_write(), but then what happens in an error condition? There's a contract there that I've been thinking about from a test perspective. I think the only way to verify that now is inspection
@stephanosio documentation means nothing if implementation doesn't comply
@nashif drivers are a special case, since we provide an API and expect implementations. Subsystems are a different case since there is an implementation along with the API -- it is the actual thing you have to test.
@stephanosio there are other factors: thread safety, callable or not from ISR context, etc. We can't test those in emulation and they need to be tested as well.
@nashif maybe to rephrase what we have now, we can say "you need to provide tests, and your implementation needs to be testable and tested, ideally in CI". It shouldn't be about running a coverage tool and hitting 80% or some other threshold.
@galak do we have a good example that can serve as a model?
@nashif we have a few -- RTIO, zbus, SMF, the networking subsystem
@keith-zephyr those are all subsystem/library APIs. Any examples for drivers?
@teburd sys timer; I just did a lot of work on this where I found a lot of gaps in terms of expectation vs what was tested. reference: https://github.com/zephyrproject-rtos/zephyr/tree/main/tests/kernel/timer/timer_behavior
@stephanosio verification is a whole separate topic from implementation; I don't think it's reasonable to expect formal verification and requirements tracing for typical APIs
@fabiobaltieri we go to line coverage because it's easy to measure, but I can also think of examples where it's not worth getting to 100%
@mbolivar-nordic what do we want to put into the doc as a starting point?
@keith-zephyr @galak's suggestion is probably the easiest to implement, but it doesn't verify you're testing required behavior, and that's where it starts to get harder
@fabiobaltieri I like @galak 's idea because it's at least a starting point where people can come in and add to
@dleach02 problem with that is that the behavior is really important. A goal of the driver model is that you can write your applications in hardware-agnostic manners, and vendor behavior tracks across SoCs within the same API.
@galak I think we could write something in the committer expectations document that tests should cover not just basic functionality, but the behavioral contract. I don't think we need to get specific other than by providing examples of where it's been done well for drivers, subsystems, etc.
@dleach02 subsystems are more self-contained if they don't have a hardware component; bigger concern is around the drivers
@nashif that's a big topic; we do need to be able to mock and run unit tests without flashing. E.g. if an API changes behavior, I need to be able to catch it in a test instead of just ship it and wait for someone to hit it. We should be able to provide this for APIs, but verification is another level of testing that requires hardware.
@dleach02 I also don't think you should burden the Zephyr project with validating that -- expectation is that vendors are doing that.
@mbolivar-nordic any objections to putting something in @keith-zephyr 's PR as a starting point?
(no objections)
@keith-zephyr I think @galak 's point about every functions having a test and there should be tests for behavior contracts, along with examples for where it's been done well, that will be useful for authors and reviewers
Done with this for now; continue discussing maintainer bandwidth next week

mbolivar-nordic commented 1 year ago

Process WG:

@keith-zephyr would like to discuss the concept of 'prompt review' in the committer expectations PR https://github.com/zephyrproject-rtos/zephyr/pull/52731/. @fabiobaltieri had an idea of having the PR author escalate if there isn't prompt review (https://github.com/zephyrproject-rtos/zephyr/pull/52731#discussion_r1043269231)
@mbolivar-nordic so the proposal is to ask again in 2 weeks?
@dleach02 problem is how to force someone to have time
@fabiobaltieri comment is about hearing back from the maintainer -- if your PR is too big, you would expect someone to say 'split it up' within one or two weeks. Otherwise have a paragraph about how to escalate.
@keith-zephyr letting a PR soak for 1 or 2 weeks seems reasonable for an RFC etc. Once there's been discussion on it and agree that the PR is good, we don't want to encourage the behavior that waiting a week to address minor comments is acceptable.
@dleach02 you mean once a PR is engaged and communication is ongoing, you don't expect a 2 week delay between updates?
@keith-zephyr what I want to avoid is establishing an expectation of a 2 week turnaround on each update to a PR
@mbolivar-nordic I want to call out that there are review bottlenecks within the project -- @galak on DT and @tejlmand on build system, where you need to wait for them to make time to clear their NAK
@dleach02 I tell my people that a block should be addressed promptly; this is a pet peeve of mine
@mbolivar-nordic we can't expect maintainers to be teaching people how to do things
@dleach02 it's good that review happened and something is wrong, but if the concern is addressed promptly and the maintainer won't address
@galak there are 2 situations with this though -- there's too much noise in my emails to track what I want to focus on which can slow things down. 1st situation is where I just made a comment and my -1 can be removed. 2nd situation is where I am the maintainer and I get busy, and that just needs to wait
@dleach02 that's a good clarification -- if there is another maintainer, they can remove the block
@keith-zephyr do we want to have any language on getting people to avoid ghosting a PR: where a person leaves a blocking comment, the person addresses it, and you haven't gone back and re-reviewed. The maintainer can clear blocks but it's healthier if the maintainer doesn't have to
@galak I'll pick on myself because I'm guilty of this behavior, but I know why it is: the notification system is not set up for my workflow. If I could just see the things that I commented on I would be able to comment more. But the flip side is you're pushing me not to comment on things since I have to be coming in and out when I'm busy with other things.
@keith-zephyr separate from this -- I did add some language that if you don't have the expertise or bandwidth to review, you could remove yourself, but @mbolivar-nordic pointed out that zephyr-bot would just add you back in. If we had a better bot, we could have a better set of notifications
@stephanosio that was brought up a few times before -- removing yourself will result in zephyr-bot adding you back in. The reason is that github doesn't have an API for checking if someone has removed themselves as a reviewer. The only way to implement that would be GH comments to instruct the bot not to add you as a reviewer, but that's messy since the message history will get cluttered. It's a messy topic; I hope GH eventually provides a mechanism for this but for now there is no straightforward solution
@carlescufi there are ways to check for ways to find places where you've NAKed but there is a force push. The problem is it's for the fish shell, but the point is he uses that to find PRs that are waiting for him. Maybe we could make that available in some easier to use form.
@stephanosio so you can see the history of who was added and removed? zephyr-bot currently just adds people back when there is a MAINTAINERS entry for you. that's the problem that's blocking us.
@carlescufi my point is that I'm looking for places where I have reviewed, the author has pushed, and they are waiting for me. That's one of the most frustrating bits for new reviewers (other than silence). Torsten has a script which lets you find this information.
@galak 1000% make that script available
@carlescufi someone will need to port it from fish
@mbolivar-nordic that's good, but then let's say you have over 10 of these every day. I remain leery of asking for reviews on any fixed timeline absent a prioritization mechanism for PRs.
@galak I think documenting the escalation paths would be good there as well.
@mbolivar-nordic I agree with that -- and maybe getting more escalation feedback (use dev-review and #pr-help as escalation)
@keith-zephyr I do agree we need to document that reviewers and maintainers should be prioritizing release milestone issues and PRs
@mbolivar-nordic I am in violent agreement there
@keith-zephyr escalation doesn't fall under reviewer expectations though. I'm still hearing pushback on proposals in situations where the author updates promptly but the maintainer doesn't respond soon. Do we want to have any language that encourages not perpetually leaving PRs in a blocked state?
@fabiobaltieri I think we have an entry in the release team responsibilities on removing stale blocks. I'm not sure if we have a definition for that, but in my mind it's 2 weeks.
@mbolivar-nordic no objections to discouraging that bad behavior as long as we're not writing it down as a requirement -- review bandwidth is a scarce resource
@keith-zephyr I am hearing a big pushback on any time bound, but having language that discourages certain behavior is a direction we can go
@dleach02 I guess one question is -- when can the author kick in an escalation path? Hopefully using pr-help and dev-review is a good mechanism, but if there's no resolution satisfactory to the author, what happens?
@galak TSC. I would also say pinging the maintainer directly on discord. Bribes also always help.
@keith-zephyr so that mostly talks about what the author should do to get things escalated. We might not want to have any language about timely reviews. I'd like to put something that encourages timely reviews
@mbolivar-nordic we do have that (https://docs.zephyrproject.org/latest/project/project_roles.html#maintainer). I think if we can rely on committers to be squeaky, that will increase visibility in situations where the maintainer is contended.
@keith-zephyr coming back to blocking PRs: we want to discourage blocking and taking a long time to review. But maybe we want to encourage prioritizing re-reviews of previously blocked PRs when possible. Not a requirement -- still have language that talks about release milestones being highest priority, but then maybe re-reviews of things you've blocked.
@mbolivar-nordic assuming we have a script for that, OK with me. I would get hundreds of threads a day in busy periods
@keith-zephyr I am curious about this script as well. I don't get nearly the email volume you were getting, but I still find it very noisy
@carlescufi to give you an idea, I have a full page in high resolution just in the last 3 hours
@fabiobaltieri the thing about GH notifications is they're super spammy
@keith-zephyr that raises another point - I don't see consistent behavior from authors to re-request reviews. I think some authors just push an update and people notice. Having the author explicitly re-request might be better.
@fabiobaltieri I think that won't always work, you need an explicit permission for this. I don't think we'll get consistent improvement on maintainers changing their behavior unless we can provide a common dashboard. The actionable thing right now I think is an escalation path and a timeline for authors to use it. Then we can write somewhere that maintainers are supposed to do something, but we have a ton of them and you can't tell them to do that and make sure they do it-- and they're also the people with the least incentive to merge things if they're conservatively maintaining.
@stephanosio I think #pr-help has been going well
@fabiobaltieri a question on that -- how do you follow it?
@stephanosio I do subscribe to notifications, so I get a popup and I sometimes check on it from time to time
@carlescufi here is the relevant command from Torsten's script: gh pr list $argv --limit 120 --search "reviewed-by:@me -review:changes-requested is:open" --json title,number,id,reviews,reviewDecision,milestone -t
@stephanosio same query works in search: https://github.com/zephyrproject-rtos/zephyr/pulls?q=is%3Apr+is%3Aopen+reviewed-by%3A%40me+review%3Achanges-requested+
@mbolivar-nordic for me this includes PRs which I did not request changes in
@carlescufi so there must be some more processing to do, maybe based on the JSON stream. The query just means "I have reviewed AND the current state is changes-requested". The postprocessing Torsten does performs the extra filtering.
@keith-zephyr I need to drop in another minute, but I will post a new version of the PR with reviewer expectations calling out that maintainer bandwidth is limited and add some language that encourages priority order for reviews (release milestones, anything you've blocked), and if we can get a script for this, all the better. Then also document the escalation path and encourages authors to use it.

keith-zephyr commented 1 year ago

Update for 2023-01-03. I will post a new version of the Committer/Reviewer expectations shortly. I won't be able to attend the Process meeting on 2023-01-04, so please defer discussion until next week.

keith-zephyr commented 1 year ago

Update for 2023-01-11: New version has been posted: https://github.com/zephyrproject-rtos/zephyr/pull/52731

I welcome comments on the entire doc, but the areas I want focus the process meeting discussion include:

I'm happy if we get closure on just one item this week, so that there is time to discuss other issues.

mbolivar-nordic commented 1 year ago

Process WG: skipped due to lack of time

mbolivar-nordic commented 1 year ago

@keith-zephyr last discussion we had here was about reviewer expectations. We agreed there's no timeline but there would be a priority order added in the "reviewer expectations" document
@dleach02 something I've noticed is we have PRs with no review for extended time, including extra pings to maintainers, etc.
@keith-zephyr I added a new section as well about escalation mechanisms ('PR Escallation'). Question is if we have any formal language for who is allowed to dismiss stale reviews.
[Discussion ensues; 'assignee' has the power formally listed, but maybe we want other maintainers to have this power as well in their related subsystems]
@keith-zephyr let's go back to timelines for escalation -- any feedback?
@dleach02 from my team's perspective, they just want their PRs merged. My observation is that some types of PRs, like adding a driver for your IP, if you don't get feedback we should be able to just approve and merge; not sure. I think the escalation section on inactivity is catching this.
@fabiobaltieri 2 days seems a bit short; I think a minimum should be 1 week
@keith-zephyr this is just asking for help on #pr-help not pinging a maintainer
@fabiobaltieri to me, posting there is kind of asking someone to ping the maintainer
@nashif we also have an escalation process documented elsewhere (project_roles.rst); let's have this all in one place and make sure it reflects the previous discussion [AI @keith-zephyr ]
@keith-zephyr I definitely want to have some timelines. It does set an expecation, buti t sets it in 2 ways; one being it makes it so people don't ping discord within a couple of hours. Suggestion for pinging the discord channel?
@nashif to me, asking for help is something you can do right away, but that's not an escalation
@dleach02 I agree pr-help should be what it is, but there still has to be a relevant timeline with escalation
@fabiobaltieri I would be fine if the maintainer says they can't get to this for X weeks
@dleach02 if that happens, we need a mechanism for getting things merged
@fabiobaltieri I would expect the maintainer to designate an alternate, but we have a separate problem where there's lack of response
@teburd I maintain a couple of areas now; what I have a hard time doing is finding the information I need to figure out the status of PRs I'm responsible for. There's no overview with areas I'm a maintainer of that I can look at.
@mbolivar-nordic you're not always the assignee though; it's not realistic to ask maintainers to filter everything through the fire hose of pull requests. We need some sort of escalation process no matter what, so let's try to sort out the details here. Curious what other people think about the 2 day timeline to asking on pr-help
@MaureenHelm I agree that's much too short
@nashif not sure I would call it escalation. I submit PR reviews all the time that don't get reviews and I have to beg for reviews
@dkalowsk isn't this exactly what the TSC is for though; we have a problem and we need technical solutions to solve on the PR?
@mbolivar-nordic I think there are situations where the maintainer is doing what they can, but has too much to do to respond to all PRs. I want to see a result where we capture prioritization mechanisms that handle situations like that. If we are successful as project we will have more incoming PRs than bandwidth to review and respond to.
@nashif so that's a more general problem we we have too small of a bus factor
@carlescufi I couldn't agree more that it'd be great to have more collaborators/reviewers, but how? these things either come from individuals wanting to join (or more rarely mandates from employers), but we haven't really succeeded at attracting review talent. I don't blame us; I think it happens naturally.
@nashif we can always sit here in the meeting and say things aren't going to work, but we keep going around and around about this and we need to find a solution
@carlescufi we did try -- we made a call for maintainers, etc. It didn't help much. Not blaming anyone, just out of ideas here. Like has been said, if we had enough maintainers, we'd have fewer problems.
@mbolivar-nordic I think we need to be realistic that this is a general problem of successful open source projects: they have less review bandwidth than is needed to address incoming PRs. I'd be happy to be proven wrong and see a successful open source project which has more than enough reviewers and has solved this problem. I think we need to be realistic that we need to have a prioritization mechanism
[no disagreement on that; general consensus that we need an escalation mechanism that has a prioritization mechanism]
@dleach02 Want to return to the situation of adding a driver for an IP block. As long as it's just that, maintainer should be able to just ACK it assuming it's been tested. Maintainers should know they're not expected to look at every line of code, and trust the submitter.
@fabiobaltieri in that specific case, I would expect someone else from the company to approve it as well
@dleach02 so yeah, I did that in this case, I went in and approved it, and confirmed that the author had tested on real hardware and verified it
@fabiobaltieri for me, I do like to try to help out reviewing PRs from other googlers as a signal to maintainers that this has been given a first pass by someone who is more involved in the project
@dleach02 that's a good observation; I can give that guidance internally
@MaureenHelm I'll give a counterexample to that. When it comes to sensor drivers, I see a lot of new contributors to the project that aren't necessarily employees of the sensor vendor themselves. The maintainer reviews are the only time we're looking at the code. In that case, I do think I need to look at every line of code, and often there are a lot of issues to get through. It's a bandwidth issue to keep up with all of them.
[pause to make sure we agree we need to come up with a solution for limited review bandwidth in general]
@nashif I think we need a system where contributors know that if people are just contributing without reviewing; expectation that a small group of us are driving the whole project won't work
@keith-zephyr getting more reviewers is something that needs to be an ongoing project. Getting back to escalation - I have 4 steps here. Is there anything missing? Then I want to get back to consensus on timelines.
@carlescufi only thing I would say is if there's no adequate WG, what do you do? sensors has no WG
@mbolivar-nordic why can't we say "sorry, packet gets dropped" then in that situation
@nashif in terms of inactivity, I think we need to go back to the roles/responsibilities; early in the project we had Trivial/TSC labels, but also had a Maintainer label. I think we need separate paths for inactivity versus technical disagreement.
@keith-zephyr ok, good feedback. If we focus on lack of reviews, that would leave us with 2 steps: 1. #pr-help, 2. dev review. Anything else?
@fabiobaltieri how about opening an issue after pr-help?
@dleach02 I would do dev review first before opening an issue. not sure what you would do with an issue.
@nashif what do you do at the meeting? Meetings are an issue since not everyone can help
@fabiobaltieri I expect most of this can be sorted in pr-help
@galak I think the first step is pinging the maintainer
@nashif I think we need a step saying to verify that the assignee is the correct person, and then pinging them or figuring out the right person. Maybe pinging the release team
@mbolivar-nordic I think it can be really useful to have 10-15 minutes conversation between maintainer and submitter to start to unblock things, especially since a lot of us have bandwidth reserved for meetings as employees
@keith-zephyr I'll take the action to split into 2 tracks, and would ask that people look into reviewer expectations and priorities and provide feedback in the PR generally

nashif commented 1 year ago

found this filter useful to see where the issues with regard to no-reviews. It might help us zero down on the problems and act on stale PRs due to lack of reviews or engagement.

One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable.

This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC.

I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.

@keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?

MaureenHelm commented 1 year ago

found this filter useful to see where the issues with regard to no-reviews. It might help us zero down on the problems and act on stale PRs due to lack of reviews or engagement.

One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable.

This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC.

I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.

@keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?

I like it.

keith-zephyr commented 1 year ago

found this filter useful to see where the issues with regard to no-reviews. It might help us zero down on the problems and act on stale PRs due to lack of reviews or engagement. One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable. This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC. I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work. @keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?

I like it.

Agreed - I really like this proposal.

mbolivar-nordic commented 1 year ago

I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.

This will give a whole new meaning to dev-review...

Sure, let's try it!

mbolivar-nordic commented 1 year ago

Process WG:

we will start looking at the above filter in tomorrow's dev review to gain some experience triaging PRs that aren't getting enough traction
discussion about escalation timelines is tabled until we've had some time to gain that experience

@keith-zephyr I think what I'd like to get out of this today is to discuss @nashif 's issue for triaging PRs with no owner or comments, and to discuss timelines on the escalation side
@nashif this filter (https://github.com/zephyrproject-rtos/zephyr/pulls?page=1&q=is%3Apr+is%3Aopen+review%3Anone+sort%3Acomments-asc+draft%3Ano) is the set of "no reviews" PRs. It includes everything, including things that were just opened, but it's a good start.
@mbolivar-nordic proposal was to start looking over this list during dev-review to start looking at these, nudging people, maybe providing initial review, etc., right?
@nashif yes. E.g. https://github.com/zephyrproject-rtos/zephyr/pull/53575 is open and doesn't have reviews. I'm a reviewer, but that's because I'm a reviewer for all samples. One question for us to address here is "why is this not getting reviews?". I'm the only review requested and the bot hasn't requested anybody else. I think it would be a good idea to mark some number of these and discuss during the dev-review meeting what we need to do.
@keith-zephyr I was kind of curious -- my general experience with PRs is we assign too many reviewers, so why does this have only one reviewer and only one assignee
@nashif that's a good question and something we should start to address. This PR is in cmsis_rtos_v1; maybe we don't have a maintainer, or maybe the file filter in the MAINTAINERS file is not right, or maybe we can assign it based on common knowledge, or review it in place.
@keith-zephyr I think the net of this is that the proposal will catch these. And if we continue to see gaps in what the bot is doing, hopefully we can make improvements.
@dleach02 one of the problems with zephyrbot like you were highlighting elsewhere is that key people were missing because of the way it was selecting reviewers. I had proposed selecting maintainers before collaborators.
@nashif that's a github problem. I think @stephanosio looked into this.
@galak so is there an offline triaging of this, and then in dev-review, bandwidth permitting, having an oldest-first order?
@mbolivar-nordic looks like oldest first is sorted:created-asc (https://github.com/zephyrproject-rtos/zephyr/pulls?q=is%3Apr+is%3Aopen+review%3Anone+draft%3Ano+sort%3Acreated-asc). I don't think anyone is suggesting an additional offline traige.
@galak I think picking assignees is something we could do offline potentially.
@nashif to that point, the first thing we could look at is 'no assignee'. But another problem we are having is that there are assignees that aren't doing anything. So I think we need to start looking at this and pinging assignees, changing assignees, determining which assignees are not active, etc. This in-meeting process will highlight this.
@mbolivar-nordic who would be responsible for doing an additional offline process?
@galak that's a fair point -- as just discussed with backlog handling, I do think we need to start doing a bit better triage as a project offline. I don't have a great suggestion on who the group is (release engineers or that kind of group) right now.
@MaureenHelm I think we need to clarify the objectives that we want to achieve in the dev-review meeting. We have this list of PRs that aren't getting attention, and we want to get them attention. Historically the contributor would add a dev-review label to a PR, and we'd take a look at it. The discussion today is about how we can make process improvements with filtering etc. I agree we need to do those process improvements, but I think we need to figure out why this isn't happening and how we can prevent this from happening again.
@nashif I agree -- we need a game plan for what we're going to do in the dev-review so it's clear that people can't expect having their PRs reviewed and merged in the meeting, for example. I think that once we do this 3 or 4 times, we'll start to see the pattern of what's happening (lack of assignees, unresponsive assignees, things that fall through the cracks, etc.). We can learn from this and meet again in a month or two to see how we can improve things (changing MAINTAINERS, etc.). Maybe there are more involved process changes, but I think we should wait a bit until after we've done this learning
@MaureenHelm what I think we should do is start off by looking at what has already fallen through the cracks. We can start with this filter, potentially with some tweaks, and then see after we've done this for a little while, see how we can stop that from happening.
@mbolivar-nordic so it sounds like we'll start with something like this filter and muddle our way through until we start to see the pattern
@galak two changes I would propose is PRs that are failing CI and PRs with DNM
@nashif I would still look at failing CI -- it's an opportunity for us to tell new contributors to go fix their CI
@fabiobaltieri CI breaks as well, every now and then, and new contributors don't have the ability to restart it
@MaureenHelm I do think we're going to have to try it out and see how it goes; we should still also have the option open for people to assign the dev-review label
@mbolivar-nordic are you happy with that conclusion, @keith-zephyr ?
@keith-zephyr yes, we have consensus to try to triage these in the dev review meeting. That will handle new PRs that aren't getting any traction. There is a separate issue of PRs that have a review but have gone stale. One thing I'm seeing is that we do have PRs with comments but are showing up; what's causing that?
@nashif I think what's happening is that github is removing review status for PRs that have been changed after a previous review, or items that just received dcomments but no review
@keith-zephyr so if that is indeed capturing both cases, and there's bandwidth to look at both of them, that's a big win
@nashif I do dislike how github is showing this. I was looking into whether we could write our own queries. I'll see if it's easier to help whoever is running this meeting to run a script and open a spreadsheet or something like that than just looking at the raw filter. I'll take the AI to look at this, not for tomorrow, but as we go.
@MaureenHelm I agree with you @keith-zephyr -- I don't think we should limit ourselves to "no reviews" -- I do think PRs that have reviews but went stale should be in scope.
@keith-zephyr OK, then I think in terms of quantifying timelines for what committers should do to escalate, it's fair to table that until we have some experience

keith-zephyr commented 1 year ago

@keith-zephyr - I have a conflict for the meeting 2023-02-01. So please defer discussion to next week.

mbolivar-nordic commented 1 year ago

Process WG:

@keith-zephyr: only changes not captured from last week are what the agreed-upon escalation timelines are (1 week to ping reviewers, 2 weeks to ping #pr-help and the dev-review meeting). Will get that captured. Otherwise I think this is getting pretty close, and I would ask everyone give this another pass over and make sure there's no real objections.
@keith-zephyr there have been requests from people not on this call. Committer expectations defines 'small' and 'large' PRs. There's been a request to change the language to 'easily reviewed' PRs. Curious if anyone has strong opinions
@galak I'm with you because of the quantification. Also, if you got reviews, it was easy to review
@fabiobaltieri big and small is also not always about lines of code, but about complexity
@galak I know I've submitted small PRs that are a 2 line change, but the level of complexity to understand underlying assembly code involved etc is pretty high. I think the idea is to capture the general intent of separating things that are small or easy to review. I commented on PR last week where someone was requesting a rename in addition to other changes, saying that those are two separate PRs, either before or after the real change. Better to split things if possible.
@mbolivar-nordic (relay question to @MaureenHelm about whether we can document the dev-review triage process for PRs that need attention in this PR https://github.com/zephyrproject-rtos/zephyr/pull/52731)
@keith-zephyr I believe we've only gone through the process twice -- do we have enough to document and capture that, or hold off?
@MaureenHelm I think we decided last week to go ahead with what you had
@keith-zephyr I think the current content in the PR for that section is OK then -- request another full pass

zephyrproject-rtos / zephyr

Update process for complex features #50836

Introduction

44098 new sensor API

45370 new sensor subsystem

45601 USBC subsystem

Problem description

Proposed change

39194