Closed yperess closed 1 year ago
@sambhurst @keith-zephyr
cc @mbolivar-nordic @carlescufi @nashif @MaureenHelm @galak
Yes please. This would help considerably in doing incremental development on newer APIs, such as a battery API, versus having to make a monolithic PR that has tests, a driver, the API, a sample board, etc.
@teburd What are your thoughts on this?
I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This honestly makes reviewing significantly easier and gives an author a chance to present how it's useful. That's important if you ever want to have other people contributing back.
@yperess maybe you could bring this up in the TSC?
This will still be a hard requirement for merging the feature branch into main. The idea here is mostly to increase velocity between the start of the idea and the finish. Instead of developing it locally inside one organization with little public visibility and dropping one large PR on the Zephyr community, companies can publicly develop these features with an easier potential for collaboration.
@carlescufi I'll be there at the TSC next week
I think the Process WG is a better place to start the discussion on this topic than the full TSC.
@yperess if you can "champion" the issue within the process WG, I am happy to add it to next week's agenda.
Initial thoughts:
Jumping into the weeds, I don't follow the semantics of the proposed experiments key for west.yml files. Where does this go (next to projects and self? Inside self? etc.). And then the west update semantics look weird too. It's also not clear how you want west to cherry pick branches on top of zephyr itself for feature branches in the zephyr repository. By design, west avoids modifications of the manifest repository itself.
Zooming back out given that, I think we should try to settle on the workflow you want to enable before figuring out if we want new west features and what they are. I'm not clear on what having two experiments enabled at once looks like, for example, especially if they both touch zephyr and zephyr is the manifest repository.
Can the project really manage feature branches though in one repository like this? It's been tried before in another form and didn't really work out.
I guess my question here would be: if the goal is to build a collaborative place to work on code, e.g. a new sensor API, perhaps the Linux model would fit: a fork where collaborators can review and merge each other's code, then polish it up on a branch and PR that to the upstream project.
Any sort of rebasing/cherry-picking/merging work can then be maintained by the authors with normal git usage. I'm not sure how west doing some cherry picking would work out. This seems like something the authors of a feature should be dealing with, rebasing as they go.
Here's a rough diagram of the flow I'm expecting. Having the experiments in the west yaml will trigger the following when calling west update:
git fetch
git checkout origin/main
git pull --ff origin sensors_v2
git pull --ff origin sensor_subsys
This enables the team working on this to keep the work in upstream Zephyr (Zephyr still owns the individual commits). This way history is not lost. When we make the PR to merge the feature branch into Zephyr, that merge can be done as a squash merge. This makes sure that nobody can check out a bad commit in the feature branch through main.
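As a rough illustration of the flow described above, here is a minimal Python sketch that only builds the list of git commands for a given set of enabled experiment branches. The function name, its parameters, and the idea of passing experiments as a plain list are assumptions made for illustration; this is not an existing west feature. The sketch passes the remote and the branch to git pull as separate arguments, which is the form git pull expects.

```python
from typing import List


def experiment_update_commands(experiments: List[str],
                               remote: str = "origin",
                               base: str = "main") -> List[str]:
    """Return, in order, the git commands the described flow would run.

    Hypothetical helper: it only builds command strings and runs nothing.
    """
    commands = [
        "git fetch",
        f"git checkout {remote}/{base}",
    ]
    for branch in experiments:
        # One pull per enabled experiment branch, applied on top of the base.
        commands.append(f"git pull --ff {remote} {branch}")
    return commands


if __name__ == "__main__":
    for command in experiment_update_commands(["sensors_v2", "sensor_subsys"]):
        print(command)
```

Keeping the command generation separate from execution makes it easy to see what enabling two experiments at once would mean, which is one of the open questions raised earlier in the thread.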
@mbolivar-nordic I blocked off the time for the process WG. I'll be there.
The current workflow of having fully fleshed-out PRs for new features doesn't scale well. See for example PRs for: ... I'd like us to revisit a more iterative approach that will still guarantee the same high quality PRs into the main branch.
I am not sure this is the only problem. You can have a small PR that drags on for months, or a complete solution that is blocked by one person because of "I don't like it". It's fine if somebody does not have better things to do, but if you develop a product and have deadlines, you will not wait months for code to get in. And once you have the code in a private repo, you lose the incentive to share it here, especially when you expect a long and tiring fight.
Personally, I would:
So, if something is widely accepted it goes to mainline. If something is questioned, but is not garbage, it goes into staging to allow it to become mature.
I personally avoid pushing anything here because it takes a ridiculous amount of effort to get it done :(
For the meeting today I want to highlight the focal points that this is trying to solve (for us):
While I just flat out dislike GitHub, I don't see us moving away from it. I believe that feature branches with experiments will at least alleviate the above pain points.
Process WG:
Update
I would like to propose that we don't try to get this perfect on the first go. Let's come up with some guidelines that we think might work and have a test run at the process with a retrospective and some checkpoints along the way. I'd be happy to use the sensors v2 work as that case study. My current thinking is:
main
I believe that what we'll see is that it's much easier to re-review the PR without force pushing, since you can just look at the new commits since your comment and see how things changed since you asked for changes. I believe that this will outweigh the downsides of the squash merge (but again, we can figure that out when the PR is ready to be merged).
Process WG: no consensus, continue discussion. Strong opposition to squash merges was voiced from multiple participants.
Is there a real difference between a proper upstream topic branch and how PRs are implemented in GitHub? If anything, having a branch that hosts "dirty" changes and has known-broken intermediate states feels like it may make reviewing the whole change even harder and increase the potential of submitting bad code. Imagine this situation: you do the work broken into coherent patches to represent development history and development phases, then you keep doing incremental modifications and change everything in small bits for the sake of faster iterations. In the meantime you still have to periodically sync up against main, otherwise you can't pull back. The resulting branch is now impossible to review to see if the change is coherent; the only thing one could do is review a diff against main, which is not really something you can leave comments on anymore.
I think that this should be tackled differently: the problem is that some PRs are too large, they include a mix of features, small changes, refactoring, documentation, samples, etc. How about this: when we work on a PR like that, the author and reviewers work together to split it into multiple PRs and merge it a bit at a time. This would be the kernel equivalent of sending a series on a mailing list and having the maintainer accept some of the patches and ask for changes in others. That way we would have a proper progression on the change and reduce the size of the problem a bit at a time, which should make the development cycle faster.
Another idea here:
Nowadays, one can develop everything in a module: custom boards, APIs, bindings, tests, etc. An experimental feature could start as a module (flagged as experimental, not pulled by default) and once mature enough, be integrated in the main branch. Or even stay as a module if preferred. I think that this would allow making progress much faster, even have 2 competing implementations people can play with, etc. We could even have an entire platform support in a module: soc/drivers/dts, etc.
Process WG: no consensus, continue discussion. Strong opposition to squash merges was voiced from multiple participants.
- @MaureenHelm: should move towards incremental development of breaking large changes into smaller PRs
I'm in favor of pivoting the discussion in this direction. Regardless of whether we employ topic branches, smaller more incremental PRs will make reviewing easier.
Some thoughts on this:
Process WG:
A few comments:
For new subsystems or features, should we have a process to break down the feature into areas and manually assign a set of approvers to decide when the initial implementation is "good enough" for merging? For the usbc case, for example, it could have made sense to explicitly nominate someone to approve for usb, devicetree, and device implementation (stm32 in that case); that way it would be clear who is driving the review. One active case is the zbus PR (https://github.com/zephyrproject-rtos/zephyr/pull/48509): a few people contributed a review and approved at some stage, but there's no clear set of reviewers that would have to approve for the initial implementation to go in. So everyone is waiting for the next round of review to be applied and for the latest reviewer to approve, then someone does another pass and the cycles continue.
Maybe a process could be that as part of the RFC we identify the affected areas for the change, and then when the RFC gets discussed in the Architecture WG we assign someone to every area (a maintainer or collaborator) who commits to track the PR, drive the review and approve it, trying to keep iterations as quick as possible.
Looking at the documentation again, there appear to be 3 areas of the documentation that intersect with "large PRs"
The Contribution Guidelines is massive - 4500 words and needs some cleanup. The Other Commit Expectations has some of the clearest guidance, but needs some expansion:
The Commit Expectations could possibly get promoted up a level so it's not lost among the large Contribution Guidelines. Something like the Chromium Firmware Guidelines could be a model. Google's Small CLs doc also has good content.
The Proposal and RFCs document should encourage the author to define a minimum viable implementation. Once the minimum implementation merges, it makes collaboration easier. I like the proposal from @fabiobaltieri of using the RFC to designate assignees for various blocks.
The API lifecycle is the only place I see that talks about providing documentation, one implementation, and one sample (but omits mentioning tests). We could also add these requirements directly to the Commit Expectations, and just point the API and RFC docs back to the Commit Expectations.
Process WG:
Summary: general consensus on @keith-zephyr's and @fabiobaltieri's proposals, @keith-zephyr to start work on updating the docs
PRs don't retain good history. If I get actionable feedback, I'll amend a commit. Meaning I have to force push. At this point, a lot of comments are no longer relevant or can't be mapped to a specific line of code. If we want to keep that history, we would need to add commits to the PR to address the comments, but that also means that it's likely that some commits will be "broken" and the PR should be merged with a squash.
After a force push, I often have to re-review the whole change instead of just the delta because of the issue above.
While I just flat out dislike GitHub, I don't see us moving away from it.
For the record Github's review model and limitations were discussed one year ago in
we've decided not to replace GitHub, but there is still an open question of how to improve the review process in general.
Process WG: work is ongoing, revisit next week
Process WG:
Process WG:
Consensus on testing requirements: all API functions should have test cases, and there should be tests for behavior contracts, along with examples for where it's been done well. Leave it to reviewer discretion for now, knowing that a better solution is desirable, but this is our starting point.
native_posix with codecov. If a feature is not testable with that board, you won't get coverage results. QEMUs are supported as well; I think currently x86 and arm; others should be able to be supported as well. But the way you implement APIs there will be totally different for a real application. Several PRs and issues opened over this.
Process WG:
#pr-help as escalation)
#pr-help has been going well
gh pr list $argv --limit 120 --search "reviewed-by:@me -review:changes-requested is:open" --json title,number,id,reviews,reviewDecision,milestone -t
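To illustrate what can be done with the JSON that a gh pr list --json query like the one above returns, here is a small Python sketch that groups PRs by their reviewDecision field. The sample records (numbers and titles) are made up, the "NONE" label for an empty decision is my own choice, and the sketch assumes the field is empty when no decision exists. The truncated -t template from the command above is not reconstructed here.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Hypothetical sample, shaped like `gh pr list --json number,title,reviewDecision`.
SAMPLE = """
[
  {"number": 101, "title": "drivers: add foo sensor", "reviewDecision": "REVIEW_REQUIRED"},
  {"number": 102, "title": "doc: fix typo", "reviewDecision": "APPROVED"},
  {"number": 103, "title": "tests: add bar coverage", "reviewDecision": ""}
]
"""


def group_by_review_decision(raw: str) -> Dict[str, List[int]]:
    """Map each review decision to the PR numbers that carry it."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for pr in json.loads(raw):
        # Treat a missing or empty decision as "NONE" (label chosen for this sketch).
        decision = pr.get("reviewDecision") or "NONE"
        groups[decision].append(pr["number"])
    return dict(groups)


if __name__ == "__main__":
    for decision, numbers in group_by_review_decision(SAMPLE).items():
        print(decision, numbers)
```

A grouping like this could make it easier to spot PRs that are still waiting on a first review.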
Update for 2023-01-03. I will post a new version of the Committer/Reviewer expectations shortly. I won't be able to attend the Process meeting on 2023-01-04, so please defer discussion until next week.
Update for 2023-01-11: New version has been posted: https://github.com/zephyrproject-rtos/zephyr/pull/52731
I welcome comments on the entire doc, but the areas I want to focus the process meeting discussion on include:
I'm happy if we get closure on just one item this week, so that there is time to discuss other issues.
Process WG: skipped due to lack of time
#pr-help
not pinging a maintainer
#pr-help, 2. dev review. Anything else?
found this filter useful to see where the issues are with regard to no-reviews. It might help us zero in on the problems and act on stale PRs due to lack of reviews or engagement.
One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable.
This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC.
I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.
@keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?
I like it.
Agreed - I really like this proposal.
I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.
This will give a whole new meaning to dev-review...
Sure, let's try it!
Process WG:
sorted:created-asc (https://github.com/zephyrproject-rtos/zephyr/pulls?q=is%3Apr+is%3Aopen+review%3Anone+draft%3Ano+sort%3Acreated-asc). I don't think anyone is suggesting an additional offline triage.
@keith-zephyr - I have a conflict for the meeting 2023-02-01. So please defer discussion to next week.
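The unreviewed-PR filter link quoted above is just a URL-encoded list of GitHub search qualifiers. As a minimal sketch, the following Python helper rebuilds such a link from its qualifiers so the filter can be adjusted (for example for a weekly dev-review pass). The function name and its default repo argument are illustrative assumptions.

```python
from typing import Iterable
from urllib.parse import urlencode


def pr_search_url(qualifiers: Iterable[str],
                  repo: str = "zephyrproject-rtos/zephyr") -> str:
    """Build a GitHub pull request search URL from search qualifiers."""
    query = " ".join(qualifiers)
    # urlencode percent-encodes ':' and turns spaces into '+'.
    return f"https://github.com/{repo}/pulls?" + urlencode({"q": query})


if __name__ == "__main__":
    # Qualifiers taken from the link in the thread: open, non-draft PRs
    # with no reviews, oldest first.
    print(pr_search_url(
        ["is:pr", "is:open", "review:none", "draft:no", "sort:created-asc"]
    ))
```

With the qualifiers shown, the helper should reproduce the link from the thread; swapping a qualifier gives a different view of the same queue.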
Process WG:
#pr-help and the dev-review meeting). Will get that captured. Otherwise I think this is getting pretty close, and I would ask everyone to give this another pass and make sure there are no real objections.
Introduction
The current workflow of having fully fleshed-out PRs for new features doesn't scale well. See for example PRs for:
44098 new sensor API
45370 new sensor subsystem
45601 USBC subsystem
I'd like us to revisit a more iterative approach that will still guarantee the same high quality PRs into the main branch.
Problem description
Some PRs (especially for new subsystems) get too big to review in order to meet the quality requested by Zephyr. This includes:
By the time a PR includes all of these, it usually involves so many changes that the review becomes a multi-discussion-threaded monster that drags on for months and ends up with poor velocity for the contributors. One of the concerns is that until the new logic is developer ready with the above requirements, it is difficult to communicate to developers that it is still a WIP and will likely change.
Proposed change
west.yaml manifest which will allow projects to enable experiments. When enabled, west update will cherry-pick the corresponding branch on top of Zephyr
feature-X for this example)
main is the responsibility of the experiment developers
main.