zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.63k stars 6.51k forks source link

Update process for complex features #50836

Closed yperess closed 1 year ago

yperess commented 2 years ago

Introduction

The current workflow of having fully flushed out PRs for new features doesn't scale well. See for example PRs for:

I'd like us to revisit a more iterative approach that will still guarantee the same high quality PRs into the main branch.

Problem description

Some PRs (especially for new subsystems) get too big to review in order to meet the quality requested by Zephyr. This includes:

By the time a PR includes all of these, it usually involves so many changes, that the review becomes a multi-discussion-threaded monster that drags on for months and ends up with poor velocity for the contributors. One of the concerns is that until the new logic is developer read with the above requirements, it is difficult to communicate to developers that it is still a WIP and will likely change.

Proposed change

yperess commented 2 years ago

@sambhurst @keith-zephyr

stephanosio commented 2 years ago

cc @mbolivar-nordic @carlescufi @nashif @MaureenHelm @galak

aaronemassey commented 2 years ago

Yes please. This would help considerably in doing incremental development on newer APIs, such as a battery API, versus having to make a monolithic PR that has tests, a driver, the API, a sample board, etc.

@teburd What are your thoughts on this?

teburd commented 2 years ago

I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This makes reviewing significantly easier honestly and gives an author a chance to present how its useful. That's important if you ever want to have other people contributing back.

carlescufi commented 2 years ago

@yperess maybe you could bring this up in the TSC?

yperess commented 2 years ago

I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This makes reviewing significantly easier honestly and gives an author a chance to present how its useful. That's important if you ever want to have other people contributing back.

This will still be a hard requirement for meeting the feature branch into main. The idea here is mostly to increase velocity between the start of the idea and finish. Instead of developing it locally inside one organization with little public visibility and dropping one large PR on the Zephyr community, companies can publicly develop these features with an easier potential for collaboration.

@carlescufi I'll be there at the TSC next week

keith-zephyr commented 2 years ago

@yperess maybe you could bring this up in the TSC?

I think the Process WG is a better place to start the discussion on this topic than the full TSC.

mbolivar-nordic commented 2 years ago

@yperess if you can "champion" the issue within the process WG, I am happy to add it to next week's agenda.

mbolivar-nordic commented 2 years ago

Initial thoughts:

Jumping into the weeds, I don't follow the semantics of the proposed experiments key for west.yml files. Where does this go (next to projects and self? Inside self? etc.). And then the west update semantics look weird too. It's also not clear how you want west to cherry pick branches on top of zephyr itself for feature branches in the zephyr repository. By design, west avoids modifications of the manifest repository itself.

Zooming back out given that, I think we should try to settle on the workflow you want to enable before figuring out if we want new west features and what they are. I'm not clear on what having two experiments enabled at once looks like, for example, especially if they both touch zephyr and zephyr is the manifest repository.

teburd commented 2 years ago

I'm actually quite fond of the requirements while reviewing code. Having the bare minimum of some docs, a sample, and some tests makes reviewing easier. I don't get to spend all day reviewing. I have my own commitments to features/bug fixes. Samples and docs are a nice way of showing how the API is meant to be used and why. For larger PRs I usually start immediately at the samples and docs, followed by test cases to try and understand what it is this large blob of code is trying to do/solve/fix/enable. This makes reviewing significantly easier honestly and gives an author a chance to present how its useful. That's important if you ever want to have other people contributing back.

This will still be a hard requirement for meeting the feature branch into main. The idea here is mostly to increase velocity between the start of the idea and finish. Instead of developing it locally inside one organization with little public visibility and dropping one large PR on the Zephyr community, companies can publicly develop these features with an easier potential for collaboration.

Can the project really manage feature branches though in one repository like this? It's been tried before in another form and didn't really work out.

I guess my question here would be if the goal is to build a collaborative place to work on code, e.g. a new sensor API, perhaps the linux model of a fork where collaborators can review and merge each others code, then polish it up on a branch and PR that to the upstream project.

Any sort of rebasing/cherry-picking/merging work can be maintained by the authors then with normal git usages. I'm not sure how west doing some cherry picking would work out. This seems like something the authors of a feature should be dealing with, rebasing as they go.

yperess commented 1 year ago

Blank diagram (2)

Here's a rough diagram of the flow I'm expecting. Having the experiments in the west yaml will trigger the following when calling west update:

  1. git fetch
  2. git checkout origin/main
  3. git pull --ff origin/sensors_v2
  4. git pull --ff origin/sensor_subsys

This enables the team working on this to be able to keep the work in upstream Zephyr (Zephyr owns the individual commits still). This way history is not lost. When we make the PR to merge the feature branch into Zephyr, that merge can be done as a squash merge. This makes sure that nobody can checkout a bad commit in the feature branch through main.

@mbolivar-nordic I blocked off the time for the process WG. I'll be there.

pdunaj commented 1 year ago

The current workflow of having fully flushed out PRs for new features doesn't scale well. See for example PRs for: ... I'd like us to revisit a more iterative approach that will still guarantee the same high quality PRs into the main branch.

I am not sure this is the only problem. You can have small PR that drags for months or complete solution that is blocked by one person because of "I don't like it". It's fine if somebody does not have better things to do, but if you develop product and have deadlines you will not wait months for code to get in. And after you have code in private repo you lose incentive to share it here, especially when you expect a long and tiering fight.

Personally, I would:

So, if something is widely accepted it goes to mainline. If something is questioned, but is not garbage, it goes into staging to allow it to become mature.

I personally avoid pushing anything here because it takes ridiculous amount of effort to be done :(

yperess commented 1 year ago

For the meeting today I want to highlight the focal points that this is trying to solve (for us):

  1. The review process takes too long and it's difficult to show velocity.
  2. Once the review started, developers generally want to keep working. Long reviews means that if I add more commits ontop of my work, I'll be adding them to the same PR (since GitHub does per branch review and not per commit). This just complicates things further (Gerrit solves this by doing a per commit review).
  3. PRs don't retain good history. If I get actionable feedback, I'll amend a commit. Meaning I have to force push. At this point, a lot of comments are no longer relevant or can't be mapped to a specific line of code. If we want to keep that history, we would need to add commits to the PR to address the comments, but that also means that it's likely that some commits will be "broken" and the PR should be merged with a squash.
  4. After a force push, I often have to re-review the whole change instead of just the delta because of the issue above

While I just flat out dislike GitHub, I don't see us moving away from it. I believe that feature branches with experiments will at least alleviate the above pain points.

mbolivar-nordic commented 1 year ago

Process WG:

yperess commented 1 year ago

Update

I would like to propose that we don't try to get this perfect on the first go. Lets come up with some guidelines that we think might work and have a test run at the process with a retrospective and some checkpoints along the way. I'd be happy to use the sensors v2 work as that case study. My current thinking is:

  1. Get a topic branch created
  2. Merge commits that pass CI but may break features
  3. When the topic branch contains a fully implemented feature, at least 1 sample, tests, and documentation create a PR into main
  4. As comments come in on the PR add commits to the topic branch (no force pushing will be allowed)
  5. When the PR is ready, we can discuss how it will be merged. My gut feeling is that it'll be too difficult to refactor in a reasonable time frame and we'd end up squash merging the PR then locking the topic branch (but this can be hashed out later).

I believe that what we'll see is that it's much easier to re-review the PR without force pushing since you can just look at the new commits since your comment and see how things changed since you asked for changes. I believe that this will out-weigh the downsides of the squash merge (but again, we can figure that out when the PR is ready to be merged).

mbolivar-nordic commented 1 year ago

Process WG: no consensus, continue discussion. Strong opposition to squash merges was voiced from multiple participants.

fabiobaltieri commented 1 year ago

Is there a real difference between a proper upstream topic branch and how PRs are implemented in GitHub? If anything having a branch that hosts "dirty" changes and have known broken intermediate states feels like it may make reviewing the whole change even harder and increase the potential of submitting bad code. Imagine this situation: you do the work broken into coherent patches to represent development history and development phases, then you keep doing incremental modifications and change everything in small bits for the sake of faster iterations. In the meantime you still have to periodically sync up against main, otherwise you can't pull back. The resulting branch is now impossible to review to see if the change is coherent, the only thing one could do is review a diff against main, which is not really something you can leave comment on anymore.

I think that this should be tackled differently: the problem is that some PRs are too large, they include a mix of features, small changes, refactoring, documentation samples etc... How about this: when we work on a PR like that, the author and reviewers work together to split it into multiple PRs and merge it a bit at a time -- this would be the kernel equivalent of sending a series on a mailing list, and have the maintainer accepting some of the patches and asking for changes in other. That way we would have a proper progression on the change and reduce the size of the problem a bit at a time, which should make the development cycle faster.

gmarull commented 1 year ago

Another idea here:

Nowadays, one can develop everything in a module: custom boards, APIs, bindings, tests, etc. An experimental feature could start as a module (flagged as experimental, not pulled by default) and once mature enough, be integrated in the main branch. Or even stay as a module if preferred. I think that this would allow making progress much faster, even have 2 competing implementations people can play with, etc. We could even have an entire platform support in a module: soc/drivers/dts, etc.

keith-zephyr commented 1 year ago

Process WG: no consensus, continue discussion. Strong opposition to squash merges was voiced from multiple participants.

  • @MaureenHelm: should move towards incremental development of breaking large changes into smaller PRs

I'm in favor of pivoting the discussion in this direction. Regardless of whether we employ topic branches, smaller more incremental PRs will make reviewing easier.

Some thoughts on this:

mbolivar-nordic commented 1 year ago

Process WG:

galak commented 1 year ago

A few comments:

  1. Document that we should ask PRs that are associated with a RFC always put a reference to that RFC when being submitted
  2. Document that if a PR is taking a while to merge, that reviewers/maintainers request submitter to pull out commits that are concerned acceptable into a separate PR to reduce amount of re-review, making progress, etc.
  3. For new APIs we should request that the API is limited to what is needed and can be demonstrated in a sample or test.
fabiobaltieri commented 1 year ago

For new subsystem or features, should we have a process to break down the feature into areas and manually assign a set of approvers to decide when the initial implementation is "good enough" for merging? For the usbc case for example it could have made sense to explicitly nominate someone to approve for usb, devicetree, device implementation (stm32 in that case), that way it would be clear who is driving the review. One active case is the zbus PR (https://github.com/zephyrproject-rtos/zephyr/pull/48509), few people contributed a review and approved at some stage, but there's no clear set of reviewers that would have to approve for the initial implementation to go, so everyone is waiting for the next round of review to be applied, for the latest reviewer to approve, then someone does another pass and the cycles continue.

Maybe a process could be that as part of the RFC we identify the affected areas for the change, and then when the RFC gets discussed in the Architecture WG we assign someone to every area (a mainainer or collaborator) that commits to track the PR, drive the review and approve it, trying to keep iterations as quick as possible.

keith-zephyr commented 1 year ago

Looking at the documentation again, there appears to be 3 areas of the documentation that intersect with "large PRs"

  1. Contribution Guidelines
  2. Proposals and RFCs
  3. API Lifecycle

The Contribution Guidelines is massive - 4500 words and needs some cleanup. The Other Commit Expectations has some of the clearest guidance, but needs some expansion:

The Commit Expectations could possibly get promoted up a level so it's not lost among the large Contribution Guidelines. Something like the Chromium Firmware Guidelines could be a model. Google's Small CLs doc also has good content.

The Proposal and RFCs document should encourage the author to define a minimum viable implementation. Once the minimum implementation merges, it makes collaboration easier. I like the proposal from @fabiobaltieri of using the RFC to designate assignees for various blocks.

The API lifecycle is the only place I see that talks about providing documentation, one implementation, and one sample (but omits mentioning tests). We could also add these requirements directly to the Commit Expectations, and just point the API and RFC docs back to the Commit Expectations.

mbolivar-nordic commented 1 year ago

Process WG:

Summary: general consensus on @keith-zephyr and @fabiobaltieri 's proposals, @keith-zephyr to start work on updating the docs

marc-hb commented 1 year ago

PRs don't retain good history. If I get actionable feedback, I'll amend a commit. Meaning I have to force push. At this point, a lot of comments are no longer relevant or can't be mapped to a specific line of code. If we want to keep that history, we would need to add commits to the PR to address the comments, but that also means that it's likely that some commits will be "broken" and the PR should be merged with a squash. After a force push, I often have to re-review the whole change instead of just the delta because of the issue above While I just flat out dislike GitHub, I don't see us moving away from it.

For the record Github's review model and limitations were discussed one year ago in

we've decided not to replace GitHub, but there is still an open question of how to improve the review process in general.

mbolivar-nordic commented 1 year ago

Process WG: work is ongoing, revisit next week

mbolivar-nordic commented 1 year ago

Process WG:

mbolivar-nordic commented 1 year ago

Process WG:

Consensus on testing requirements: all API functions should have test cases, and there should be tests for behavior contracts, along with examples for where it's been done well. Leave it to reviewer discretion for now, knowing that a better solution is desirable, but this is our starting point.

mbolivar-nordic commented 1 year ago

Process WG:

keith-zephyr commented 1 year ago

Update for 2023-01-03. I will post a new version of the Committer/Reviewer expectations shortly. I won't be able to attend the Process meeting on 2023-01-04, so please defer discussion until next week.

keith-zephyr commented 1 year ago

Update for 2023-01-11: New version has been posted: https://github.com/zephyrproject-rtos/zephyr/pull/52731

I welcome comments on the entire doc, but the areas I want focus the process meeting discussion include:

I'm happy if we get closure on just one item this week, so that there is time to discuss other issues.

mbolivar-nordic commented 1 year ago

Process WG: skipped due to lack of time

mbolivar-nordic commented 1 year ago
nashif commented 1 year ago

found this filter useful to see where the issues with regard to no-reviews. It might help us zero down on the problems and act on stale PRs due to lack of reviews or engagement.

One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable.

This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC.

I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.

@keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?

MaureenHelm commented 1 year ago

found this filter useful to see where the issues with regard to no-reviews. It might help us zero down on the problems and act on stale PRs due to lack of reviews or engagement.

One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable.

This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC.

I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.

@keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?

I like it.

keith-zephyr commented 1 year ago

found this filter useful to see where the issues with regard to no-reviews. It might help us zero down on the problems and act on stale PRs due to lack of reviews or engagement. One option we have is to load this type of filter or something similar in the dev-review on a weekly basis and go through PRs not getting attention and do some initial community review, assign maintainers where things have been missed, ping unresponsive maintainers and eventually even review and drive to merge where applicable. This is going to be a proactive measure and we will not have to wait for someone going through the escalation path. Some escalation path is needed still to bring corner cases to the attention of the maintainers and finally the TSC. I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work. @keith-zephyr @mbolivar-nordic @MaureenHelm what do you think?

I like it.

Agreed - I really like this proposal.

mbolivar-nordic commented 1 year ago

I would suggest to go through such filter in the process meeting first, put some rules in place for the process of going through this in the dev-review meeting and see how this will work.

This will give a whole new meaning to dev-review...

Sure, let's try it!

mbolivar-nordic commented 1 year ago

Process WG:


keith-zephyr commented 1 year ago

@keith-zephyr - I have a conflict for the meeting 2023-02-01. So please defer discussion to next week.

mbolivar-nordic commented 1 year ago

Process WG: