Working with Github Pull Requests

snabbco / snabb

Snabb: Simple and fast packet networking

Apache License 2.0

Working with Github Pull Requests #725

Open lukego opened 8 years ago

lukego commented 8 years ago

... inspired by #702 here are two ideas and one non-idea for improving the workflow for Github Pull Requests: marking the intention of a PR, marking the person responsible for merging, and (non-idea) marking the next step needed for the PR.

PR intent

In theory a Pull Request has a clear intent: please pull the changes from my branch onto the target branch. In practice, though, we use (misuse?) PRs to the master branch as a general way to get attention on code, e.g. for "Here is something that I am working on" or "Here is a really rough idea". Problems arise when the intent of a PR is misunderstood. In particular, this can lead to branches being merged too early ("oh, I thought that was finished") or too late ("oh, I thought that was still a work in progress").

The idea of solving this is to use textual tags at the start of the PR title:

[sketch] marks a throw-away branch for exploring an idea. This branch is never to be merged. If the idea pans out then it will be developed properly on a new branch with a separate PR. Code review will focus on the idea.
[wip] marks a work-in-progress branch. This branch is intended to be developed to completion without any rebasing, and you can merge it if you want to, but it is probably not working yet. Code review will focus on the design.
If there is no tag then it means the author wants the changes to be merged into other branches. Code review will focus on any barriers to merging.
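The tag convention above could be checked mechanically. Here is a minimal Python sketch, assuming the tag sits at the very start of the title; the function name `pr_intent` and the label "merge" for untagged titles are invented for illustration and are not part of any GitHub or Snabb tooling.

```python
import re

# The tag names come from the proposal above; everything else here
# (function name, the "merge" label) is an illustrative assumption.
_TAG = re.compile(r"^\s*\[(sketch|wip)\]", re.IGNORECASE)

# What code review concentrates on for each intent, as listed above.
REVIEW_FOCUS = {
    "sketch": "the idea",
    "wip": "the design",
    "merge": "any barriers to merging",
}


def pr_intent(title: str) -> str:
    """Return 'sketch', 'wip', or 'merge' (no tag) for a PR title."""
    match = _TAG.match(title)
    return match.group(1).lower() if match else "merge"
```

A reviewer-side script could then look up `REVIEW_FOCUS[pr_intent(title)]` to remind itself what kind of feedback is being asked for.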

The reason to choose textual tags in the title, rather than Labels, is to avoid the whole Github permissions thing i.e. use a field that the PR creator always controls.

If you think a PR is mislabeled then you can ask the creator to consider renaming it.

Responsible maintainer

For each PR it should be clear who (if anybody) is responsible for bringing this upstream. We can all merge PR branches into our own branches but it should always be clear what is the "next hop" towards the master branch. The choice of next hop depends on the subject matter of the PR i.e. who is the responsible maintainer for this kind of change.

The idea is that each PR will be assigned to a specific "topic" by a maintainer. For each topic there will be an entry in branches.md saying what is the next-hop branch and who maintains that.

The topics could be (looking slightly ahead of current practice):

next for collections of reviewed/accepted changes that should go into the next release.
intel for changes to device drivers for Intel cards.
doc for documentation changes.
ports for portability targeting new platforms.
general for changes that don't match a more specific topic.

Once the topic is assigned then contributors will know what to expect next. If a PR is marked general then branches.md will tell you the next step is for @eugeneia to merge it onto max-next. If a PR is marked ports then branches.md will tell you the next step is for somebody to volunteer to maintain a portability branch i.e. there is no upstream step available yet.
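To make the lookup concrete, here is a small Python sketch of what reading branches.md could amount to. The layout of branches.md is not defined here, so the table structure and the names `NextHop`, `BRANCHES` and `next_step` are assumptions; only the two entries spelled out above are filled in.

```python
from typing import NamedTuple, Optional


class NextHop(NamedTuple):
    branch: Optional[str]      # None: no upstream branch exists yet
    maintainer: Optional[str]  # None: nobody has volunteered yet


# Only the two topics described in the text. A real branches.md would
# list every topic; this table layout is an assumption.
BRANCHES = {
    "general": NextHop("max-next", "@eugeneia"),
    "ports": NextHop(None, None),
}


def next_step(topic: str) -> str:
    """Describe what a contributor should expect next for a topic."""
    hop = BRANCHES.get(topic)
    if hop is None:
        return f"topic '{topic}' is not listed in branches.md"
    if hop.branch is None:
        return f"waiting for a volunteer to maintain a '{topic}' branch"
    return f"{hop.maintainer} merges it onto {hop.branch}"
```

The point of the `None` entries is that an unmaintained topic is still a valid answer: the PR is parked under that topic rather than rejected.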

I am especially keen on the capability to flag changes as belonging to a topic that is not currently maintained. This would allow us to neatly collect changes for such topics until a maintainer steps forwards. For example POSIX-compliance changes like #723 (and many before) could be neatly collected until such time as somebody decides to maintain a branch for that topic.

The reason to use labels is that they are clearly visible and easy for maintainers to assign. I suppose that the "target branch" of the Pull Request would be a more appropriate place to record this, but I am not sure how easy that is to change or how likely people are to make the right choice when creating their PR.

Next step

It would also seem nice if the next step for a PR were immediately apparent e.g. is the submitter waiting on the reviewer or vice-versa. I don't have a simple idea for solving this though. Perhaps a convention that people try to be clear in discussions on exactly what they expect from the other party e.g. saying things like "I have addressed all the review comments above and now I am waiting for merge."

lukego commented 8 years ago

Sorry to write this idea up with so many words!

wingo commented 8 years ago

Thanks for this! More consideration of the intent of a PR sounds good. Since a PR is the only good way in Github to talk about a potential change to code, I don't think using them for sketches or WIP work is mis-use in any way. It does sound like explicitly tagging a PR with an intent can help the reviewer know what is being asked of them. I suppose it's possible for a PR to change intents as it goes, so what is being asked of the reviewer can change too.

To change the target branch of a PR, you have to close it and create a new one. I don't think it is unreasonable to ask the submitter to choose the target branch appropriately, nor is it unreasonable for a reviewer to see a change and say "this looks more appropriate for XXX branch, would you mind closing this one and creating a new PR for that branch, see foo.md". That would avoid having to reify the concept of topics anywhere :)

One more concrete question: Is the intention for all commits to go through max-next? I have been making my PRs against next. I'm happy to change, though I wonder if we're not putting too much of a burden on @eugeneia to have to be the sole reviewer for all things Snabb.

lukego commented 8 years ago

> One more concrete question: Is the intention for all commits to go through max-next? I have been making my PRs against next. I'm happy to change, though I wonder if we're not putting too much of a burden on @eugeneia to have to be the sole reviewer for all things Snabb.

We are bootstrapping a graph/tree structure for merges (like the Linux kernel has).

Here is how it should be:

PRs are "sharded" across target branches based on their subject matter (driver change, doc change, portability change, collection of changes already accepted by a particular person, etc).
Target branch has one responsible person who reviews/tweaks/merges changes and then sends them in a new PR to the next hop upstream (e.g. next).
The graph scales up by adding more nodes i.e. branches that are being maintained by somebody. These could be leaf nodes (where new PRs are sent directly) or internal nodes (like next that exists to combine downstream nodes).
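The structure described above can be modelled as a table of next hops. The following Python sketch is only an illustration: "max-next" and "next" are branches named in this thread, while "doc-fixes" is a hypothetical leaf branch and `route_to_master` is an invented helper.

```python
from typing import Dict, List

# Each branch points at its next hop upstream; master is the root.
# "max-next" and "next" are named in this thread; "doc-fixes" is a
# hypothetical leaf branch added purely for illustration.
UPSTREAM: Dict[str, str] = {
    "doc-fixes": "max-next",
    "max-next": "next",
    "next": "master",
}


def route_to_master(branch: str) -> List[str]:
    """Return the chain of branches a change passes through to reach master."""
    path = [branch]
    while path[-1] != "master":
        nxt = UPSTREAM.get(path[-1])
        # Stop on a dead end or a cycle instead of looping forever.
        if nxt is None or nxt in path:
            raise ValueError(f"{path[-1]} has no route to master")
        path.append(nxt)
    return path
```

Growing the graph is then just adding an entry to the table: a new leaf node for a new topic, or a new internal node that other branches point at.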

Here is a picture of how this could look in practice one day:

[Image: scan_20160122_0001]

In that case new PRs are entering the flow on the right-hand side and flowing with the current towards the master branch.

However, we are not there yet and we need help from people who will volunteer to maintain such a branch!

Last year we had:

master <- next

i.e. just me deciding what to merge and when. This kind-of worked before my second child was born :) and then I became overloaded and releases did not flow so smoothly.

The structure as of this month is:

master <- next <- max-next

which is one step forwards. On the one hand the same bottleneck exists (but it's Max instead of me) but on the other hand there are two of us approving all of the changes now and that will hopefully decrease the work we each do (we'll see...)

However: we really do need more maintainers who are willing to have changes on a particular topic PR'd to them that they will review/merge and then push onwards to the next hop (e.g. next or max-next).

Help wanted :-)

lukego commented 8 years ago

... well another nice aspect of having both next and max-next is that it allows the graph to grow in more ways i.e. somebody could agree with me to push changes to next or they could agree with Max to push changes to max-next. And once those agreements are made then it can continue to grow with people individually making these arrangements.

This is my understanding of how the kernel (and also e.g. QEMU) scales its development.

eugeneia commented 8 years ago

Regarding PR target branch: Currently I would recommend always targeting master because it reflects what the CI is doing (it will try to merge with master and test the result). Resolving merge conflicts when merging into a release candidate branch like max-next is the responsibility of that respective branch's maintainer.

lukego commented 8 years ago

@eugeneia @wingo re "targeting" there are two independent issues:

Which branch you base your work on. This should be master if possible i.e. a stable common ancestor of all other up-to-date branches.
Which branch you send your Pull Request to. This seems more social than technical i.e. signals who you are asking to review and merge your changes (which are based on master, not that person's own branch, except when exceptions are needed).

wingo commented 8 years ago

I don't understand the testing story with master and next; thank you for bearing with my ignorance :) For example, in V8 there is no "next" branch. Everything is done on master, and sometimes things branch off for release. Before the release, master could be frozen, to allow for more extensive testing. If I make a change, I make it against the latest code, and I test it before and after. I validate it and the bots validate it and if it's reviewed it can go in. If later it turns out to not be a good change it's reverted. All browser engines that I've worked on work this way.

OK, so that's not our model: fine. However I don't really get the testing story for us. If changes accumulate in these big buffers like max-next, how can I ever hope either (a) to test the code on my own against the proper corresponding tip-of-tree if I base the PR on master, or (b) know that when merged that my code would have a particular effect or not? I especially wouldn't want the person doing the merge to have to fix things up, because they might make a mistake. I would prefer them to reject the merge in that case!

wingo commented 8 years ago

Basically I get that this branching structure reduces the burden on you, @lukego, and I really respect that :) I also like that releases from master roll out every month in a really regular way, and I think @eugeneia is doing some great reviewing and release management there. So as a user of the main snabb trees things are great. When I try to contribute though I see more and more barriers. I don't know how to test my changes to make sure they are correct, and now it seems we have serialized the review process through first {Max then you, topic branch then next}, instead of parallelizing. Dunno. I am happy to maintain a topic branch but I don't see how my work would avoid conflicting with Max's -- if Max has ownership on all core/ and lib/ changes and also core apps, well, there's nothing else besides specialized apps, right?

lukego commented 8 years ago

Great pointed questions, @wingo!

lukego commented 8 years ago

@wingo Some responses. Note that I am learning as we go along, i.e. we are incrementally adopting practices from the Linux kernel workflow as we need them to scale up.

Thoughts:

The "where the next release is being debugged" branch is next. Changes should land here quickly for testing and integration with each other. That has not been the case recently and this is the acute problem we need to fix now.

For many people I think it makes life easier to be able to base their changes on the master branch (a stable base) and have somebody else worry about merging it with other changes happening in parallel. That's why I recommend it as the default behaviour for people adding features.

I believe Linus also asks people not to pull the tip of his tree into their development branches and to prefer building on a stable release tag. The master branch serves this role for Snabb i.e. you can assume that whatever you pull from master is the latest release that is recommended as the basis for new work.

> now it seems we have serialized the review process through first {Max then you, topic branch then next}, instead of parallelizing

I see this as N-way parallelism where N=2 and one node is reserved for coordination :-). I would like to solve this by increasing N i.e. having more maintainers.

I also hope that we have pipelined the merge process already. Max is reviewing/merging changes more quickly than I did, which is fantastic, and I still get an editorial voice on "Snabbyness", which I think is important at this stage of the project's evolution. So I hope that the max-next -> next -> master workflow will mean changes moving forward more smoothly than last year. However, there is a learning curve now to adjust to the new workflow of dealing with changes in batches.

I do think that this works well once the working habits are established. In the Linux kernel community it seems like most of the time when somebody says "Please pull these 20 new features from my new-networking-features branch" the reply is "Applied, thanks." The other cases where discussion is needed are the ones where people are getting into sync on their communication styles / expectations / etc.

> I am happy to maintain a topic branch

This would be awesome!

So now we are three volunteer maintainers: @lukego, @eugeneia, @wingo (and maybe somebody else will raise their hand).

> but I don't see how my work would avoid conflicting with Max's

The basic solution I see is sharding i.e. quickly assigning each incoming PR to one maintainer who will be the immediate "upstream". (Then we also need a default upstream for PRs that don't match somebody specific: that used to be me but now it's Max.)

The kernel resolves this by having the MAINTAINERS file that divides up the source tree into subsystems with their own maintainers. Changes to a subsystem are sent to the subsystem maintainer, who is the best qualified person to evaluate and merge it, and they then send it upstream to their next hop. (The kernel seems to have around a thousand such subsystems defined.)

This could work for us too? i.e. to define an unambiguous dispatching procedure to decide who of us will be upstream for an incoming PR and then have that person review/merge it and send it onwards?
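One way such a dispatching procedure could look is a longest-prefix match on the changed paths, loosely in the spirit of the kernel's MAINTAINERS file. This Python sketch is purely illustrative: the path prefixes and the "@intel-maintainer" and "@apps-maintainer" handles are invented, and only the default upstream (Max, @eugeneia) comes from this thread.

```python
from typing import Iterable, Set

# Hypothetical subsystem table. The prefixes and the two maintainer
# handles below are invented for illustration.
SUBSYSTEMS = {
    "src/apps/intel/": "@intel-maintainer",
    "src/apps/": "@apps-maintainer",
}

# The default upstream for PRs that match nobody specific (from the thread).
DEFAULT_UPSTREAM = "@eugeneia"


def upstream_for(path: str) -> str:
    """Pick the maintainer whose prefix is the longest match for path."""
    matches = [prefix for prefix in SUBSYSTEMS if path.startswith(prefix)]
    if not matches:
        return DEFAULT_UPSTREAM
    return SUBSYSTEMS[max(matches, key=len)]


def upstreams_for_pr(changed_paths: Iterable[str]) -> Set[str]:
    """Return every maintainer a PR touching these paths would involve."""
    return {upstream_for(path) for path in changed_paths}
```

If `upstreams_for_pr` returns more than one name, the PR spans subsystems, so an unambiguous procedure would still need a tie-break rule (for example, falling back to the default upstream).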

I would be happy for my next branch to be the upstream for both max-next and wingo-branch-that-is-yet-to-be-named. I think that will keep me more than occupied and I would prefer not to be direct upstream for incoming PRs. (I will still check them out and comment on them, like everybody else in the community is welcome to, but I'd prefer for changes to pass through a subsystem maintainer hop before I merge them.)

lukego commented 8 years ago

Here is the list of all pull requests. I wonder what would be a useful way to start slicing them into separate topics?

One idea: new features, extensions to existing features, bug fixes.

I'd really like to have a "safe bug fixes only" branch being maintained. This would have extra advantages:

Could be merged directly onto master (bypassing next) to ship bugfix releases like 2016.01.1.
Could be browsed and cherry-picked by people maintaining long running branches that don't sync so closely with master e.g. LTS release of a Snabb based application.

lukego commented 8 years ago

@wingo one more point in this diatribe :-)

> I don't know how to test my changes to make sure they are correct

This is the job of CI. PRs can include unit tests, functional tests, and performance tests. These are all automatically checked for regressions on every hop from branch to branch.

wingo commented 8 years ago

Yeah I've been trying to type something into this box for a while :) Instead, some thoughts. (1) Good that we recognize the current high patch buffering as a problem. What, for you @lukego, is the right rate at which things land in next? If I maintained a topic branch I would want to merge to next as often as possible -- if I see something as good for my branch, for me it would be fine for next too. If I didn't think it was good enough for next I wouldn't land it in mine, and particularly I would want to ask @eugeneia or you for feedback on things I'm not clear on. If I took on this role for some part of Snabb there would be a lot of this in the beginning, as I cultivate my inner snabbiness :)

Put a different way: I would always want to be opening merge requests for my topic branch to -next. I guess I'd make new branches every time I want to do this? I suppose that would be OK. If I added some additional commits to wingo-next I'd just make a new branch and PR and close the old one?

wingo commented 8 years ago

One thing I like about V8 and related projects is that you explicitly list reviewers. There they have the concept of owners and you need a LGTM from an owner to land a change, but there are a number of owners for a given piece of code, and you can also ask non-owners for additional review. I would like it if in Snabb we cultivated a culture of reaching out to particular people for review; it helps grow the set of people who are able to do review. I'm not sure how to fit that in with a culture of single owners, though.

I'm open to going ahead with this branch plan but I would like to note some skepticism for there being just one path of people that any particular patch can take to get into mainline.

wingo commented 8 years ago

I think a "safe bug fixes" branch is probably orthogonal to topic branches -- though how would a safe bug fix flow? First to master, never to next? First to master, next merges in master sometimes? First to next, then to safe bug fixes via a separate merge? Unclear to me. But OK, we can figure out something :)

I am also not sure that such a conceptual distinction as "new features" vs "extensions" is a good way to partition PRs. If anything, I would want the domain expert of an area to review both a feature and any change to it.

Right now @eugeneia owns the whole tree, and that's great :) He's a great reviewer and experienced snabb hacker. I don't want to suggest taking away any of that from him! But yet, that seems to be what you are proposing @lukego, and so I don't really know how to proceed. Please excuse again my ignorance; I am much more used to models of shared ownership :)

eugeneia commented 8 years ago

I wouldn't talk about code ownership when referring to our current workflow. E.g. Luke and I currently both feel responsible for reviewing all PRs (and a lot of people help us independently), and who takes the “lead” on the feedback of a PR is generally decided on a “first come first serve” basis. E.g. if I see a PR where Luke is already involved deeply I trust his judgement and only check for maintenance related issues (documentation, ...).

Generally speaking, I like that everyone involved in Snabb Switch seems to actively do code review. There are few if any PRs where I am the sole reviewer. I would probably be helplessly overloaded if this were not the case.

From my observation we have a rather democratic structure with a flat hierarchy. I don't “own” any code, instead I usually just moderate. When I am having doubts about a patch I voice it and usually ask for other alternative opinions, so in a way we have lots of tiny polls here and there in which the people who care about a given patch find consensus. Edit: And being a designated maintainer really just means that you have to care about all changes, no? :-)

wingo commented 8 years ago

Hmm! I guess I don't know what to say. To me if I don't have either the ability to tell someone "LGTM, go ahead and merge it" or "LGTM, I will merge it", I don't have ownership -- I can have input or whatever but it's not the same and I don't get involved in the same way, and actually I step away because I don't want to give conflicting input, unless I feel strongly about the thing. The PR about the "match" app was such a case -- I realized I didn't own the thing and that by making any feedback I was just muddying the issue, even before the PR had any other comments :)

More concretely... if we were to be dividing up PRs in some way that reflects the content of the PR and not who decides to take it first / feels they would be fine reviewing it, to me that's dividing up ownership, isn't it? Dunno. I certainly wouldn't want to get into a conflict about whether my branch is or isn't an appropriate way for a patch to enter the mainline :-P

lukego commented 8 years ago

The idea is to build a robust distributed workflow, like the internet. Each of us will be like a router that takes a packet (feature branch), checks it according to our own local rules/ACLs/etc (review), and forwards it to the appropriate next-hop (pull request).

If a router is overloaded or misconfigured or misbehaving then we work out a solution and workaround. Some sites may not be connected due to lack of peering agreements (e.g. master and partial-port-to-platform-foo) but this will improve over time.

The result should be a large system that behaves well in the presence of errors of all kinds.

lukego commented 8 years ago

@wingo Could be that we can avoid the feeling of artificial ownership in some ways...

I believe that Rust have a system of randomly assigning PRs to upstream reviewers. There's also the @mentionsbots that automatically links in other reviews.

However, it could just be that the term "owner" is a misnomer here. Really we are talking about a person who "does the review process" including reaching out to people whose feedback is important, making sure there is enough (but not too much) time for comments from the community, etc. The usual open source maintenance work.

"Ownership" could also be reduced by dispatching on something other than filename e.g. fix vs. enhancement vs new feature.

lukego commented 8 years ago

@wingo @eugeneia one last thought for the day:

We could also steal an idea from the Rust community and have SnabbBot randomly assign a reviewer from a pool when a PR is submitted to the master branch. This would avoid the ownership issues and make it easy to become a reviewer. This could increase reviewer diversity, both in having more people doing reviews and also having each person reviewing more of the code base. I definitely agree with @wingo that we want to be as inclusive as possible and draw more people into the maintenance process.

If reviewers are newbies then their next hop could be to a more experienced maintainer who can help them out.

This also fits the internet/router analogy :-). It's like ECMP (Equal Cost Multi-Path) where a router has multiple equally good next-hops and it spreads the load across them.

wingo commented 8 years ago

I like this last idea. Say, a reviewer will be randomly assigned for new tickets if no one is @-tagged in the PR description. They can be assigned by @-mention or by the assignee field; either way. Of course an assignee can pass the buck to someone they feel is more appropriate. I would be happy to maintain such a branch.
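A rough Python sketch of that rule follows. It is not SnabbBot code: the pool, the function name `assign_reviewer`, and the choice to honour only the first @-mention are assumptions made for illustration.

```python
import random
import re
from typing import List, Optional

# Hypothetical reviewer pool. The handles are people from this thread,
# but treating them as "the pool" is an assumption of this sketch.
POOL: List[str] = ["@lukego", "@eugeneia", "@wingo"]

# An @-mention that is not part of a longer word (e.g. an email address).
_MENTION = re.compile(r"(?<![\w@])@[A-Za-z0-9-]+")


def assign_reviewer(description: str, author: str,
                    rng: Optional[random.Random] = None) -> str:
    """Honour the first @-mention, else pick at random from the pool."""
    mentioned = [m for m in _MENTION.findall(description) if m != author]
    if mentioned:
        return mentioned[0]
    # Never assign authors to review their own PR.
    candidates = [r for r in POOL if r != author]
    chooser = rng if rng is not None else random
    return chooser.choice(candidates)
```

Passing an explicit `random.Random(seed)` makes the assignment reproducible, which is handy when testing a bot. "Passing the buck" would simply be a later reassignment by hand.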

I really do lament taking up your time on these procedural things. Hopefully it will pay off :)

lukego commented 8 years ago

> I really do lament taking up your time on these procedural things.

@wingo Not at all! Thank you for engaging in this discussion. We need to bootstrap a network of maintainers who work well together and know what to expect of each other.

Last month there was one upstream (@lukego), now there are two (@lukego and @eugeneia), and likely there will need to be 5-10 before it is obvious to everybody how the whole system works.

> I like this last idea. Say, a reviewer will be randomly assigned for new tickets if no one is @-tagged in the PR description. They can be assigned by @-mention or by the assignee field; either way. Of course an assignee can pass the buck to someone they feel is more appropriate. I would be happy to maintain such a branch.

I like this idea too. If you want to create a branch and maintain it in this way then I think that would be cool (and would volunteer to be a reviewer :)). This should be no problem given our decentralized workflow with independent branches maintained in their own ways. The only question would be how to route changes in and out of that branch. We could start small and then grow.

The more I think about it the more I see that we are building a network like the internet.

Adding a branch is like adding a router. Making a branch feed into another branch is like running a cable between two routers. Deciding how changes should flow between branches is routing. Deciding exactly how to perform review is like deciding how to engineer one of the routers (which is a black box to most people).

> know that when merged that my code would have a particular effect or not? I especially wouldn't want the person doing the merge to have to fix things up, because they might make a mistake. I would prefer them to reject the merge in that case!

Let me paint a picture of how I see this working in the Linux world...

Suppose that I am shipping a networking product based on my own branch of the Linux kernel. I decide to optimize the Intel device driver by changing the descriptor prefetch options (like #628). I test this change with my product and ship it. I also send the change to be reviewed and merged upstream:

I send a PR to the intel-driver branch. This is reviewed by one or more people who are familiar with this specific hardware. They may find a very specific bug e.g. relevant errata that I had overlooked. With luck they accept the change into their branch and take responsibility for pushing it upstream. Now my job is done: other people (kernel subsystem maintainers) will take it from here.
The intel-driver branch is PR'd to the broader ethernet-drivers branch. The maintainer here may spot something relevant: should other branches consider making this change too e.g. for Mellanox and Broadcom NICs?
The ethernet-drivers branch is PR'd to the networking branch. The maintainer here may spot something too: maybe I am making an unsafe assumption about how much memory I can allocate for my descriptors and more checking is needed.
The networking branch is PR'd to the master (Linus) branch. Linus may spot something e.g. that the change is likely only to be beneficial on a subset of the CPU architectures that this card is available for and that more testing may be needed before globally changing the default behavior.
Linus merges the code onto his branch and then it will propagate to the rest of the world: Redhat, Debian, Ubuntu, etc.

This workflow seems reasonable to me. I see value in having a chain of people who specialize in one kind of review. I see each maintainer in the chain adding value. If I had the option to bypass this chain and push my patch directly into Linus's tree I would not choose to do that instead.

One important aspect here is that I already shipped my product long before the change landed on the master branch. I also only really cared about the feedback from the first hop i.e. the NIC expert who tells me whether I have overlooked some crucial detail. Ideally I want that interaction to be fast. Beyond that I am interested in upstream for the purpose of "eventual consistency" i.e. to minimize the changes that I have to maintain on my long-running product branch.

If I were actually dependent on the feature to be merged by Linus before my customers could use it then I would be pretty frustrated -- but that is not the model. (Other people do have this problem though e.g. NIC hardware vendors who have to push drivers upstream into Linux before the hardware is debugged so that users will have a driver installed when they purchase the cards. I don't envy them.)

Hey, on reflection, we could talk about this much more concretely in terms of the branch that you Igalians are already maintaining i.e. lwaftr. This branch has a well-defined purpose (developing the snabb-lwaftr program) and a smooth workflow where PRs are being created, reviewed, merged. This branch does effectively have ownership of that application i.e. this is the branch you would send a PR to extend snabb-lwaftr and this is the only branch that people would want to pull snabb-lwaftr from. The only thing missing is connectivity: there is no uplink to send changes from this branch out into the world and towards master.

How about "connecting this up to the network" and sending Pull Requests to some suitable place like next or max-next?

wingo commented 8 years ago

A clarification! You note in https://github.com/lukego/blog/issues/14 that V8 does not do merge commits: it always rebases. Same for Firefox and Safari, FWIW. When you go to land a patch, the bot that handles it will rebase onto the tip-of-tree. In https://github.com/SnabbCo/snabbswitch/issues/725#issuecomment-174184755 I expressed some concern about testing and you rightly said that CI is important here. However there are some problems which you can only detect after the fact: increased flakiness after commit X, a performance decrease after commit Y that needed manual inspection or which only occurs on platform Z, etc.

Adopting a merge-based development pattern doesn't necessarily tank this metrics-based post-commit QA approach, but it can be really negative if the branches buffer lots of commits. If I see that before merge X, things were 10% better in some way, but merge X pulled in some important patches, what do we do? Following metrics across multiple branch heads is hard on the brain, so I assume we'd only have good metrics on master. We wouldn't know before the merge that the merge had bad perf news.... Dunno. We can make it work :) But it sure would be nice if in Snabb we would be able to say "Commit XYZ slowed us down / introduced flakiness / etc. Please back it out."

lukego commented 8 years ago

@wingo Yes, we will need to deal with all of these things. I think that we should pay close attention to how the Linux kernel handles them because that is the model for the Snabb git workflow. This will be a continuous process as the Snabb universe expands and our maintenance processes need to keep up with expanding activity across more products/branches.

Please also remember that it is always frustrating to switch between tools and to find that the way you are used to solving a problem doesn't work anymore. The first impression of everybody adopting a new workflow is going to be "well, this sucks" and this in itself doesn't tell us much about how well it will work over the long term :).

(Same way I am now cursing Nix/NixOS every time I struggle to do something that I usually do with apt-get or ./configure && make install but now requires a different approach.)

wingo commented 8 years ago

Yes, I didn't mean to be too negative :) However I think we probably need to look beyond the kernel when it comes to post-commit monitoring of the performance or stability impacts of changes. My impression is that the kernel has pretty good patch review but terrible QA.

lukego commented 8 years ago

@wingo commenting on the specific example you mention: git revert is useful here. This backs out a change by applying an inverse patch. The merge commits can actually help here: when reverting you can decide on the granularity of the revert (just one commit? or the whole feature-branch merge that introduced it into the code? or one of the parent merges that propagated the change?)

This is messy to be sure but it is doable.

For example here is a case where I temporarily reverted a premature upgrade of pflua that revealed a bug in LuaJIT: https://github.com/lukego/snabbswitch/commit/9f5f1ad5cd9d45c9676ea7405f8c2eeaecb0ba36. Once LuaJIT fixed the bug and we pulled that into our repo I restored the pflua upgrade by reverting the revert: e7c8d6d1baccc1be7f6966e98d0b1a3f16e683c9.
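To illustrate the choice of granularity, here is a small Python sketch that only builds the git command line; it does not run anything. The helper name `revert_command` is invented. `git revert` takes `-m <parent-number>` when the commit being reverted is a merge, and `--no-edit` keeps the default commit message.

```python
from typing import List


def revert_command(commit: str, is_merge: bool = False) -> List[str]:
    """Build the git argv that backs out one commit or one whole merge."""
    argv = ["git", "revert", "--no-edit"]
    if is_merge:
        # -m 1 names the first parent as the mainline, so everything the
        # merge brought in from the other branch is undone in one step.
        argv += ["-m", "1"]
    return argv + [commit]
```

Inside a clone this could be executed with `subprocess.run(revert_command(sha), check=True)`. Note that after a merge has been reverted, merging the same branch again does not bring the reverted changes back, so restoring them means reverting the revert, as in the pflua example above.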

lukego commented 8 years ago

@wingo QA and release engineering is a really interesting open topic btw.

I have a couple of different models in my head for how it could work:

Central: Snabb Switch is distributed as one snabb binary that includes all applications: packetblaster, lwaftr, nfv, alx, etc. These are all released at the same time e.g. monthly. The developers of these applications work very closely together to ensure that a new release never ships with changes that break one of the applications.

Distributed: Snabb Switch is distributed as separate binaries (snabb-nfv, snabb-lwaftr, snabb-alx, etc) that are each built from their own branch and according to their own QA processes and release schedules. The master branch exists to help application developers cooperate i.e. to synchronize their source trees and run standardized tests.

The second option seems appealing to me i.e. that each application developer is responsible for the QA and release schedule of their software, and has veto of every change that they will ship, but that we all benefit from keeping development focused on the master branch because 90% of our interests are in common. In the simplest case there would be no changes on the application branch compared with master but the owner would still decide when to make a release, what to call the release, and what QA to do beyond the upstream CI.

This workflow could also give application developers flexibility e.g. if Alex wanted to ship an ARM port of his VPLS application before other people were comfortable with merging that architecture support onto their own branches.
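A rough sketch of how the distributed model might look in git terms, under stated assumptions: the branch name `lwaftr-release` and the tag `lwaftr-v1.0` are hypothetical, and the repository is a throwaway one. The application owner merges master when they choose to and tags the release themselves.

```shell
#!/bin/sh
# Sketch only: an application branch that tracks master and releases
# on its own schedule. Branch and tag names are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.email "demo@example.invalid"
git config user.name "Demo"

echo "core v1" > core.txt
git add core.txt
git commit -q -m "core: initial"

# The application owner starts a release branch from master.
git checkout -q -b lwaftr-release

# Meanwhile, shared development continues on master.
git checkout -q master
echo "core v2" > core.txt
git commit -q -am "core: improvement"

# When the owner is ready, they pull master in and cut a release.
git checkout -q lwaftr-release
git merge -q --no-ff -m "Merge master into lwaftr-release" master
git tag lwaftr-v1.0

# Simplest case: the release branch carries no changes beyond master.
git diff --quiet master lwaftr-release
```

An owner who wants to ship something master has not accepted yet, such as a port to another architecture, would commit it on the release branch before tagging. The final `git diff` would then report a difference.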

I am not sure how this plays out in the kernel world. I have the feeling that people don't generally consider Linus's releases to be ready for installation. I assume that people shipping products downstream (e.g. Google with Android) are doing additional QA and engaging with upstream on any problems that they find. (I am out of touch... I remember that in the old days Red Hat maintained their own kernel tree and often rejected changes that Linus had accepted, but maybe nowadays they are so involved in upstream that they just ship that. I'll have to have lunch with a kernel hacker some time soon to ask about how this all works nowadays.)

lukego commented 8 years ago

@wingo Tangentially...

I have found it fascinating to participate in the OpenStack upstream community for a couple of years. They use a very high-tech centralized workflow, based on Gerrit and so on, and this has become a train-wreck at scale.

The main problem I have seen in the OpenStack world is that there are many different special interest groups inside the community who are all standing on each other's toes. There is also a lot of politics (literally elections for each subsystem) and the outcome of this politicking decides who is able to ship a product to their customers.

For example, in the time I was working upstream I faced open hostility from the gatekeepers because I was associated with an unpopular interest group (telco) who were being pushed out by the people in charge (enterprise). If I had any problem the answer was always "my god, you telco people, you just don't understand open source, I already wasted the whole of last week talking with big telco vendors".

The general situation was that telco-motivated changes could not be merged anywhere because they were not welcome on the master branch and... no other branches exist because it is a centralized workflow. You can't even git merge the feature branches because Gerrit requires them to be constantly rebased against master while they sit in review-limbo. So there was no meaningful cooperation between any of the people with a mutual interest in working on telco because the workflow does not allow them to collectively drift out of sync for a year or two to develop their ideas and then rejoin the mainline.

I believe there was a similar situation in the Linux world with Android. The Android people put a lot of resources into development but they were very firmly focused on getting a product to market to compete with iOS. They made a bunch of changes that the upstream project refused to merge. However, it all worked out: they drifted out of sync, shipped a competitive product, and then once they were successful they put in the work to reconcile their changes with upstream and make everybody happy.

There is a related situation in the Snabb world now on a small scale. People are sending portability patches and I am declining to upstream them onto the master branch. However, I see the workflow supporting this in the same style as with Android: people who want portability can develop that on a branch, doing whatever makes sense for their immediate objectives, and then after the port is working well we can figure out how to sync it with master and other branches. This way nobody is blocking anybody else and we are all operating asynchronously.