Discussion: revisit branching and continuous delivery workflow #1517

Closed jmcarp closed 8 years ago

Our repos currently use the git-flow branching workflow, which we integrate with continuous delivery. Successful builds on the develop branch are automatically deployed to the development server, on the master branch to staging, and tagged commits to production. We initially settled on this workflow because it's familiar to many developers and includes good tooling, but it's becoming increasingly clear that the model is confusing for people on both the 18F and FEC teams. I'd like to discuss alternative branching and deployment workflows, with the goal of finding a pattern that makes sense to everyone on the team. Specifically, I'm not convinced that the complexity of git-flow is justified for our applications.

As specific alternatives, I'd like to consider GitHub Flow and GitLab Flow, both of which are simpler than git-flow but might suit our needs at least as well. Both of these workflows are described in this article, which I recommend at least skimming if you want to weigh in: http://doc.gitlab.com/ee/workflow/gitlab_flow.html.

cc @noahmanger @LindsayYoung @ccostino. And, now that we have the author of Continuous Delivery on the 18F team, flagging @jezhumble in case you're interested and want to advise.

Hi! Since you asked, here's what I think, although with the caveat that it's based on git-flow as canonically defined, and I don't know how your team has implemented it; practice > theory.

I don't consider git-flow to be continuous integration, let alone continuous delivery, for the simple reason that continuous integration is defined as all developers merging into master on at least a daily basis. Indeed, allowing work to remain on feature branches unmerged into master for any length of time is pretty much the opposite of "flow". The amount of time you are going to spend engaged in re-work to deal with merge issues, and generally futzing with version control, will increase non-linearly with both the number of branches and the length of time they remain unmerged. Git-flow is basically the old-school ClearCase workflow but with much better tooling (git) which take it from utterly unmanageable to bearable (but far from ideal).

If you're working on a project with community contributions, there are good reasons to deviate from this, in which case I think the GitHub model is OK, although I still think feature branches should be kept short (less than a day's work), because when there is a long diff between a feature branch and master it becomes hard to reason about its impact, and this slows down code review (IMHO code reviews should happen in-process, not asynchronously). I have not checked out GitLab flow.

I am a big fan of keeping your version control process as simple as possible, and merging onto master as frequently as possible, for the simple reason that version control is, first and foremost, a communication mechanism for the team. Keeping work off master inhibits this communication. While GitHub provides mechanisms which somewhat alleviate the communication overhead generated by feature branches, it doesn't remove them, but (as I've mentioned) it is a good choice when you have community contributions for your codebase. If you're using a private repo, the optimal solution is for everyone to work off master.

Just my 2c :-)

Also, as a note, I don't want to offend anyone - I will say I admire any team that makes git-flow work for them, in the same way that I admire any team that make extensive use of operator overloading and multiple inheritance in C++, although I would be a bit scared about being on those teams.

Thanks for your thoughts @jezhumble! For clarity (I hope), maybe I should replace "continuous delivery" with something more generic, like "automated deployment policy". My goal here is to figure out a workflow for deploying code that automates as much as possible and provides transparency to the teams at 18F and FEC about what code lives where. Because FEC often needs to review a feature carefully over days or weeks, I'm not sure that GitHub-style multiple deploys to master per day will be acceptable here--as far as I can tell, it's going to be hard to get away from the concept of a release. We should be able to deploy more often if we used feature flags to disable features that haven't been fully vetted, but we don't have feature flags yet and would need to implement them across a few repos for this to be feasible. As an aside, I would guess that many client-facing projects at 18F are in the same boat.

I'd be happy to talk about this here, or to set up a (short) synchronous conversation. Let me know what you all prefer.

I am fine with continuing the discussion here and/or having a synchronous discussion! I will say that I wholeheartedly agree with you @jezhumble about keeping your version control workflow as simple as possible* (see my note below), and your desire to automate as much as possible @jmcarp.

Given the constraints of the project and the need for reviews/oversight, perhaps we can start with the simplest/ideal path and work in the extra bits we need until we account for all of the needs of the project and stakeholders while keeping things as streamlined as possible. That should be a lot easier than starting from the other side and whittling things down.

Additional notes and background:

We had a very convoluted process at a previous job of mine that was a combination of git-flow worked on top of the pre-existing steps based on CVS for actual version control paired with a very waterfall-heavy approach to work being shepherded through to eventual releases (this included hot-fix branches on top of integration and QA branches). This was borne out of shoehorning things on top of a very old process used for a legacy system being managed by stakeholders that were unwilling to change the flow of development (nor willing to give control of development to the actual development team). The details of this experience (and the initial move to Git, which failed spectacularly and ended with us cherry-picking an entire release's worth of work across two branches that were derived from separate trunks) are best shared over everyone's beverage(s) of choice.

At another job, we had used a workflow similar to one of the ones described in the GitLab-flow write-up, where we had 3 main branches to deal with: develop (development environment), staging (staging/QA environment), and master (production environment). Our actual work was always done in feature or bug branches that were then submitted into Gerrit first for code review. Anything that passed review would be merged into develop, which would eventually make its way into staging for release prep via a fast-forward merge (we worked in a fashion where we never had separate merge-only commits). Anything found wrong in staging by our stakeholders would first make its way back into develop via the same development process as initial feature development. Once everyone was comfortable with the state of staging, staging would be fast-forward merged into master and we'd do a blue/green swap of our staging and production environments. We had some semblance of CB (continuous building) with Jenkins that would trigger builds in isolation related to anything pushed to Gerrit, but as mentioned above it wasn't full CI; our production deploys were still a manual process of bringing master up-to-date and initiating the switch (which was thankfully handled with scripts that took care of most everything once the prep was done).

On additional note related to my last note above and your mention of feature flags, @jmcarp, was that we had also discussed the notion of feature flags, but we never got around to implementing them completely either; the nature of accounting for both server-side (mostly API-related) components and client-side components (it was a Backbone app) made this problem inherently complicated. However, I think we did work out something to the point where the API itself was returning the site configuration as a part of the app initialization, which included feature flags that the client-side app would then utilize to enable/disable features. I believe we only had this working with one or two features though, certainly not the entire collection of them.

I'll preface this by saying that I feel like a total n00b at this stuff, so my understanding of the relative implications of everything might not be great. Also this ended up being a way bigger brain dump than I expected, but it helped me process through it. Ok. </disclaimer>.

So I've been asking around on this and doing some thinking, and I think the main challenge we have is that we have features in development for drastically different lengths of time, which raises two to separate-but-related problems:

Problem 1: How does the `develop` branch function?

As long as we're always trying to have develop be in a state where it can be promoted to a release that can then be merged into master, we're never going to want to merge unfinished work into it. This means that feature branches stay unmerged for weeks, which leads to both potential merge conflicts, as well as confusion from trying to manage the deployment of multiple feature branches (more on that below).

Potential solutions

I could see how introducing feature flags could help with this (e.g. we could have set a feature flag for the calendar that would have allowed us to merge in unfinished work without worrying about it being deployed prematurely). But I gather the downside here is that these can be tough to manage and implement for features that are less self-contained.

I could also see how a different branching model could help. For example, maybe we use the GitLab flow and always branch off master, and then just use feature branches on our dev server. The advantage seems to be that we have one fewer branch to worry about, but maybe the disadvantage that then what's on production won't always exactly reflect the current state of master, which seems like it could get confusing. Though maybe this is a place where version tags would come in handy?

Problem 2: Need multiple feature branches visible simultaneously

I think the biggest snag we've been running into is that because we're keeping work in feature branches for extended periods, it's hard for view work on a public URL for stakeholder review or usability testing. Because I don't think we'll be able to speed up review processes, I think we can expect this need to remain.

Potential solutions

Maybe solving the first problem solves this one automatically. But maybe not. In that case, I think we'd want to figure out a system where we can easily spin up multiple dev instances for a given feature branch. Not every feature branch would need this; maybe just any that will require stakeholder feedback or usability testing.

In the case of front-end feature branches, we'd want to point to a specific instance of the API and version of the style repo.

This could get unwieldy, but I think we could have a simple living document or integration with Slackbot that would give us an easy way to check which feature branch is deployed to which URL.

A few practical constraints. Setting up multiple development instances would require that either (a) all development instances share the same RDS instance, which will break things if competing instances do conflicting things to the database, or (b) each instance gets its own RDS, which would require some work to automate and take time to spin up. Also, we're dealing with a bunch of separate applications that have to work together (API, task queue, front-end, CMS, proxy, plus styles)--would we spin up a batch of all relevant applications for each development instance? Even the idea of "front-end branches" might break down quickly--many of our features tie together pull requests from two or three repos.

Regardless of where the git workflow winds up, a simpler solution might be to stick with three deployment environments and turn off automated deployments to the development environment. Since the current development environment isn't tied to a particular branch (it's theoretically linked to develop, but practically linked to whatever we want to show at any given time), automation seems to hurt more than it helps.

I think the real issue here is breaking our features into smaller pieces, that can be reviewed and merged more quickly. Looking back, it would have been nice to have a calendar MVP that did not have downloads and filterable events and then, we can add those features in the next releases. Getting them merged as quickly as possible decreases the chance of merge conflicts and having branches that can't be shown at the same time.

If we need two things deployed, we can use dev and staging. I am not opposed to turning off the auto deploys if we thing that would help. But I don't think we should add more sites, having even more moving parts and more places to look. It think it would eventually get messy and cause the confusion it was meant to avoid.

I am not a huge git flow fan but I am finally in the habit of git flow. I would only want to change to something that was super easy and intuitive, or had some great benefit to us. I am a little skeptical that other systems would be better for us rather than just having a different set of trade offs.

Also, it might be good to balance testing things at different stages of development. I feel like we test almost all things when they are first being developed, but have less of a chance to iterate on testing older features or higher level flow of the site. Perhaps we can do some testing earlier with light-weight things like sketches etc. to do gut checking and then test it again later when it is more mature. I also am not in most of the testing and this is not my expertise, so feel free to tell me I am off base on this.

I think everyone can agree that merging sooner is better, but I don't think development time has been the main impediment to merging so far. For example, it took about two months to close the calendar pull request on fec-cms, but implementing downloads only took a few hours. My guess is that much of the delay on merging comes from waiting for usability testing (which happens weekly), or for client feedback.

Ah I see. Thanks for explaining. I agree that smaller features / PRs are better, and that dev time hasn't been the main constraint. (Also +1 to increasing usability testing at different stages and expanding the type of things we test, but that's a conversation for another day :) )

Smaller PRs will help get the code merged in faster, but as long as we have a long review process for the overall set of features and don't want to expose it to the public until that's been resolved, I think we'd still run into the same issues. So the next time we have a major feature like the calendar — a feature that for whatever reason we don't want to release until it has several features completed — I'd be interested in experimenting with feature flags so that we can get that code merged in faster but not worry about exposing it prematurely. (Again, not my expertise, so let me know if I'm off-base).

At the very least, for now it sounds to me like turning off automatic deploys to develop is a good place to start.

Feature flags are easier to implement than you think. For example, my team implemented an admin mode for our product that's only accessible to team members and trusted testers. New features are only visible in admin mode. Once we're confident that a new feature works, we remove the code that hides it from non-admins. Our releases go more smoothly now that they don't have to wait for half-working features to be fixed.

Thanks for the input, @evankroske .

Just talked with everyone on call, and here's what we want to try going forward:

When beginning work on a new feature: Be more deliberate in breaking it down to tiniest pieces and evaluating which parts will need data and content review that could take a while. Strategize which features will be included in which release, with an eye towards keeping releases as small as possible.
When working on a feature: Merge into develop as often as possible, which means that additions that may pop up in code review should be handled separately. Code reviews should catch things that are wrong. Partners and other reviewers can be consulted during this phase.
Develop branch -> Develop environment: As much as possible, we're going to try to keep the develop instance just using the develop branch. If the circumstances require deploying a feature branch to the dev server, we can do that, but we're going to try to avoid it.
Release branches -> Staging environment: We're going to revive our staging instance as the place where thorough partner and QA review happens. As issues are found on staging, we'll determine on a case-by-case basis if they should hold up the whole release or not. If they're not too complicated and don't take too much time, we'll just fix them. If they're complicated, but are easily decoupled from the rest, we can use a feature flag or some other method to disable the feature so we can continue to work on it while still releasing the rest of the features.

Thanks again for thinking through this. Let's check in after our next release to see how this goes.

Am I missing anything?

This looks good, @noahmanger, thank you for capturing all of this!

@evankroske, thanks for chiming in regarding the feature flags as well! If possible, could you please elaborate a bit more on how you and your team have accomplished implementing them with an admin mode? One of the challenges we face (and I have faced elsewhere as well) is that our code is split across multiple repositories comprised of different components of the system: the underlying API; most of our CSS and JS separated out from everything else; and the client-facing web site. There is also the CMS side of the client-facing web site as well.

In theory these components wouldn't be tightly coupled, but the reality is that they are, which means that a lot of the things we work on requires changes to happen in tandem across most or all of these repos. These changes may also have an impact on existing functionality in the site; not all features are independent of one-another and therefore they cannot be enabled/disabled in isolation. I'm curious if you and/or your team have experienced this as well and if so, if you can speak to how you've accounted for this complexity.

Ideally we'd love to be able to do what you describe and have our features be independent of each other to make it simple to enable/disable them. However, there are other constraints that we must account for that make it difficult to put into practice.

If I were applying my team's approach to the openFEC web app, here's what my implementation plan would look like:

Add a login system. Django might help with this.
Add a parameter to every page that enables admin mode. If an anonymous user tries to enable admin mode, give them a 403.
Hide all incomplete features unless admin mode is enabled. In JavaScript, you can check window.location. In Python, you can check the parameter from the request.

Every incomplete feature is visible in admin mode. With this approach, you can't show features to a custom subset of users or perform an A/B test, but it's easy to implement and useful on its own. Implementing this solution across the web app and the API would be more complex, but I think it would be valuable even if it were only implemented in the web app.

Sorry, I meant to say that Flask might help with the login system. I know that this project doesn't use Django.

Awesome, thank you for sharing your thoughts and insight, @evankroske! I like the light-weight and straightforward approach to this (though we may still face challenges with some feature interdependencies in this particular project). We'll see what we can do moving forward!

And no worries about the Flask/Django references, totally understood. :-)

Excellent. The more I think about this the easier I think it may end up being to do, as generally the concern with releasing a new page or major feature is in about making a new interface element visible to the public, which seems like it could be controlled entirely via frontend feature flags.

I think we have a good set of actionable steps to take after this discussion, so I'm going to close it out. If anything pops up and we want to revisit, we can reopen in the future.

fecgov / openFEC