Branching and release strategy

micahalcorn commented 6 years ago

The Problem(s)

We have been loosely following the Gitflow branching model, but we have also been merging feature branches into develop without any concern for what will or will not be included in the next release. This often results in broken integrations and/or unfinished features living in develop that are not ready for public consumption - particularly in the case of the DApp. The consequence is that, when we create a release branch (or prepare to deploy from develop), we have to carefully select points in each repo's commit history where everything works together and undo or hide some DApp code.

Since our repos don't default to develop, our project looks stale to anyone who glances at our GitHub code without digging into our branches and issues. This is a regular occurrence. If someone wants to contribute, they need to read the contributing process deeply buried in our docs to know that they should branch from develop, which is more critical than pointing a PR to develop (since we can more easily change their target than their base). This has been an ongoing source of frustration for new contributors and those of us who want to enable them.

Our Objectives

Here is a list of concerns that I'm aware of. I've listed them in descending order of priority from my perspective.

1. Make it simple and intuitive for partners and developers to run a working version of our project.
- In the past, this was the rationale for maintaining stable master branches separate from develop. But isn't this now accomplished (even better) by Origin Box?
2. Show that we are actively working and shipping code.
- This would be accomplished by setting the default branch for each repo to the one that most pull requests point to (whatever it happens to be named).
3. Easily create deployable releases in a matter of minutes or a couple of hours using an established and predictable process.
- We either need to be disciplined about planning release branches and only merging in code that will be included in the next release - à la Gitflow - or we need a full suite of integration tests ensuring that the default branch is always stable.
4. Have a seamless, low-friction process for contributing.
- This likely becomes a no-brainer if the default branch is the target branch and it is "current". An exception might be if a contributor wants to work on a feature that is in active development but has not yet been merged into the default branch.
5. Avoid merge conflicts.
- I tend to believe that merge conflicts are generally unavoidable, especially for a project that is as active as Origin. Increased communication and collaboration can reduce the scale of the conflicts. We shouldn't have to sacrifice 1-4 in the name of avoiding the unavoidable.
6. Discourage long-running feature branches.
- Smaller, more frequent commits can reduce merge conflicts and help facilitate communication. But this can be true within a feature branch and it doesn't have to apply to the default branch. We're still very much experimenting with large features that may or may not ever get released. We should collaborate and frequently commit (and push), but not necessarily to the branch that is subject to a release at any time. Token-related development is the prime case for this.
7. Follow open-source conventions.
- For would-be contributors who are accustomed to a standard workflow used in other projects (Gitflow, GitHub Flow, GitLab Flow, etc.), it would be nice for them to ease right into a process that they are familiar with.

Possible Solution(s)

I don't know that we can perfectly check all the boxes, but here's a thought:

Continue to follow the convention of using master as the default branch.
Abandon develop and start merging to master.
Create tags for each release.
Upon each release, update origin-box to use the last stable tag.
Point all newcomers to origin-box.
Either only merge into master what will be included in the next release or build a full suite of integration tests. We can merge PRs into feature branches along the way, and they shouldn't live too long if we are doing regular releases every two weeks. There are plenty of other reasons to add integration tests too.
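
As a rough illustration of "update origin-box to use the last stable tag", here is a minimal Python sketch that picks the newest release tag from a list of tag names. The `vMAJOR.MINOR.PATCH` tag format and the function name are assumptions made for the example; this thread does not specify a tag naming scheme.

```python
import re
from typing import Iterable, Optional

# Assumed tag format: v0.1.0, v0.2.1, ... (hypothetical; not specified in this thread).
TAG_PATTERN = re.compile(r"^v(\d+)\.(\d+)\.(\d+)$")


def latest_stable_tag(tags: Iterable[str]) -> Optional[str]:
    """Return the highest vX.Y.Z tag, ignoring anything that doesn't match."""
    best = None
    best_key = None
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if match is None:
            continue  # skip release candidates, ad hoc tags, typos, etc.
        # Compare numerically so that v0.10.0 sorts above v0.2.3.
        key = tuple(int(part) for part in match.groups())
        if best_key is None or key > best_key:
            best, best_key = tag, key
    return best
```

The tag names could come from `git tag --list` in each repo; comparing the parsed numbers rather than the raw strings avoids the usual pitfall of `v0.10.0` sorting below `v0.2.3`.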

Please feel free to weigh in @cuongdo @franckc @crazybuster @tyleryasaka @wanderingstan @joshfraser @matthewliu (and others).

DanielVF commented 6 years ago

I'm in extremely strong agreement that we should use the "master" branch as our test-passing, latest-work branch instead of our current "develop".

wanderingstan commented 6 years ago

This is the part that gives me pause, re doing releases:

• Either only merge into master what will be included in the next release or build a full suite of integration tests.

Building a full suite of integration tests is a must, but our system is so complex with multiple interacting repos that (I think) it will be a serious project to get this built out. This will take time. In the meantime, it's pretty shaky to rely on contributors to know which commits "will be included in the next release." So we could be left in the same position of having to cherry-pick commits to be included in a release.

wanderingstan commented 6 years ago

Re Docker: While I agree origin-box is the way to get people started, we should maintain a good effort to make things runnable without it.

I hate that Docker now requires a user account, and that we are thus implicitly saying, "If you want to code on Origin, you need to give trackable personal information to the California company, Docker, Inc."

micahalcorn commented 6 years ago

@wanderingstan I agree that the merge solution is a bit shaky. While it doesn't necessarily resolve your concern, a point of clarification is that random contributors don't have to (but they're welcome to) know what will be included in the next release. Only admins who merge the code need to be cognizant. And the larger PRs - the ones that are likely to cause problems or be deferred until the next release - are more likely to come from core/extended team members who should be aware of when their features will be released. Also, one might make the case that we should allow any code to be merged into the default branch at any time if it doesn't break the integration (aka is compatible with the last stable, released version of the other repos). That's actually our current intent, but it's difficult to do without integration tests.

Regarding the Docker concern, I actually think this branching/release model is still consistent with that effort. We would now be explicitly claiming that the tagged versions should work with one another. Whether someone wants the convenience of the box with Docker or they want to roll their own multiple processes, they would use the same tagged/LTS commits in each. I think it's reasonable to ask privacy-sensitive users to either deal with potential instability or check out the right tags.

franckc commented 6 years ago

A few thoughts on top of @micahalcorn's proposal:

It could be helpful to not think in terms of releases that include a set of features, but rather favor a cadence model where every N weeks we deploy what is in master. If a feature is not ready, it has to wait for the next cycle to get deployed.
Code must not be landed in master if it breaks any existing functionality or test.
Having a "gating" framework to wire a feature on or off, and also to expose it to only a limited set of internal users or a percentage of external users, would be super helpful. Multiple benefits: a developer can land code and wire it off if it is not yet ready for prime time (e.g. the UI may not be polished) - then it's ok if that code gets deployed; we can test a feature internally before enabling it externally; we can gradually ramp up exposure of a feature to external users. Now, things are complicated by the fact that Origin DApp is decentralized - ideally we would not use a central server for all that gating logic. Maybe we could think of ways to host the gating data in IPFS and have a gating library in origin-js that lets callers check the state of a particular gate? Seems like a discussion we could have in a separate GitHub issue.
There will be times when we need to urgently deploy a hotfix. We should define what this process looks like - maybe we create a "hotfix" branch as needed, based on the tag of the currently deployed code, commit the fix in that branch and deploy that branch.
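
To make the gating idea above a bit more concrete, here is a minimal Python sketch of a gate check with an on/off switch, an internal allow-list, and a percentage ramp for external users. Everything in it is hypothetical: the `Gate` fields, the function names, and the use of a hashed user identifier for bucketing are assumptions for illustration, not an existing origin-js API. Where the gate data lives (for example a document hosted on IPFS) is left out.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Gate:
    """One feature gate. All field names are hypothetical."""
    name: str
    enabled: bool = False                                   # master switch: wired on or off
    internal_users: Set[str] = field(default_factory=set)   # always exposed while enabled
    rollout_percent: int = 0                                # share of external users, 0-100


def bucket(gate_name: str, user_id: str) -> int:
    """Map (gate, user) to a stable bucket in 0..99 so exposure doesn't flicker."""
    digest = hashlib.sha256(f"{gate_name}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(gate: Gate, user_id: str) -> bool:
    """Decide whether `user_id` should see the gated feature."""
    if not gate.enabled:
        return False          # wired off: nobody sees it, even if the code is deployed
    if user_id in gate.internal_users:
        return True           # internal testers see it before external users
    return bucket(gate.name, user_id) < gate.rollout_percent
```

Hashing the gate name together with the user identifier keeps each user's exposure stable across sessions without a central server tracking who is in which group, which seems to fit the decentralized constraint mentioned above.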

nick commented 6 years ago

One idea that may be worth considering is moving to a monorepo architecture, as used by React, Babel, Meteor and various other projects. I know that Google, Facebook and Twitter also use monorepos internally. Babel describes some of the pros and cons:

Pros:

Single lint, build, test and release process.
Easy to coordinate changes across modules.
Single place to report issues.
Easier to setup a development environment.
Tests across modules are run together which finds bugs that touch multiple modules easier.

Cons:

Codebase looks more intimidating.
Repo is bigger in size.
Can't npm install modules directly from GitHub

For our purposes, this would allow us to submit pull requests that touch multiple modules at the same time while keeping all tests passing. I believe it would allow us to hit all of the objectives outlined by Micah, switch to using the master branch and ease onboarding for new developers (no more npm link).

nick commented 6 years ago

To be clear, I'm not suggesting we have one single repo for everything... but we could probably bring origin-js, origin-dapp, origin-bridge and origin-box into a single repo. The npm release of origin-js would still be a standalone module.

nick commented 6 years ago

It turns out that Truffle have also recently moved to a Lerna monorepo architecture. I'm not sure we need Lerna as it's geared towards releasing multiple JS packages whereas we have just one (for now) - but interesting nonetheless.

micahalcorn commented 6 years ago

Thanks, @nick, we might actually have multiple JS packages sooner rather than later. Messaging could naturally be its own module, some JS code has already crept into the bridge repo, and there's more on the way. I suspect that we want there to be a clear delineation so that, for example, people understand that they don't have to use our bridge server. But there might be a strong case that multiple repos isn't necessary (or practical) long-term.

DanielVF commented 6 years ago

Given that most new features cut across repos, I'm a big fan of merging origin-js, origin-bridge, origin-dapp, and origin-box into a monorepo. This will make feature development so much easier.

tomlinton commented 6 years ago

I'm keen on the monorepo idea too. I think it'll be quite hard for origin-box to always produce a working development environment for contributors with the compatibility between the three different develop branches getting out of sync. There are enough dependencies between the repos to make it a worthwhile change.

joshfraser commented 6 years ago

A monorepo would certainly make things simpler for the core team, but I'm not sure it would make things easier for the thousands of developers we hope will someday start building on top of our platform.

nick commented 6 years ago

I don't think it'd have an impact on developers building on our system as we'd still be releasing a standalone OriginJS module to npm as we do today, which will hopefully be the self-contained "jQuery" like library that anyone can drop into their project and get a nice API for a blockchain backed marketplace. That piece will remain the same for developers: use npm, download the standalone zip, or include it directly via CDN (or IPFS). We can also do standalone releases for the bridge server and demo-dapp (and other modules?).

For developers who want the bleeding edge or want to work on contracts or OriginJS itself, development could be eased significantly by a monorepo:

Master branch guaranteed to work across all sub-modules
Integration test suite for testing all sub-modules work together
Single repo to clone and report issues to
Master Readme explaining how to get started and how modules relate to each other
Helper scripts to ease cross module interactions
No more npm link confusion or errors
Lower barriers for new developers to make contributions
Single step setup: 'npm install && npm start'
Potential to break up origin-js into sub-libraries (for example messaging, or IPFS helpers)
Single repo to 'star' or fork (we could potentially rename origin-dapp to just 'origin' and get a free 268 stars)
Seemingly much more active repo as we work from the master branch.

cuongdo commented 6 years ago

Apologies for my late thoughts. I've been very heads down for the last few days.

Agree with @nick here. JS developers would use npm. Python developers (when we have support for them) would use pip. There exists a package manager for almost every language in use with blockchain technologies.

I like the monorepo idea. It'd make things simpler for more than just the core team.

Regarding the branching & release strategy, what we did at my previous job (creating an open-source database) worked fine for relatively infrequent releases. I haven't fully thought through what it means for more frequent releases, but I'll put it here for reference:

master always contains working code and is the default branch people see
when we're ready to test and bug fix for a release, create a release branch
at this point, all PRs that go into master must be cherry picked into the release branch if they're truly needed
once the release branch looks healthy, we release
for the period of relative instability after the release, it's preferable to avoid merging major refactors into master to minimize merge conflicts when cherry picking into the release branch
over time, the prior release branches will become dormant
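
The release-branch steps above could be sketched roughly as follows in Python. The helper only builds the git command lines; the `release-` branch prefix, the function name, and the example commit hashes are hypothetical, and the commands themselves (`git checkout -b`, `git cherry-pick -x`) are standard git.

```python
from typing import List


def release_branch_commands(version: str, fixes: List[str]) -> List[List[str]]:
    """Build the git commands for cutting a release branch from master and
    cherry-picking later master commits into it.

    `version` and the `release-` prefix are hypothetical naming choices.
    """
    branch = f"release-{version}"
    commands = [
        ["git", "checkout", "master"],
        ["git", "checkout", "-b", branch],  # cut the release branch and switch to it
    ]
    for sha in fixes:
        # -x appends "(cherry picked from commit ...)" to the new commit message
        commands.append(["git", "cherry-pick", "-x", sha])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them; the hashes are placeholders.
    for command in release_branch_commands("0.2", ["abc123", "def456"]):
        print(" ".join(command))
```

Each command list could be passed to `subprocess.run(command, check=True)` to actually execute it. In practice the cherry-picks would happen over time, as fixes land on master after the branch is cut.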

One of the bigger disadvantages is that refactors are constrained to a relatively narrow window (after the previous release branch stabilizes and before bug fixing starts for the current release).

Regardless of what we pick, these points hold true:

We should pick some strategy fast
We will never finish tweaking this
master should be the default branch and buzzing with commit activity
Every branching strategy has tradeoffs. Let's optimize for what we think are going to be the common cases. In the beginning, I suspect that will be rapid bug fixes, continuous feature development, and generally being responsive to DApp partners.

nick commented 6 years ago

Here's a rough plan of action if we wanted to go ahead with a monorepo:

Rename the origin-dapp repo to just origin (so we get all the stars) OR start a new repo?
Create subdirectories for js, dapp, bridge and box (and possibly docs, playground, mobile and schemas)
Move the current develop branches of each of the above repos into those subdirectories
Create a new Readme with setup instructions
Update dependencies in each sub module
Switch to master branch
Add notices at the top of the old repos pointing to this new one

Going forward we can:

Create all new issues on the origin repo
Merge to master only when all tests in every repo pass successfully
Create stable branches when we do a major release (eg 0.1-stable) that we can merge in security or other critical updates to if needed

Other questions:

Do we use Lerna?
Do we move contracts into a top level directory?

micahalcorn commented 6 years ago

It seems that we have consensus on moving to a monorepo, but I think that should be step two. The branching and release strategy is somewhat of a separate and more pressing concern. Once we are ready to build integration testing, then we will have a real motivation to consolidate the codebases. Let's wrap up the discussion on the branching/merging procedures and change that workflow first.

Stan and I are planning a release for +/- next Wednesday (8/8). It will include the following:

John's DApp localization code
DApp translations from CrowdIn
Daniel's listing registry
Domen's Airbnb attestation service
Domen's profile errors
Yu Pan's messaging infrastructure and UI
... potentially whatever is currently in the various develop branches, including Tyler's fractional work in origin-js (with no UI)

Please let us know if there is something else that should be included or should be reverted. ⚠️

Sometime after we complete that release next week, I will remove the develops and update THE BOX to set up from masters. Then I will formally document the new workflow somewhere and disseminate it.

Probably somewhere around the end of Q3 or before then if there is a compelling reason (and some available time), we can move to a monorepo and start writing integration tests.

Any final thoughts on how we manage what gets merged into master?

sparrowDom commented 6 years ago

We had a system that worked pretty well in the last company that I worked for. The main branch everyone worked off of was master. Main rules of committing to master:

never force push master -> this means if your branch somehow diverges, check it out of master again and cherry pick commits to a fresh branch
only merge release ready things into master

Every 14 days we had a release where a staging branch would be fast-forwarded from master. The QA team would set up staging servers from that branch and run all the tests (unit / integration). What could not be automatically tested was manually tested by the QA team. If the release was solid, the production branch was fast-forwarded from staging and a release tag was made. That branch was then deployed.

Because there were no force pushes to master, the history of master, staging and production never changes, and it is easier to fall back on a previous release commit in case production is broken.
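
The fast-forward-only promotion described above can be modeled in a few lines of Python. This is a simplified sketch, not the tooling used at that company: the commit graph is a plain dictionary with one parent per commit (merge commits are ignored), and all names are made up for the example. In git itself, `git merge --ff-only` enforces the same rule.

```python
from typing import Dict, Optional

# Commit graph as child -> parent. One parent per commit, for simplicity.
Parents = Dict[str, Optional[str]]


def is_ancestor(parents: Parents, ancestor: str, commit: str) -> bool:
    """True if `ancestor` is reachable from `commit` by following parent links."""
    current: Optional[str] = commit
    while current is not None:
        if current == ancestor:
            return True
        current = parents.get(current)
    return False


def fast_forward(parents: Parents, branches: Dict[str, str], target: str, source: str) -> None:
    """Move `target` to `source`'s commit, but only if no history would be rewritten."""
    if not is_ancestor(parents, branches[target], branches[source]):
        raise ValueError(f"{target} has diverged from {source}; refusing to fast-forward")
    branches[target] = branches[source]
```

Because a branch pointer only ever moves forward along existing history, every previously released commit stays reachable, which is what makes falling back to an earlier release straightforward.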

In the case of large feature releases, we broke them down into smaller releases and deployed them across multiple sprints. For features where that was not possible, the feature branch existed as long as it was not ready to be merged into master (and ready for production).

I think if things are developed that are not ready to go into production then we shouldn't merge them into master. If multiple developers need to share a branch they can work on a feature_x branch and create extra branches out of feature_x branch and then create Pull Requests to merge those extra branches back into feature_x branch. And when work is done feature_x is then merged to master.

For all the not-production-ready features we wanted to showcase to others, sysops at my previous company set up a script that would spin up a new AWS instance and set up the whole stack on the specified commit (in a specific branch). Everyone inside the company could then access that stack and test the feature. This script was also used by the QA team when setting up staging. The fact that we used a monorepo made this process easier.

micahalcorn commented 6 years ago

After seeing what the onboarding experience is like cloning from tags rather than branches, we've decided to create stable branches to use as reference points for cloning and hotfixing.

micahalcorn commented 6 years ago

Here is what we've ended up with (basically Git Flow with different branch names):

[Image: Origin git branching model]

tyleryasaka commented 6 years ago

I'm new to the monorepo concept but I quickly skimmed google for existing opinions and projects to get up to speed.

I'm warming up to the idea of a monorepo. I do see how it could really make it easier for us to sync work across highly interdependent repos. It does seem like the pros outweigh the cons at the moment.

I will say that the open source projects that @nick mentioned as using monorepos seem, from what I can tell, to be entirely javascript-based projects. As opposed to having a server in ruby on rails and a client in react, for example.

I'll throw my support behind the monorepo move with the additional request that we eventually try to move all of our code in this repo to a single language and package manager (node and npm). So we'd need to move the bridge server from python to node (something that I think is already a great idea for a number of other reasons). And once that is done, I think lerna could be a solution worth exploring.

I think if we switch to a monorepo and then move the bridge server to node, we'll be set up nicely for a clean package structure and smooth development process.

micahalcorn commented 6 years ago

Amen to more JS 🙏

tomlinton commented 6 years ago

More JS 😢

I agree it is a good idea to have a single language - and we are quite close. Soon origin-bridge will only be responsible for attestations, and the only other Python floating around is the pinner which isn't too much code. It might even be something worth doing prior to a move to a monorepo.

franckc commented 6 years ago

Agreed that at the current stage of the company, for team velocity it makes sense to have JS as our primary language.

For the future though, I don't think mono-repo is incompatible with using multi-languages. At my previous job we used 6 languages (python, Go, Rust, JS, Java, Swift) and the codebase for those projects co-existed happily in the same mono-repo. If later on we decide to support another language in addition to JS, we can make it happen. My 2 francs :)

tyleryasaka commented 6 years ago

Hmm ok, maybe my fears aren't well founded then. I've never used a mono-repo structure so I'll defer to those with experience.

joshfraser commented 6 years ago

I believe we also have a JavaScript version of most of our attestations thanks to @nick's original implementation of that.

micahalcorn commented 6 years ago

Closing, as the develop branches have been merged and deleted 🔥 But feel free to continue this discussion here or in #30.

micahalcorn commented 6 years ago

@tomlinton and I agreed to keep a long-running staging branch in place for continuous integration to watch. This will tentatively take the place of the release branch, but the future might hold a scenario wherein we have multiple release candidate branches at one time and merge them into staging.

Chealer commented 4 years ago

Here is what we've ended up with (basically Git Flow with different branch names):

@micahalcorn I would appreciate if you could specify the main branch renames relative to Git Flow.

micahalcorn commented 4 years ago

master became stable, develop became master, and we added staging as an intermediate branch between those two. 🎋
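
Spelled out as a small Python sketch (the names are illustrative only): the `release` to `staging` entry follows the earlier note that staging tentatively takes the place of the release branch, and the promotion order is my reading of "between those two".

```python
from typing import Optional

# Git Flow role -> branch name in this project, per the comment above.
GIT_FLOW_TO_ORIGIN = {
    "develop": "master",   # day-to-day integration branch and GitHub default
    "release": "staging",  # tentatively replaces per-release branches; watched by CI
    "master": "stable",    # last released code; reference point for cloning and hotfixes
}

# Order in which code is promoted toward a release.
PROMOTION_ORDER = ["master", "staging", "stable"]


def next_branch(branch: str) -> Optional[str]:
    """Return the branch that `branch` is promoted into, or None for the last one."""
    index = PROMOTION_ORDER.index(branch)
    if index + 1 < len(PROMOTION_ORDER):
        return PROMOTION_ORDER[index + 1]
    return None
```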

Chealer commented 4 years ago

Thank you @micahalcorn
