Git Flow Structure

This issue relates to this project's "Git Flow." By this, I mean how we use Git locally, in the following ways:

When do we create branches?
How do we name new branches?
When do we create tags?
How do we name tags?

Related to this are our interactions on GitHub (our "GitHub Flow"). These interactions are closely tied to our Git Flow, but happen through GitHub instead of locally via a Git client or the command line. The following items are related to our GitHub Flow:

When do we open Pull Requests?
What types of PRs do we have?
What branch do PRs merge to?
What strategy should be used to merge PRs (e.g. merge, squash+merge, rebase, etc.)?
When do we create Milestones?
How are issues/PRs added to milestones?
How are labels used?
How are projects used?
When do we create releases?

An argument could be made that since these two "flows" are so closely related, they should be considered one and the same. Indeed, these flows fit underneath a shared "umbrella," but that umbrella also includes other "flows" such as a Deployment flow, CI/Testing Flow, QA Flow, etc. These flows we have yet to nail down/implement, so I won't be talking about them in this issue. I'm deciding to make the distinction here to reduce the scope/complexity of what I'm referencing.

Current Git/GitHub Flow

I haven't found any written reference for what we are currently doing with regard to our branching/tagging structure, but I know we've talked about it before. Because of this, I want to write it down here.

Git Flow

Currently, we create a new branch for each issue we track on GitHub and name the branch accordingly (issue_<x>, where <x> is the GitHub issue id). This is a standard branching practice, and I think it works well because it allows us to create a targeted group of commits all related to a single issue. Further, the name of the branch is concise, yet points us to GitHub Issues -- a better place to describe/track information related to the group of commits on this branch.
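To make the naming convention concrete, here is a minimal sketch of helpers that build and parse issue_<x> branch names. The function names are hypothetical and not part of any existing Nebula tooling.

```python
import re
from typing import Optional

# Topic branches are named "issue_<x>", where <x> is the GitHub issue id.
ISSUE_BRANCH = re.compile(r"issue_(\d+)")


def branch_for_issue(issue_id: int) -> str:
    """Build the topic branch name for a GitHub issue id."""
    if issue_id <= 0:
        raise ValueError("issue id must be a positive integer")
    return f"issue_{issue_id}"


def issue_for_branch(branch: str) -> Optional[int]:
    """Return the issue id encoded in a branch name, or None if it is not a topic branch."""
    match = ISSUE_BRANCH.fullmatch(branch)
    return int(match.group(1)) if match else None
```

For example, branch_for_issue(248) gives "issue_248", and issue_for_branch("master") gives None.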

NOTE: We do have a couple of "types" of issues to create (Feature Request or Bug Report). However, this is a distinction made on the GitHub Level -- from Git's perspective, both of these are "topics" of changes, which leads to only one type of branch, a "topic" branch.

I don't think we have any "set" way of creating tags, but they are created when new versions are released/deployed. Creating tags at the correct position in our commit history is very important for creating Releases (in our GitHub Flow), as it provides us with an easy way to check out and load a specific version of Nebula. This is important for historical purposes, but also has practical uses (e.g. tracking a bug across multiple versions).

GitHub Flow

Pull Requests

Currently, we open PRs when we have an issue branch finished or close to being finished (a WIP PR). PRs allow developers to get feedback on the changes (through GitHub's Review system) and provide a visualization of the scope of the code changes they are proposing. Because we only have topic branches, there is only one "type" of PR -- one that represents any proposed changes we have. Similarly, we only have one "stable" branch (master), so all proposed PRs merge to the master branch.

Depending on the complexity of the issue a PR attempts to address, we can have as few as 1 commit or as many as 100+ commits. This variance can cause a drastic jump in the commit history when PRs get merged into the master branch, and the extra commits don't necessarily provide a benefit, since they often have unhelpful commit messages or contain unhelpful changes (e.g. fixing formatting/linter errors). Because of this, the "Squash and Merge" strategy is used to consolidate all commits down to 1 before they get added to the master branch. GitHub places all individual commit messages in the body of the single squashed commit, so the content remains, but developers are saved from looking through potentially long, unhelpful chains of commits when looking through the commit history.

Since PRs are opened from issue branches, they can include a closing keyword in the description (e.g. "Closes #<x>") to automatically close the issue they address when they get merged.
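As an illustration of how such a description could be checked locally, here is a rough sketch that finds issue ids referenced with GitHub's closing keywords (close, fix, resolve and their variants). The regular expression is an approximation written for this example, not GitHub's actual matching logic, and the function name is hypothetical.

```python
import re
from typing import List

# Approximates GitHub's closing keywords: close/closes/closed,
# fix/fixes/fixed, resolve/resolves/resolved, followed by "#<issue id>".
CLOSING_REFERENCE = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def closed_issues(description: str) -> List[int]:
    """Return the issue ids a PR description would close, in order of appearance."""
    return [int(issue_id) for issue_id in CLOSING_REFERENCE.findall(description)]
```

A description such as "Adds tagging docs.\n\nCloses #248" yields [248], while a plain mention like "See #3" yields an empty list.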

Milestones

GitHub Milestones are created as a way to track progress for a specific version release. Any issues/PRs that will be a part of the release should be added to the corresponding milestone. When all issues/PRs for a specific milestone are addressed, we know that we can release a new version and close the Milestone.

Sometimes we decide that certain issues do not need to be addressed for a certain release. If this is the case, we move the issue/PR out of the Milestone and add it to the next version's Milestone.

Labels

Labels offer a wide range of benefits, but are primarily used to aid in classifying issues. When dealing with Issues and PRs, we often have the following questions: What type of issue is this? What area is related to this issue? Is there a specific focus for this issue? What priority is this issue? Labels help to answer these questions (and more) at a glance. Further, labels offer a way to make issues easy to search (i.e. search for all issues that have a specific label).

Because labels help classify many properties of an issue, all labels have a "group" that they belong to. To help show this, all labels have the following naming scheme: <group name>:<value> (e.g. status:onhold, area:frontend, type:bug, priority:urgent, etc.). This allows a developer to quickly see and search all properties of an issue without having to view the Issue's detail page.
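The <group name>:<value> scheme is easy to work with in scripts. Below is a small sketch, with a hypothetical function name, that splits a list of label strings into their groups.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_labels(labels: Iterable[str]) -> Dict[str, List[str]]:
    """Split "<group name>:<value>" labels into a mapping of group -> values."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for label in labels:
        group, separator, value = label.partition(":")
        # Reject labels that do not follow the naming scheme.
        if not separator or not group or not value:
            raise ValueError(f"label does not match <group name>:<value>: {label!r}")
        grouped[group].append(value)
    return dict(grouped)
```

Calling group_labels(["type:bug", "area:frontend", "priority:urgent"]) returns {"type": ["bug"], "area": ["frontend"], "priority": ["urgent"]}.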

Projects

Because of the Monorepo structure, it is necessary to have a grouping of issues based on what project they relate to, as well as a measure of progress for each issue. Labels offer us an easy way to group issues by project, but the "progress" of an issue is harder to represent with labels. Instead, Projects offer a way to group issues and visualize the progress of an issue.

A project exists for each subproject of the monorepo. Issues are assigned a project when they are created (or right after creation). Multiple projects can be assigned, but this is discouraged since it opens an issue up to having multiple "levels" of progress (something we want to avoid). If an issue relates to multiple projects, it most likely should be broken up into smaller issues that can each focus on a specific project.

Projects have all been set up with Automation. This allows issues to be automatically triaged when added to the project (Issues are placed into the "Backlog" column). A developer can manually move an Issue to the "To do" column to represent that the issue is a higher priority than issues in the backlog. Issues must then be manually moved to the "In Progress" column to represent that an issue is being worked on. When a PR is opened, it gets automatically triaged to the "Needs Review" column. PRs are then automatically moved to the "Reviewer Approved" column when reviews are complete, but before the PR is merged. When the PR gets merged, both the PR and the related issue are placed in the "Done" column. This set of columns provides a snapshot of progress for each issue.
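The column flow described above can be summarized as a small lookup table. This is only a sketch of the behavior: the event names are hypothetical labels chosen for this example and are not GitHub's automation identifiers.

```python
from typing import Optional

# Moves performed by project automation (event names are hypothetical).
AUTOMATED_MOVES = {
    "issue_added": "Backlog",
    "pr_opened": "Needs Review",
    "pr_approved": "Reviewer Approved",
    "pr_merged": "Done",
}

# Moves a developer has to make by hand.
MANUAL_MOVES = {
    "Backlog": "To do",
    "To do": "In Progress",
}


def column_after_event(event: str) -> Optional[str]:
    """Return the column automation places a card in, or None if the event is not automated."""
    return AUTOMATED_MOVES.get(event)


def next_manual_column(current: str) -> Optional[str]:
    """Return the column a developer should move a card to next, or None if no manual move applies."""
    return MANUAL_MOVES.get(current)
```

Laid out this way, it is easy to see that everything between "Backlog" and "In Progress" depends on manual moves.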

NOTE: The term "progress" in this context is kind of ambiguous. I'm using it in reference to the question "What stage is this issue/PR in?" as opposed to "How close are we to addressing this issue?" Issues can vary in their scope, so the latter question is hard to define/visualize compared to the former.

Proposed Changes to the Git/GitHub Flow

Change to our Branching Structure

One big change I think we need to make is switching to a multi-level branching system. We currently have 2 "levels" of branches. The first is the master branch which represents the most stable line of development. The second is our "topic" branches (created for each issue). As stated earlier, these branches represent a group of commits related to a specific topic (feature or bug). This 2-level system is relatively stable, but does have some caveats:

The master branch does not always represent a "shippable" state. This is especially true for the monorepo structure we currently use. Multiple issues (spanning different subprojects) might need to be solved in order to fully implement a feature, which means that we sometimes need multiple branches merged into master before it becomes "shippable" again.
The topic branches might sometimes need to pull in other topic branches. This creates a potentially messy dependency of how we need to merge PRs into the master branch (this is due in part to the merge strategy we use because rebasing doesn't always work as expected).

Instead, I think we need to move to a 3-level system:

The master branch - This branch is always stable and the HEAD of this branch will always be in a "shippable" state.
version branches - These branches represent the semi-stable level of our current master branch. The HEAD of a version branch doesn't always have to be stable, which allows multi-issue bugs/features to merge at different times without having to carefully plan out the order of merging.
topic branches - These branches are the same as the topic branches we currently use.

With this new system, topic branches will merge into the version branches and version branches into the master branch. As stated above, this new system allows us to merge multiple topic branches without having to worry about affecting the stability of the master branch. Since we merge into the version branch, we can make separate follow-up PRs to the version branches that fix stability issues before merging to master.

Another benefit to adding this version level branch is the ability to merge a topic branch to a specific version. This can help us incrementally add a new feature to the next major release (the 1.0.0 branch) without affecting a smaller release (the 1.0.0-beta.3 branch). This type of flexibility will help us manage the balance of working on different levels of issues (large new features compared to urgent patches) without worrying about accidentally releasing broken, in-progress features.

We can also use this to help filter out redundant bugs. If a new issue gets created that describes a bug in a patch version (1.0.1), we can check out the 1.1.0 branch and see if the issue is already resolved. If it is, and 1.1.0 will be released soon, we can elect to close the issue instead of working toward including a fix in another patch release (1.0.2).

This change affects several items in the GitHub Flow, which I'll describe below:

Changes to PRs

Because we will have two types of branches, we need to have two types of PRs:

Topic PRs - These are the PRs we have now. Topic branches are merged into a version branch and the same checks we have apply (coverage, CI passes, manual checks). This PR will also use the same "Squash and Merge" strategy as before to produce a clean commit history for a version branch.
Release PRs - These PRs will be a new type that represents us "releasing" a new version. This involves merging a version branch into master. Since we already tested new features/bug fixes in the Topic PRs, we should already have good coverage, passing CI, and targeted manual checks. This PR should include larger checks and will offer a chance for developers to discuss the aspects of the release before it actually hits the master branch. There may be missing issues that still need to get addressed (the PR will be closed and reopened when the requisite topic PRs are merged). Release PRs can be merged using the regular "Merge" strategy since the commit history of the version branch is clean and concise.
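The two PR types can be told apart from the branches involved. The sketch below assumes topic branches are named issue_<x> and version branches are named after the version itself (e.g. 1.0.0 or 1.0.0-beta.3); the function name and the returned strings are hypothetical.

```python
import re
from typing import Tuple

TOPIC_BRANCH = re.compile(r"issue_\d+")
# Assumes version branches are plain version strings such as "1.0.0" or "1.0.0-beta.3".
VERSION_BRANCH = re.compile(r"\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?")


def classify_pr(head: str, base: str) -> Tuple[str, str]:
    """Return (PR type, merge strategy) for a PR merging `head` into `base`."""
    if TOPIC_BRANCH.fullmatch(head) and VERSION_BRANCH.fullmatch(base):
        # Topic PRs are squashed to keep the version branch history clean.
        return ("topic", "squash")
    if VERSION_BRANCH.fullmatch(head) and base == "master":
        # Release PRs keep the already clean version branch history.
        return ("release", "merge")
    raise ValueError(f"unexpected PR direction: {head} -> {base}")
```

Under these assumptions, a PR from issue_248 into 1.1.0 is a Topic PR, a PR from 1.0.0-beta.3 into master is a Release PR, and a topic branch targeting master directly is rejected.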

Changes to Milestones

Milestones should be unaffected, but it is important to note that the version branch will be in sync with the progress of the Milestone. This means that it will be easy to tell when a Release PR can be opened.

Changes to Labels

The introduction of the version branches means we can start to classify issues according to a "semver" level (major, minor, patch). Depending on the complexity of an issue, we can assign a semver label to it. The developer can then use this label to know where a Topic PR should merge (e.g. a minor-level issue should be merged to the 1.2.0 branch and should not be merged to the 1.1.2 branch).
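To show how a semver level maps to a target branch, here is a minimal sketch. It assumes the latest release is a plain major.minor.patch string and that version branches are named after the next version; pre-release versions such as 1.0.0-beta.3 are not handled, and the function name is hypothetical.

```python
def target_version_branch(latest_release: str, level: str) -> str:
    """Return the version branch a Topic PR should target for a given semver level."""
    # Raises ValueError for anything that is not plain "major.minor.patch".
    major, minor, patch = (int(part) for part in latest_release.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown semver level: {level!r}")
```

With a latest release of 1.1.1, a minor-level issue targets 1.2.0 and a patch-level issue targets 1.1.2, matching the example above.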

Changes to Releasing/Tagging

With the change to having Release PRs, we can also nail down our flow for creating new releases/tags. Assuming the master branch passes CI, a new tag can be made on the merge commit and a new release can be drafted from there. When the deployment flow is finalized/implemented, it might be possible to detect a Release PR getting merged and trigger a deployment pipeline that generates a change log, updates the deployment endpoint, and sends a message to our Discord.
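The tagging step could eventually be scripted. The sketch below only builds the git commands for creating and pushing an annotated tag on the merge commit; it does not run them. The "v" tag prefix, the "origin" remote name, and the function name are assumptions, since we have not yet agreed on how tags are named.

```python
from typing import List


def tag_commands(version: str, merge_commit: str, remote: str = "origin") -> List[List[str]]:
    """Build the git commands that tag a Release PR's merge commit and push the tag."""
    tag_name = f"v{version}"  # assumed naming scheme, not an agreed convention
    return [
        # Create an annotated tag pointing at the merge commit.
        ["git", "tag", "-a", "-m", f"Release {version}", tag_name, merge_commit],
        # Publish the tag so a release can be drafted from it on GitHub.
        ["git", "push", remote, tag_name],
    ]
```

Each inner list can be passed to subprocess.run(..., check=True) once CI on master has passed.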

Change to our Project Management

The automation for Projects is great, but the initial columns for projects require manual interactions that we currently aren't doing. I'm proposing we do the following on a regular interval (every 1-2 weeks) for each project:

Look through the backlog and choose issues that could be accomplished in the upcoming interval (sprint). These issues should be moved to the "To do" column so we know which issues to take on next after the in-progress issues are complete.
Look through the backlog and check for issues that are redundant (either duplicates or no longer relevant) and close these issues.

These "housekeeping" tasks allow the projects to keep a relatively accurate snapshot of our progress over time. We can still add urgent issues to the "To do"/"In Progress" columns when they are created, but like any machine, regular maintenance is needed to make sure everything runs smoothly.

Summary (TL;DR)

We don't have a single place that tells developers how to accomplish git management and project management tasks, which I've now described (or attempted to at least 😅). Further, there are several changes I'm proposing we make to keep the project manageable as we scale up.

If there are any questions/more proposed changes, please add them below!

walmat / nebula-old

Clarify Nebula's Git Flow #248