Git branching - Githubissues

kirlat commented 4 years ago

With some tooling for CI/CD available (Travis CI, for example) we need to make some decisions on how to organize our code flows to automate integration and deployment.

Travis CI has an option of deploying a build automatically once a PR is merged into one of the branches. So if we set branches for development, QA, and production we can automate a deployment of corresponding builds into designated locations.

What if we use the following branches:

master - for development work. All current commits goes to master. Usually master is reserved for production but, on the other hand, it is the most "natural" place to merge the code; development work generates the most commits. If somebody new will commit to master by mistake it will make no harm. However, this can be confusing for some people and that's why some developers even not use master at all for that reason alone. Here is an interesting discussion: https://softwareengineering.stackexchange.com/questions/278686/having-a-production-branch-or-using-master/278687
qa for testing in QA
staging (if we need it at all). Commits merged there will go into integration testing and, probably, can be deployed into production for limited number of users production - For production releases.

The path that code follows is: (master/development) -> (qa) -> (staging) -> (production).

When a PR is merged into one of those branches, it triggers Travis CI to deploy a code to a specific location. For Amazon S3 we, for examples, can have buckets for development, QA, staging, and production and the PR merge into one of the branches will trigger a code deployment into one of those buckets.

Those branches would be the same across all repositories. That will simplify the process of understanding the role of each commit.

What do you think about a system like that? I think we shall try to find the simplest solution that will satisfy our requirements and that one seems to be simple enough.

irina060981 commented 4 years ago

master - for development work. All current commits goes to master. Usually master is reserved for production but, on the other hand, it is the most "natural" place to merge the code; development work generates the most commits.

I didn't fully understand what would be the differencre with the current workflow. Do you suggest as before - create PR, approve and merge to master. And then to create branches from it for qa and production.

Aren't we doing it exactly right now?

Or may be I missed something.

kirlat commented 4 years ago

Do you suggest as before - create PR, approve and merge to master. And then to create branches from it for qa and production.

What I suggest is very similar to what we do now (which is not a bad thing, on my opinion), but with some minor changes. Right now, I believe, we do not have longstanding branches for QA, production, etc, but rather an ad-hoc ones. We have branches for specific releases that live until the release is deployed. We could keep this practice with ad-hoc release branches (it is convenient, I think), but at the end those must be merged into production. And that merge will trigger an automatic deployment.

So with this we should formalize the following:

The names of specific longstanding branches that we will use across all repositories.
The fact that a merge of a PR into one of those branches will always trigger an automatic code deployment to a specific location.
Maybe decide on some other practices we can use (production release branches, feature branches, etc.).

@balmas, you probably have a better understanding of what our needs here are. There might be specific case I'm not aware of.

balmas commented 4 years ago

I like this idea. Right now the procedure we have been (mostly) following is this:

master is for development (branch from master for bug fixes, features and merge back into master upon code review)
a "QA" build is a tag on the master branch
when we get close to release, create a branch for release
when ready to release, tag the branch
any fixes in the release branch need to be merged via PRs back to master so they don't get lost

So, with the revised workflow, it would look like

master is for development (branch from master for bug fixes, features and merge back into master upon code review)
branch for QA instead of tag
when we get close to release, create a branch for release
when we are ready to release (1) merge release branch into production and tag the commit with release (2) merge production back to master (to get any fixes from the release branch)

I think we won't be able to switch to automated CI/CD for everything at once, and the strategy for that might be different depending upon which type of code it is (supporting library, microservice, user-facing application, etc.) but I can't think of any problems with starting to apply this Git workflow across the board, regardless of how we package and deploy the code.

kirlat commented 4 years ago

It seems that the suggested workflow can satisfy our requirements for at least some repositories. We can probably start implementing it and see if it will provide us all that we expect form it.

balmas commented 4 years ago

yes I agree.

kirlat commented 4 years ago

@balmas, @irina060981: I've created diagrams for two git workflows that we might have. Please click on an image to see it in larger scale.

This is a standard workflow that we, as I think, would use on a regular basis.

Fast track workflow This is a fast-track workflow that can be used if we need to push some updates before anything else. It is much more complex in merging the fast-tracked changes back properly but I think it is still an option.

What is your opinion on those? Do these diagrams reflect the process correctly? Do they miss anything? What are your thoughts?

balmas commented 4 years ago

Some questions:

Is the following correct: in the fast track workflow, when fast-tracking feature C to QA we skip the merge back to master of the feature C branch and move merge right from the dev branch to qa?
Why do does the fast track example jump the QA version for the build with feature to 2.0 where as in the standard workflow it's a minor version change (1.3.1 -> 1.3.2) ?
I think we might need to increase the 4th component of the version numbers every time we commit a change we want to test. Because what I was seeing today was that lerna will not update the packages from GitHub on new commits - the version number has to change. So if I want to test a package that is outside of alpheios-core, if it's version number doesn't increase, I can't get it down from GitHub.

kirlat commented 4 years ago

1. Is the following correct:  in the fast track workflow, when fast-tracking feature C to QA we skip the merge back to master of the feature C branch and move merge right from the dev branch to qa?

Here is what I thought about that: When we complete development of a fast-track feature C, we merge it master, but we do not merge master to a feature C branch. The reason is that by the time development of C is completed master may have other features merged, A and B. If we merge master to C then C will get changes from A and B. But we don't want them to be tested with C, we want C bypass A and B and go before them (if I understand things correctly). That's why we do not need to merge master to C. I think in a fast-track schema feature C is special and should be isolated from other features until merged to release. Does it make sense?

kirlat commented 4 years ago

2. Why do does the fast track example jump the QA version for the build with feature to 2.0 where as in the standard workflow it's a minor version change (1.3.1 -> 1.3.2) ?

That's because C is a major feature and thus it should increase version to 2.0.0

kirlat commented 4 years ago

3. I think we might need to increase the 4th component of the version numbers every time we commit a change we want to test. Because what I was seeing today was that lerna will not update the packages from GitHub on new commits - the version number has to change. So if I want to test a package that is outside of alpheios-core, if it's version number doesn't increase, I can't get it down from GitHub.

That makes sense to me. I think if we're testing locally we an use filesystem dependencies that will be updated automatically but if we want a version to be consumed by something over the GitHub then a prerelease version needs to be increased: 1.2.3-dev.1 -> 1.2.3->dev.2. Prerelease version does not have much meaning other than the separation of one build from the other and we can change it freely.

If we decide to to increase prerelease versions during feature development both feature A and B will start with the same version that in the example workflow is 1.2.3-dev.3. So if each feature branch will increase a prerelease version in its own way versions may overlap: both feature branches A and B might have version 1.2.3-dev.5. To avoid that, we can use ranges: feature A can start with 100 and feature B can start from 200. Then feature A versions will be 1.2.3-dev.100, 101, 102, ... and feature B versions will be 1.2.3-dev.200, 201, 202, ...

The more elegant solution would be to use prerelease postfixes that will specify the feature (1.2.3-feature-a.1 ad 1.2.3-featue-b.1) but that will be harder to automate because we have to specify prefix in a unique way during a start of development of a new feature.

But maybe, even if versions overlap, but branches (tags) are different, Lerna will be able to pull updates regardless. I will experiment to see how it works exactly.

kirlat commented 4 years ago

As lerna bootstrap documentation says, lerna bootstrap uses npm install, so whatever npm install will decide to install will be installed or updated. If package is located within a remote repository (which GitHub is) it will be referenced in package.json as github:alpheios-project/project-name#<commit-ish>, where is a branch or a tag. So, I think, even if the version within package.json stays the same but a branch or a tag is changed, npm install will pull and install an updated package.

I did a test with fixtures: did not increase it's package version but created a new branch for it, specified that updated branch in dependencies of client-adapters and run lerna bootstrap. As a result, an updated fixtures package was installed for client-adapters. Lerna also noticed a commit-ish change: EHOIST_PKG_VERSION "alpheios-client-adapters" package depends on alpheios-fixtures@github:alpheios-project/fixtures#1.2.0-dev.1, which differs from the hoisted alpheios-fixtures@github:alpheios-project/fixtures.

So with all this updating version number within package.json may not be necessary. @balmas, can you please test if it works for you when you will have time. The difference between fixtures#v1.2.0-dev.0 and fixtures#1.2.0-dev.1 (please note there is no v here) is the branch and an updated package description field.

balmas commented 4 years ago

3. I think we might need to increase the 4th component of the version numbers every time we commit a change we want to test. Because what I was seeing today was that lerna will not update the packages from GitHub on new commits - the version number has to change. So if I want to test a package that is outside of alpheios-core, if it's version number doesn't increase, I can't get it down from GitHub.
That makes sense to me. I think if we're testing locally we an use filesystem dependencies that will be updated automatically but if we want a version to be consumed by something over the GitHub then a prerelease version needs to be increased: 1.2.3-dev.1 -> 1.2.3->dev.2. Prerelease version does not have much meaning other than the separation of one build from the other and we can change it freely.

If we decide to to increase prerelease versions during feature development both feature A and B will start with the same version that in the example workflow is 1.2.3-dev.3. So if each feature branch will increase a prerelease version in its own way versions may overlap: both feature branches A and B might have version 1.2.3-dev.5. To avoid that, we can use ranges: feature A can start with 100 and feature B can start from 200. Then feature A versions will be 1.2.3-dev.100, 101, 102, ... and feature B versions will be 1.2.3-dev.200, 201, 202, ...

The more elegant solution would be to use prerelease postfixes that will specify the feature (1.2.3-feature-a.1 ad 1.2.3-featue-b.1) but that will be harder to automate because we have to specify prefix in a unique way during a start of development of a new feature.

But maybe, even if versions overlap, but branches (tags) are different, Lerna will be able to pull updates regardless. I will experiment to see how it works exactly.

I think there may also be a difference when the dependency is inside alpheios-core and when it is out. When it is inside, it automatically uses the local filesystem and there isn't an issue, right? but when the dependency is outside of alpheios-core (such as with fixtures) then the strategy needs to be different.

balmas commented 4 years ago

As lerna bootstrap documentation says, lerna bootstrap uses npm install, so whatever npm install will decide to install will be installed or updated. If package is located within a remote repository (which GitHub is) it will be referenced in package.json as github:alpheios-project/project-name#<commit-ish>, where is a branch or a tag. So, I think, even if the version within package.json stays the same but a branch or a tag is changed, npm install will pull and install an updated package.

I did a test with fixtures: did not increase it's package version but created a new branch for it, specified that updated branch in dependencies of client-adapters and run lerna bootstrap. As a result, an updated fixtures package was installed for client-adapters. Lerna also noticed a commit-ish change: EHOIST_PKG_VERSION "alpheios-client-adapters" package depends on alpheios-fixtures@github:alpheios-project/fixtures#1.2.0-dev.1, which differs from the hoisted alpheios-fixtures@github:alpheios-project/fixtures.

So with all this updating version number within package.json may not be necessary. @balmas, can you please test if it works for you when you will have time. The difference between fixtures#v1.2.0-dev.0 and fixtures#1.2.0-dev.1 (please note there is no v here) is the branch and an updated package description field.

the issue I was having was with commits within a single branch. E.g. I created a new arethusa branch on fixtures, did a commit and pushed to github. Tested, and realized I needed to make another change to fixtures. So I did another commit in the same arethusa branch and reran lerna bootstrap. It didn't get the updated package. This makes sense if what bootstrap does is call npm install because that also would not have updated from GitHub when a branch changed.

balmas commented 4 years ago

1. Is the following correct:  in the fast track workflow, when fast-tracking feature C to QA we skip the merge back to master of the feature C branch and move merge right from the dev branch to qa?
Here is what I thought about that: When we complete development of a fast-track feature C, we merge it master, but we do not merge master to a feature C branch. The reason is that by the time development of C is completed master may have other features merged, A and B. If we merge master to C then C will get changes from A and B. But we don't want them to be tested with C, we want C bypass A and B and go before them (if I understand things correctly). That's why we do not need to merge master to C. I think in a fast-track schema feature C is special and should be isolated from other features until merged to release. Does it make sense?

Ah, yes I missed that difference about the merge of master into the feature branch. I think it makes sense.

kirlat commented 4 years ago

@balmas, @irina060981: I've added a more detailed descriptions of git workflows to https://github.com/alpheios-project/documentation/tree/master/development/git-workflow. Please let me know what do you think.

balmas commented 4 years ago

Initial question: can we apply the same workflow to the auxiliary repos (i.e. non-alpheios-core) so that we can have a consistent process across everything? Can/should we use lerna in these repos?

kirlat commented 4 years ago

Initial question: can we apply the same workflow to the auxiliary repos (i.e. non-alpheios-core) so that we can have a consistent process across everything? Can/should we use lerna in these repos?

In theory we could but that would require us to emulate a directory structure, i.e. to have a packages directory with only one package in it.

I think it is a great idea to standardize the process, I will read a think about what we can do here and what would be the simplest process to achieve that.

kirlat commented 4 years ago

There is a release-it tool that has a functionality similar to Lerna's in many ways. Maybe we can use it for "regular" repositories and provide consistency at the level of npm scripts, i.e. a Lerna monorepo and a regular repository will have same npm scripts with same functionality. With this, we can probably have a unified workflow. What do you think?

balmas commented 4 years ago

that sounds like a good suggestion!

I will review the workflows in more detail this afternoon and provide fuller feedback on them soon.

Thank you!

balmas commented 4 years ago

Some additional questions.

In the standard workflow when a feature is moved to QA from a master branch, how do we make sure that the QA branch is exactly the same as master at the start of QA? Do we always delete any pre-existing "QA" branch and create a new one when moving to QA? Or do we do the same as with the "release" branch and first merge "QA" to master before merging master to "QA"? Similarly, with the fast-track workflow, do we delete and recreate the qa-ft branch each time?
For the feature branches, do we also continue our practice of creating new branches, one per issue, for bug fixes? Shall we officially standardize our naming practice ? e.g. feat-<feature-name> for a big new feature, issue-<issuenumber> for a single bug fix? Do we allow working on multiple bug fixes in a single issue-* branch? Is there any value in micromanaging at this level?
In release after QA approval we have:

Merge changes from the release branch to master. Do not change the major.minor.patch version number in master because it is the same or higher as in release. Increase, however, a prerelease version: if master version is 1.2.4-dev.0 and a release version 1.2.3-rel.0, change master version 1.2.3-dev.1. I think there is a typo and this should be "if master version is 1.2.4-dev.0 and a release version 1.2.4-rel.0, change master version 1.2.4-dev.1." ?
Is there a valid scenario when master is a higher major/minor/patch version number than release?

kirlat commented 4 years ago

In the standard workflow when a feature is moved to QA from a master branch, how do we make sure that the QA branch is exactly the same as master at the start of QA? Do we always delete any pre-existing "QA" branch and create a new one when moving to QA? Or do we do the same as with the "release" branch and first merge "QA" to master before merging master to "QA"? Similarly, with the fast-track workflow, do we delete and recreate the qa-ft branch each time?

I think the simplest way would be to delete both qa and qa-ft branches after testing is complete. That is a lesser effort than to do an extra merging to keep them in sync. What do you think?

kirlat commented 4 years ago

For the feature branches, do we also continue our practice of creating new branches, one per issue, for bug fixes? Shall we officially standardize our naming practice ? e.g. feat- for a big new feature, issue- for a single bug fix? Do we allow working on multiple bug fixes in a single issue-* branch? Is there any value in micromanaging at this level?

On my opinion, it is clearer if we have each feature/issue in their own branch. But if several issues are small and, especially, related, we can probably combine them together, as en exception. The feature and issue branches will be mostly short-lived and owned by a single developer so I think it does not matter much how are they named as long as each of us remembers his/her branch names. But I think we can issue some (probably not very obligatory) recommendations about how branches should be named.

I like the naming schema that you've suggested. The only thing is that for issues we should probably specify which repository this issue is from. Let's say we have an issue in components with a 395 number, and it will require a build of components, webextensions, and embed-lib. So if the webextension's branch will be called issue-395 how would we understand that 395 is the number of an issue in components, not in a webextension?

I've also seen Lerna documentation referring to the feature/<feature-name> branch naming schema as a widely accepted convention so we could consider this as an option too.

What do you think about the branch naming approach we should follow?

kirlat commented 4 years ago

Is there a valid scenario when master is a higher major/minor/patch version number than release?

Let's say we had master at version 1.2.3 and it went to QA. Then, during QA testing, we might start develop several new features. They might be completed and merged to master before the QA testing of version 1.2.3 is complete. As qa is merged to release, release branch will have code with version 1.2.3 while the master, if there were one minor and one patch features added to it, will have version 1.3.1. That's the scenario I was thinking about. Does it make sense?

balmas commented 4 years ago

In the standard workflow when a feature is moved to QA from a master branch, how do we make sure that the QA branch is exactly the same as master at the start of QA? Do we always delete any pre-existing "QA" branch and create a new one when moving to QA? Or do we do the same as with the "release" branch and first merge "QA" to master before merging master to "QA"? Similarly, with the fast-track workflow, do we delete and recreate the qa-ft branch each time?

I think the simplest way would be to delete both qa and qa-ft branches after testing is complete. That is a lesser effort than to do an extra merging to keep them in sync. What do you think?

ok

balmas commented 4 years ago

For the feature branches, do we also continue our practice of creating new branches, one per issue, for bug fixes? Shall we officially standardize our naming practice ? e.g. feat- for a big new feature, issue- for a single bug fix? Do we allow working on multiple bug fixes in a single issue-* branch? Is there any value in micromanaging at this level?

On my opinion, it is clearer if we have each feature/issue in their own branch. But if several issues are small and, especially, related, we can probably combine them together, as en exception. The feature and issue branches will be mostly short-lived and owned by a single developer so I think it does not matter much how are they named as long as each of us remembers his/her branch names. But I think we can issue some (probably not very obligatory) recommendations about how branches should be named.

I like the naming schema that you've suggested. The only thing is that for issues we should probably specify which repository this issue is from. Let's say we have an issue in components with a 395 number, and it will require a build of components, webextensions, and embed-lib. So if the webextension's branch will be called issue-395 how would we understand that 395 is the number of an issue in components, not in a webextension?

I've also seen Lerna documentation referring to the feature/<feature-name> branch naming schema as a widely accepted convention so we could consider this as an option too.

What do you think about the branch naming approach we should follow?

For issues, I have been using issue-<number> when the issue is in the same repo as the code and issue-<repo>-<number> when the issue is from a different repo but I think the latter scenario is less likely now with the monorepo, and if there is a feature or bug which crosses repos, then I think there should really be a corresponding issue in each repo.

I am a little ambivalent about whether we use issue- and feat- or the longer format issue/ and feature/ . the longer version is more readable but harder to type.

balmas commented 4 years ago

Is there a valid scenario when master is a higher major/minor/patch version number than release?

Let's say we had master at version 1.2.3 and it went to QA. Then, during QA testing, we might start develop several new features. They might be completed and merged to master before the QA testing of version 1.2.3 is complete. As qa is merged to release, release branch will have code with version 1.2.3 while the master, if there were one minor and one patch features added to it, will have version 1.3.1. That's the scenario I was thinking about. Does it make sense?

That scenario seems plausible, but as I read what is in the workflow document, it would seem to imply that when release 1.2.3 finally gets merged to master, master would revert back from 1.3.1 to 1.2.3, which doesn't make sense to me.

kirlat commented 4 years ago

For issues, I have been using issue- when the issue is in the same repo as the code and issue-- when the issue is from a different repo but I think the latter scenario is less likely now with the monorepo, and if there is a feature or bug which crosses repos, then I think there should really be a corresponding issue in each repo.

It does matter which schema to use for me too so if issue-<number>/issue-<repo>-<number> worked well for us why not to stick with it? @irina060981, are you OK with it?

kirlat commented 4 years ago

That scenario seems plausible, but as I read what is in the workflow document, it would seem to imply that when release 1.2.3 finally gets merged to master, master would revert back from 1.3.1 to 1.2.3, which doesn't make sense to me.

I will edit it to state clearly that it does not. Thanks for catching this!

irina060981 commented 4 years ago

It does matter which schema to use for me too so if issue-/issue-- worked well for us why not to stick with it? @irina060981, are you OK with it?

From my point of view - we tried to use bug-<issue-number>-<description> feat-<issue-number>-<description> fix-<issue-number>-<description>

and so on to define the content of the change if we would use always issue-<issue-number>-<description>

then we don't use issue word here with sense. as we sould have only one repo - then there is no need to retype repo each time.

So we would have only the following valuable information - <issue-number>-<description> all other parts are not valuable

balmas commented 4 years ago

I agree with @irina060981 's point. I have to confess I haven't been regularly following those guidelines, particularly vis-a-vis including the issue number in a feature branch, but I think this is a good schema to follow.

I would add one other possibility though:

dev-<description>

when working on development for a exploratory feature which may not necessarily be reflected in an issue yet.

Does that sound reasonable?

kirlat commented 4 years ago

I think these are all good suggestions and we can add them into recommendations

balmas commented 4 years ago

Summary of minimum workflow we need to support re versioning

1) In qa and production builds of applications that use alpheios-components (i.e. webextension and embed-lib), we need to be able to display in the UI the version of the alpheios-components package that was used in the build.

2) in qa and production builds of webextension and embed-lib we also need to be able to display in the UI the version of those applications that was used to create the build.

3) version information included in the code should match the version in the release tag in GitHub

kirlat commented 4 years ago

Some after thoughts on the tagging schema: Branches and tags coexist within the same space and can be used interchangeably by many git tools. I think it will be good to devote a first letter to indicate what type of reference it is (i.e. a tag, a branch, and with what purpose). I'm just afraid that if we'll start an issue tag with just a number it might not be clear what it is, especially for outsiders. It might be confused with the version number, since version numbers are often used without the v prefix (see https://stackoverflow.com/questions/2006265/is-there-a-standard-naming-convention-for-git-tags, for example). What if we, to avoid that, will use the following schema: v1.2.3 - tags for production releases, indicate a software version 'i123ori123-for issues 'i123-repoName or i123-repoName-<issue-description> for issues specified in the other repository. Even with monorepo, there will be situations when several repositories are affected. For example, an issue in LexisCS repository may require changes in both LexisCS, alpheios-core, and then special development builds of embed-lib and/or webextension. b1.2.3 - For branches that specify a version number, to distinguish them from version tags. Maybe we can always start a branch name with b. Not sure what to do with tagged commits. They currently use branchName.YYYYMMDDCCC schema (i.e. qa.20200304234 or ftqa.20200304234), but maybe we can change them to fit the schema. Probably we can use something like c20200304234.qa to indicate that it is a commit tag.

So my immediate suggestion is to add a i in front of the issue number to make it easier to recognize and avoid confusion with version numbers. Since it's just one letter, we will not waste much space on it. The rest we can decide later.

@irina060981, @balmas, what do you think?

irina060981 commented 4 years ago

I think it could be useful.

balmas commented 4 years ago

fine with me

balmas commented 4 years ago

implemented and documented in the readme.

alpheios-project / documentation

Git branching #21