Release Process and Versioning

nknize commented 3 years ago

What is our release cadence?

Going forward, we will encourage a time-based “release train” model whose process is similar to, though slightly modified from, the Apache way of releasing (e.g., Lucene’s release process). This means we strive to have regularly scheduled and loosely planned releases, which provide a sense of “predictability” for our customer base. To minimize merge conflicts and keep the team in sync, this approach requires “feature freezes”, set by a volunteer Release Manager (RM), that mark the cut-off by which a feature Pull Request (PR) must be opened and ready for review to be included in the release. If the feature is not ready, then the train leaves without you and you’ll have to catch the next one. Here is a general outline of the process for a major / minor release:

A committer volunteers to be a release manager (or an email is sent requesting a volunteer) and initiates the process by sending an email to the committer list of intent to release the next version (to facilitate discussion and feedback)
The RM then proposes a “feature freeze” date (and intent to cut the branch).
1. initial email also requests all committers begin marking “blocker” issues preventing a successful release
24 hours prior to feature freeze (cutting the branch):
1. RM sends a “final notice” email and a request to ensure all blockers are identified (some may be fixed by then).
On day of feature freeze:
1. RM sends a “feature freeze” notice email that the branching is underway and no more PRs should be merged.
2. RM branches from the .x branch into a new .m+1 branch. (.x is now the next minor)
3. RM sends a “branching complete” email stating that the branch has been cut, with a reminder of the release date and a request that all blockers be cleared in time for the release
24 Hours prior to release:
1. RM sends a “final notice” for blockers email; also requesting a “speak now or forever hold your peace” on critical blockers. If something is holding up the release a postponed release may occur.
On Day of Release:
1. RM tags the branch with the release version
2. RM begins building and signing artifacts
3. RM stages artifacts to maven repo
4. an email is sent once artifacts are staged, requesting a quorum vote, which includes committers running smoke tests on the staged artifacts
5. RM sums votes
6. RM publishes tested artifacts
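
The notification points in the outline above can be sketched as a small date calculation. This is only an illustration: the function name and the example dates are hypothetical, and "24 hours prior" is approximated as one calendar day.

```python
from datetime import date, timedelta


def release_train_schedule(feature_freeze: date, release_day: date) -> list[tuple[date, str]]:
    """Return the RM's milestones in the order they appear in the outline."""
    if release_day <= feature_freeze:
        raise ValueError("release day must come after feature freeze")
    one_day = timedelta(days=1)  # assumption: "24 hours prior" == previous calendar day
    return [
        (feature_freeze - one_day, "final notice: identify all blockers"),
        (feature_freeze, "feature freeze: cut the release branch"),
        (release_day - one_day, "final notice: critical blockers"),
        (release_day, "tag, build, sign, stage, vote, publish"),
    ]


# Hypothetical dates, chosen only for the example.
schedule = release_train_schedule(date(2021, 6, 1), date(2021, 6, 15))
for day, milestone in schedule:
    print(day.isoformat(), milestone)
```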

What model of versioning will be used?

Starting with our new Forks, we will start using semantic versioning based on Apache’s model.

From Apache’s versioning doc:

A release number is comprised of 3 components: the major release number, the minor release number, and an optional point release number. Here is a sample release number: 2.0.4 and it can be broken into three parts:

* major release: 2
* minor release: 0
* point release: 4

The next release of this component would increment the appropriate part of the release number, depending on the type of release (major, minor, or point). For example, a subsequent minor release would be version 2.1, or a subsequent major release would be 3.0. Note that release numbers are composed of three integers, not three digits. Hence if the current release is 3.9.4, the next minor release is 3.10.0.
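
A minimal sketch of the increment rule quoted above. The helper name `bump` is hypothetical, and for simplicity it assumes all three components are always present, although the quoted text treats the point release as optional. Treating the components as integers is what makes 3.9.4 roll over to 3.10.0 rather than 4.0.0.

```python
def bump(version: str, part: str) -> str:
    """Increment one component of a MAJOR.MINOR.POINT release number."""
    major, minor, point = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "point":
        return f"{major}.{minor}.{point + 1}"
    raise ValueError(f"unknown release type: {part}")


print(bump("3.9.4", "minor"))  # 3.10.0
print(bump("2.0.4", "major"))  # 3.0.0
```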

How will OpenSearch and OpenSearch-Dashboards versions be related to each other?

NotKibana and NotElasticsearch will release major versions together. They will NOT synchronize minor releases: whenever the team feels they’re ready to release a minor version or patch (modulo the schedule above), they should release.

What we guarantee is that any major release of NotKibana is compatible with the same major release of NotElasticsearch. For example: 3.2.1 of NotKibana will work with 3.0.4 of NotElasticsearch, but 2.3.1 of NotKibana is not guaranteed to work with 3.0.4 of NotElasticsearch.
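
The guarantee reduces to comparing major components. A small sketch, with hypothetical function names, that checks the two examples above:

```python
def major_of(version: str) -> int:
    """Extract the major component from a MAJOR.MINOR.POINT string."""
    return int(version.split(".")[0])


def guaranteed_compatible(dashboards_version: str, engine_version: str) -> bool:
    """Compatibility is only guaranteed when both sides share a major version."""
    return major_of(dashboards_version) == major_of(engine_version)


print(guaranteed_compatible("3.2.1", "3.0.4"))  # True
print(guaranteed_compatible("2.3.1", "3.0.4"))  # False
```

A `False` result here means "not guaranteed", not "known to be broken".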

Breaking Changes and Backwards Compatibility

We will not release any breaking changes except in major releases.

How will plug-in versioning work?

All plug-ins for NotKibana and NotElasticsearch should use the same style of Apache versioning. However, going forward we should plan for plug-ins to be truly standalone pieces of software that can be released separately from NotKibana or NotElasticsearch within a major version. We will therefore not enforce the same semantic versioning of guaranteed compatibility within major versions as we do with NotKibana and NotElasticsearch. Each plug-in team is free to create / maintain a compatibility matrix if it makes sense for that plug-in team.

In practice, however, for the plug-ins that ODFE owns, we may choose to release on the same cadence as NotKibana until they reach a state where they can practically release on their own.

We will also consider making the plugin incubation process as easy as possible for the community. For example, we will explore ways of adding gradle or npm tasks that automatically generate a boilerplate plugin project (e.g., ./gradlew newPlugin myNotESPlugin) that sets the default starting version to 0.0.0 and uses the same compatibility version of NotES or NotKibana as the one being used to generate the plugin.

aetter commented 3 years ago

Have we considered the extent to which this could lead to upgrade headaches for users? Right now, if I want to upgrade ODFE from 1.4 to 1.13.1, I know that I also need to grab all of the 1.13.1 plugins that I care about.

With independent plugin releases, I have to go to a compatibility matrix and find the latest version of two plugins (e.g. security-notelasticsearch and security-notkibana) that are compatible with 1.13.1 and then repeat the process for any other plugins I care about: two more versions for ISM, Alerting, SQL, etc. If I want every plugin, I have to manually identify the correct versions for 14 separate plugins.

A truly independent release cycle also seems like it would have us release new versions of NotElasticsearch that didn't support our plugins. So we release NotElasticsearch 1.2, a user gleefully upgrades, and then the user realizes there's no version of the security plugin that supports NotElasticsearch 1.2. We can work around this issue by holding up the NotElasticsearch release, but in that case, why not version everything together?

Basically this proposal seems like it's designed for the convenience of the development teams at the expense of the user experience.

nknize commented 3 years ago

I know that I also need to grab all of the 1.13.1 plugins that I care about.

The fidelity is important. There's no reason an upgrade to a bugfix revision (e.g., 1.13.1) should require a comprehensive upgrade of the entire stack (core and all). Likewise, users should be able to upgrade to 1.13.9 of a plugin and still run with 1.13.0 of the core. Extend this further: there shouldn't be breaking changes in minor releases (maybe deprecation warnings), so users should be able to upgrade to 1.25.2 of a plugin and run with 1.2.2 of the core. That's the proposed philosophy. Locking major.minor.revision across all plugins has the drawback of tightly coupled coordination across independent plugin repositories, teams, projects, groups, communities, etc., which leads to less agility and more complexity in lean, rapid release cycles. Synchronizing across majors (which may only be released once a year, or once every 16 months) enables longer-term feature development and adoption of major breaking changes across the entire stack.

I have to go to a compatibility matrix

Not if we go with the idea of breaking changes only allowed in majors. 1.x is compatible across the stack. 2.x is compatible across the stack and backwards compatible with 1.x. 3.x is compatible across the stack and backwards compatible with 2.x but not 1.x... rinse and repeat.
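
Read literally, that rule says a major line stays backwards compatible with the major immediately before it and no further. A sketch under that reading (the function name is hypothetical, and what "backwards compatible" covers in practice is not spelled out here):

```python
def backwards_compatible(newer_major: int, older_major: int) -> bool:
    """True when the newer major is the same as, or exactly one ahead of, the older one."""
    return 0 <= newer_major - older_major <= 1


print(backwards_compatible(2, 1))  # True
print(backwards_compatible(3, 2))  # True
print(backwards_compatible(3, 1))  # False
```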

We can work around this issue by holding up the NotElasticsearch release, but in that case, why not version everything together?

Honestly, I hope some of the plugin teams (e.g. ML, security) turn into their own large communities and grow to be bigger than many individual OSS projects. This could create an environment of feature development that's faster than ODFE ever could have coming from a standalone company. I can't imagine trying to coordinate every single release with organic groups that big; but I'd love to hear ideas from folks that have had success with that model. IMHO, locking across minor.revision is preventative to organically growing large standalone communities around the plugins.

aetter commented 3 years ago

Sure, that all sounds great ("Use proper semantic versioning for everything."), which we've unfortunately not done up to this point. Users should be able to look at the version strings and quickly know what works with what, but the original proposal says something quite different:

We will therefore not enforce the same semantic versioning of guaranteed compatibility between major versions as we do with NotKibana and NotElasticsearch. Each plug-in is responsible for maintaining its own compatibility matrix.

The way that reads to me is that plugins can introduce breaking changes to their public APIs whenever they feel like it, increment their major versions, and suddenly you need version 6.1.1 of alerting and 8.2.0 of ISM for compatibility with NotElasticsearch 1.14.0. Can you clarify those sentences? They seem pretty explicit, but am I somehow misinterpreting?

nknize commented 3 years ago

No... I think that's a side effect of multiple edits to an original document that was posted here to open up the discussion. You're absolutely right, that reads incorrectly. Thank you for reading through and pointing it out.

rursprung commented 3 years ago

you have the link to the apache versioning docs and they're talking about semver. would it make sense to also explicitly state that you'll be following https://semver.org/?

note that i'm all in favour of doing semver! i had already outlined my ideas here: https://discuss.opendistrocommunity.dev/t/versioning-concept-for-the-fork/5297/4

dblock commented 3 years ago

What does Lucene do in terms of release process and versioning? @nknize

mihirsoni commented 3 years ago

Hi @dblock, Lucene keeps main as always being the next major version.

I.e., if it starts with 1.0, main will be 2.0 and there is a corresponding branch for 1.x. All new features go into main and then get backported to 1.x.

There are two possible paths forward:

1. Follow what Lucene does, which makes our main 2.0; it takes all the new features, which are backported to 1.x as needed.
2. Keep main as the stable, current version, i.e. main will be 1.0 and we cut a branch when a release is scheduled.

I believe we can go with either one of them, but for now I would prefer to go with main as 1.0 and then make the decision later on.

@nknize, my concern is: if we make main 1.0, will it block other open PRs from merging?

CEHENKLE commented 3 years ago

Hey Folks;

Circling back to this, because we learned some things trying to pull together beta-1:

Versions and Tags: Lemme put a proposal on the table (note, this is just for Engine. Plugins and Dashboards can define their own rhythm). These are my thoughts, so I'd love to hear yours:

Branches

Between now and General Availability (GA), let's have two branches: 1.x and Main. After GA, we can have three branches: 1.0, 1.x, and Main.

Some definitions:

Main is our next major release. This is the location where all merges should take place. It's going to be moving fast and pretty dynamic in there.

1.x is our next minor release. Once something gets merged into main, we may choose to backport it to 1.x.

1.0 is our current release. In between minor releases, only hotfixes (security and otherwise) would get backported to 1.0.

For reference, we're also thinking of having a couple of releases on the way to GA:

Beta-1
Beta-2
Release Candidate
GA (1.0 official)

If things change, we may add or remove releases, but that's how it looks to me right now.

Between now and GA:

We'll have two branches: 1.x and Main.

When a PR is reviewed we'll apply the next major version label (e.g., 2.0) and if accepted, merge it into Main. If the requestor thinks it should be backported and released with 1.0, they should open a separate PR and the reviewer will label it with the 1.x label. Then we'll merge the new PRs to the 1.x branch.

After GA

Once we tag our 1.0, we'll create a separate 1.0 branch. That will be our stable, rarely changing version. Main will still be our next version, and 1.x our next minor release. I would like to get us on a monthly release cycle, but again, we'll see how it goes once we're out in GA.

I'd also like to see nightlies of all three branches (1.0, 1.x and 2.0) so we can rapidly find regressions, but I'll need to hash that out with infra.

Why do it this way?

The benefit of doing it this way is that main can evolve quickly, while we're a little more circumspect about 1.x and 1.0. It'll also make it easier for us to release if everything is clearly tagged (not having clear tagging made this release kind of a potchke).

Release Cadences:

As I mentioned, we learned some stuff from beta-1. The first thing we learned is that 24 hours is not enough time to wrangle everything we need to prepare for the release :) Right now we need at least a week to get everything bundled up and verified together.

As we get better at this, I'm looking forward to tightening up the schedule, but given the constraints, here's how it would look:

2 weeks before the date we're aiming at, we'll send an email to all forum participants, letting them know that in 1 week we'll be tagging our release. Since we don't have mailing lists, this seems like the best communication method we have. But if you've got a better idea, let me know.

1 week before the date, we'll tag the release.

On the zero date, we'll release artifacts to docker.

Right now everything we're doing is manual, so we may need to move timings around as we get better.

Thoughts?

Thanks, /C

rursprung commented 3 years ago

does it make sense to already separate 2.x and 1.x now? it's probably too late (you already have the branches), but unless you know that you're already landing breaking changes for 2.x now i'd have focused on getting 1.0.0 out first and only later moved to the model you've proposed.

when i read your comment i first feared that you'd name the actual releases beta1 but i've now seen that you tagged them as 1.0.0-beta1 which is in line with semver. maybe you could update your comment just to make this clearer? (i presume you have no intention of changing to just beta1 as a tag?)
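
a quick way to see the difference between the two tag shapes is a pattern check. this is a simplified sketch: the pattern below is an assumption made for illustration, not the official expression from semver.org, which also covers build metadata and forbids leading zeros.

```python
import re

# Simplified: MAJOR.MINOR.PATCH with an optional dot-separated -prerelease suffix.
SEMVER_TAG = re.compile(r"\d+\.\d+\.\d+(-[0-9A-Za-z-]+(\.[0-9A-Za-z-]+)*)?")


def looks_like_semver(tag: str) -> bool:
    """Check whether the whole tag matches the simplified pattern above."""
    return SEMVER_TAG.fullmatch(tag) is not None


print(looks_like_semver("1.0.0-beta1"))  # True
print(looks_like_semver("beta1"))        # False
```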

for branch naming, i was wondering if release/* (borrowed from git flow - though the whole develop thing doesn't really apply here since that only works for devops setups) would make more sense, it'd give the branches a clearer name. so once v1.0.0 is out there'll be a commit on main which sets this version number (i presume the version is hard-coded somewhere? unless you always extract it from git? didn't check the code, sorry) and is tagged accordingly and there's a branch release/1.0.x off of that commit. if/when work needs to be done which has to be published as 1.0.1 this can be done on that branch. and if main is already working on 2.0.0 then there's a release/1.x branch in which 1.1.0 & later are being developed. though this only makes sense if you know that there'll be additional stuff on the minor/patch branches. otherwise i generally prefer to leave out the patch-branch (e.g. release/1.0.x) and instead just have the minor branch (e.g. release/1.x) and decide at release-time whether this becomes 1.1.0 or 1.0.1. this avoids additional effort for cherry-picking. the nice thing with git is that you can still create branches later.

very important: looking at your branches, you have two feature branches on the upstream repo, one of which already seems to be merged. i'd propose to get rid of those branches - feature branches should never be on the upstream repo. upstream should be canonical, i.e. only contain actual releases (+ develop + master/main).

the more branches you have the harder it is to keep track of what has been cherry-picked where. i try to have as few branches as possible (see e.g. above on avoiding unnecessary patch-branches), but this isn't a cure-all solution. some kind of overview and/or tooling is probably also needed to keep track if there's a big-enough inflow of new commits happening. i'm not sure what kind of tooling exists here for github, but maybe somebody has some experience here?

regarding the communication via email: i'd suggest to instead use announcement posts in the forum. people can then decide on how they want to configure their notification settings in the forum (whether they want to receive notifications via email or not). this gives them an option to opt-out of the emails. otherwise you'd also spam people which might have once been active in the forum but have dropped out long ago and are no longer interested. (and with GDPR & co. it's questionable whether you'd officially have the consent to just send them an email - with no opt-out option - about an opensearch release if they only signed up in a forum about opendistro years ago)

nknize commented 3 years ago

i was wondering if release/* (borrowed from git flow - though the whole develop thing doesn't really apply here since that only works for devops setups) would make more sense,

I think the motivation is to minimize branches (because more branches can put quite a demand on CI). In this configuration I don't think any branch will be anything other than "release/" (although that strategy might be good to use for the tags)? We also would use tags, not branches, for patch releases. So there wouldn't ever be a 1.0.1 branch... 1.0 would be the next 1.0.1 release and 1.x would be the next 1.1 release. In this manner, here's what the future would look like:

main == 3.0.0 alpha
2.x == 2.4.0 alpha
2.3 == 2.3.5 (tags for all the rest)
2.2 == 2.2.8 (tags for all the rest)
2.1 == 2.1.4 (tags for all the rest)
2.0 == 2.0.6 (tags for all the rest)
1.x == 1.5.0 alpha
1.4 == 1.4.6 (tags for all the rest)
1.3 == 1.3.4 (tags for all the rest)
1.2 == 1.2.5 (tags for all the rest)
1.1 == 1.1.6 (tags for all the rest)
1.0 == 1.0.4 (tags for all the rest)
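
One way to read that listing is that each branch name determines the version it is working toward, given the tags already released. A sketch under that reading; the function name and the tag list are hypothetical, and tags are assumed to be plain MAJOR.MINOR.PATCH strings:

```python
def next_release(branch: str, released: list[str]) -> str:
    """Work out which version a branch is building toward, given released tags."""
    versions = [tuple(int(p) for p in tag.split(".")) for tag in released]
    if branch == "main":
        # main carries the next major
        return f"{max((v[0] for v in versions), default=0) + 1}.0.0"
    major_text, minor_text = branch.split(".")
    major = int(major_text)
    if minor_text == "x":
        # N.x carries the next minor of major N
        minors = [v[1] for v in versions if v[0] == major]
        return f"{major}.{max(minors, default=-1) + 1}.0"
    # N.M carries the next patch of N.M; patch releases are tags, not branches
    minor = int(minor_text)
    patches = [v[2] for v in versions if v[:2] == (major, minor)]
    return f"{major}.{minor}.{max(patches, default=-1) + 1}"


tags = ["1.0.0", "2.0.0", "2.3.0", "2.3.4"]  # hypothetical released tags
for name in ("main", "2.x", "2.3", "1.x"):
    print(name, "==", next_release(name, tags))
```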

i'd propose to get rid of those branches - feature branches should never be on the upstream repo.

I agree the merged branches should be deleted. I do think, though, that some big, sweeping feature branches that the community is collaborating on might be worth keeping in the upstream repo. Say the community wants an enhancement to switch to a pluggable translog framework. This would be a big shift in the architecture and might benefit more from full transparency than from collaborating in some development fork somewhere. What do you think?

the more branches you have the harder it is to keep track of what has been cherry-picked where.

100%. This is why deprecation / tagging is going to be super important. We need to collectively decide on a deprecation strategy because that implementation has been inherited in this codebase. I personally like a compatibility module approach like Lucene takes, but that's just me. The predecessor codebase does not take this approach and, instead, supports compatibility by littering if (indexCreatedVersion.onOrBefore(Version.V_1_2_0)) checks all throughout the codebase. In this example, after 3.0 is released this logic block would be removed if the feature was deprecated in 2.0. This is very difficult to keep track of, and gives a hard and fast rule to end-of-life deprecated features that the community may not want to carry forward.

regarding the communication via email: i'd suggest to instead use announcement posts in the forum.

Thankfully the discourse forum supports posting from email; so I think we'll be able to support announcements to multiple places from one email message.

camerski commented 3 years ago

"I think the motivation is to minimize branches (because more branches can put quite a demand on CI)"

I'm not sure we need to worry too much about this. The CI load is measured in terms of number of commits, regardless of which branch they are on. Backporting the same commit to multiple branches could be an issue, but I would expect this to be rare...only critical bugfixes (e.g. security flaws) are likely to be backported.

I think a model that discourages backports (apart from critical fixes) and specifies life cycles for release series (e.g. "in development", "live", "maintenance mode" and "end-of-life") would address the cherry-picking problem. Critical fixes are backported to "live" and "maintenance" release series; EOL series are not touched. To extend your example above, we could choose a current + last minor version support model:

main (In development) == 3.0.0 alpha
2.x (In development) == 2.4.0 alpha
2.3 (Live) == 2.3.5 (tags for all the rest)
2.2 (Maintenance) == 2.2.8 (tags for all the rest)
2.1 (EOL) == 2.1.4 (tags for all the rest)
2.0 (EOL) == 2.0.6 (tags for all the rest)
1.x (In development) == 1.5.0 alpha
1.4 (Live) == 1.4.6 (tags for all the rest)
1.3 (Maintenance) == 1.3.4 (tags for all the rest)
1.2 (EOL) == 1.2.5 (tags for all the rest)
1.1 (EOL) == 1.1.6 (tags for all the rest)
1.0 (EOL) == 1.0.4 (tags for all the rest)
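
The labels for released series in that listing follow mechanically from the current + last minor rule, which can be sketched as below. The function name is hypothetical, and the "In development" branches (main, N.x) are left out because they are not released series.

```python
def lifecycle(released_minors: list[str]) -> dict[str, str]:
    """Label each released MAJOR.MINOR series under a current + last minor model."""
    by_major: dict[int, list[int]] = {}
    for series in released_minors:
        major, minor = (int(p) for p in series.split("."))
        by_major.setdefault(major, []).append(minor)

    status: dict[str, str] = {}
    for major, minors in by_major.items():
        # Newest minor first: rank 0 is Live, rank 1 is Maintenance, the rest are EOL.
        for rank, minor in enumerate(sorted(set(minors), reverse=True)):
            label = "Live" if rank == 0 else "Maintenance" if rank == 1 else "EOL"
            status[f"{major}.{minor}"] = label
    return status


print(lifecycle(["2.0", "2.1", "2.2", "2.3"]))
```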

Other models may also apply. Perhaps we want to have an LTS model for some release series?

dblock commented 3 years ago

Where do we want to document this? RELEASING.md?

dblock commented 3 years ago

I took a stab at the beginning of documenting the release process in https://github.com/opensearch-project/OpenSearch/pull/853. I am thinking that because we want this across all projects in opensearch-project the meat of the content would go into the .github repo. Specifics to this repo can be added to the .md files here. I have not written up cadence and versioning (yet), we possibly want that in project-website.

peternied commented 3 years ago

[...]

Some definitions:

Main is our next major release. This is the location where all merges should take place. It's going to be moving fast and pretty dynamic in there.

1.x is our next minor release. Once something gets merged into main, we may choose to backport it to 1.x.

1.0 is our current release. In between minor releases, only hotfixes (security and otherwise) would get backported to 1.0. [...]

/C

@CEHENKLE @nknize Did you have thoughts on how we are going to make/document the backporting decisions? Seems like we had a near miss on backporting #809

Mulling it over: what if we had a routine way to collect all the PRs/Issues that arrived after cutting a release branch, and then annotated them with something like backport-pending, backport-complete, or no-backport tags? It seems like the release maintainers would be responsible for this triaging, which might happen alongside the existing triage process.
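
A rough sketch of what that collection step could look like once the PR data is in hand. Everything here is hypothetical: the record shape, the PR numbers, and the extra "untriaged" bucket for PRs that carry none of the three labels. Fetching the PRs from GitHub is out of scope for the sketch.

```python
from collections import defaultdict

BACKPORT_LABELS = ("backport-pending", "backport-complete", "no-backport")


def triage(pull_requests: list[dict]) -> dict[str, list[int]]:
    """Group PR numbers by backport label; unlabeled PRs land under 'untriaged'."""
    groups: dict[str, list[int]] = defaultdict(list)
    for pr in pull_requests:
        matching = [label for label in pr["labels"] if label in BACKPORT_LABELS]
        groups[matching[0] if matching else "untriaged"].append(pr["number"])
    return dict(groups)


# Hypothetical PRs merged after the release branch was cut.
merged_after_cut = [
    {"number": 101, "labels": ["bug", "backport-pending"]},
    {"number": 102, "labels": ["no-backport"]},
    {"number": 103, "labels": ["enhancement"]},
]
print(triage(merged_after_cut))
```

The "untriaged" bucket is the list a release maintainer would still need to look at.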

dblock commented 3 years ago

This ticket probably belongs in https://github.com/opensearch-project/opensearch-build at this point and relates to https://github.com/opensearch-project/opensearch-build/issues/87 and https://github.com/opensearch-project/opensearch-build/issues/73. @CEHENKLE wdyt about moving it?

dblock commented 3 years ago

I am going to close this. Beyond https://github.com/opensearch-project/OpenSearch/pull/853, the release train has become an issue template to follow in https://github.com/opensearch-project/opensearch-build/pull/531 and is documented in https://github.com/opensearch-project/OpenSearch-build#making-a-release. We can iterate on that, and keep writing down and improving any of the aspects asked here via pull requests.
