[Campaign] Ensure 1.x and 2.x branches (main should be 3.0)

dblock commented 2 years ago

What kind of business use case are you trying to solve? What are your requirements?

Currently plugins follow a branching strategy where they work on main for the next development iteration, effectively working on 2 versions at the same time. For example, at the time of writing plugins are working on 2.0 (main) and 1.3.2 (1.3). In contrast, OpenSearch works on 3 releases at the same time. Currently on 3.0 (main), 2.0 (2.x), and 1.3.2 (1.3).

What is the problem? What is preventing you from meeting the requirements?

There's no OpenSearch 3.0 with all plugins continuously integrating, causing late integration, and bugs discovered during integration instead continuously.
Features and fixes (e.g. CVEs) from main branches may need to be selectively backported into multiple 1.x and 2.x releases. Without an intermediate 1.x or 2.x branch there are as many merge conflicts as branches.
Without a 2.x branch, developers don't have an easy way to express "backport to any next 2.x release" and must be aware of the state of the next 2.x (e.g. does it accept new features?).
Workflows are different between OpenSearch core and plugins, which is confusing.
Authors of breaking changes in OpenSearch for the next major release don't have a place to fix downstream plugins, even if they wanted to.
https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching is incorrect.

What are you proposing? What do you suggest we do to solve the problem or improve the existing situation?

Follow OpenSearch core branching, described in https://github.com/opensearch-project/opensearch-plugins/issues/35. Create 1.x and 2.x branches, do not create 2.0 as a branch of main, instead create main -> 2.x -> 2.0. Maintain working CI for 3 releases at any given time.

Exit Criteria

[x] main is building 3.0
[x] 2.x is building next 2.x, which is currently 2.4.0
[x] 2.3 is building next 2.3, which is 2.3.1
[x] manifests in https://github.com/opensearch-project/opensearch-build/tree/main/manifests include all 3

peternied commented 2 years ago

@dblock One day ago security choose to delete its 2.x branches because there is no active 3.x feature development, creating a 2.x when the need arises.

The aim of deleting this branch was to was to

Control when 3.x breaking changes are scheduled to be worked on
Reduced the amount of backporting management, considerations such questioning did a change go to the correct branches?
Reduced the number of pull request branch targets to check for in a pull request to one location

While 3.0 will be released in the future and its important, these 'papercuts' impact a smaller project. If we had better mechanisms or potentially a better starting posture we would more quickly adopt the 2.x branch

dblock commented 2 years ago

@peternied Do you not feel that having a working 3.0 build of the distribution is a valuable goal?

peternied commented 2 years ago

@dblock The maintainers of security had a discussion yesterday where this topic was raised. While there are pain points as mentioned, ultimately we are in agreement we should move in this direction. By running through the process and connecting data we can generate more topics for the retro(s) if we have recommendations or the process isn't tenable.

cliu123 commented 2 years ago

I like the idea of building 3 different versions in parallel to let OpenSearch breaking changes fail plugins early. Security plugin tried building OpenSearch 3.0 on main branch. But yesterday some breaking changes in OpenSearch core 3.0 breaks the BWC, so the security plugin main branch CI failed and blocked the release 2.1.0. With building OpenSearch 3.0 on plugin main branch, all PRs in plugins repos need to be merged to main first -> 2.x branch-> 2.1 branch. Breaking changes in OpenSearch 3.0 makes plugin main branch fragile and potentially block all PRs until failures are resolved. So security plugin had to downgrade from 3.0 to 2.1 to unblock the main branch CI and 2.1.0 release. We need a better mechanism to prevent this moving forward.

navneet1v commented 2 years ago

@dblock One day ago security choose to delete its 2.x branches because there is no active 3.x feature development, creating a 2.x when the need arises.

The aim of deleting this branch was to was to

Control when 3.x breaking changes are scheduled to be worked on

Reduced the amount of backporting management, considerations such questioning did a change go to the correct branches?

Reduced the number of pull request branch targets to check for in a pull request to one location

While 3.0 will be released in the future and its important, these 'papercuts' impact a smaller project. If we had better mechanisms or potentially a better starting posture we would more quickly adopt the 2.x branch

For K-NN and geo-spatial we also moved to the approach where we will cut 3.0 branches only when we know that there are breaking changes being introduced. This keeps the back-porting to minimum and requires less maintenance.

@dblock the idea suggested in this proposal aligns with the OpenSearch Core but will that be an overkill for plugins that don't have regular changes for different versions?

Also, at what level we are confident that this is the right strategy going forward?

dblock commented 2 years ago

@dblock the idea suggested in this proposal aligns with the OpenSearch Core but will that be an overkill for plugins that don't have regular changes for different versions?

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

dblock commented 2 years ago

For K-NN and geo-spatial we also moved to the approach where we will cut 3.0 branches only when we know that there are breaking changes being introduced. This keeps the back-porting to minimum and requires less maintenance.

Core introduces breaking changes into main (now 3.0) all the time. What branch of k-nn should it include in the 3.0 daily build?

navneet1v commented 2 years ago

@dblock the idea suggested in this proposal aligns with the OpenSearch Core but will that be an overkill for plugins that don't have regular changes for different versions?

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

The idea was simple, before the time of release cut the branch, as there were no changes required the cutting a branch will be simple.

The problem as our team saw with multiple branches was back-porting only and second it brings the plugins and core 1:1 from a branching standpoint which has its own pros and cons, apart from that I am very much aligned with proposal is being provided.

navneet1v commented 2 years ago

For K-NN and geo-spatial we also moved to the approach where we will cut 3.0 branches only when we know that there are breaking changes being introduced. This keeps the back-porting to minimum and requires less maintenance.

Core introduces breaking changes into main (now 3.0) all the time. What branch of k-nn should it include in the 3.0 daily build?

The main branch of K-NN and if that breaks some how then reaching out to the maintainers for fixing it and fix could be cutting a specific branch. The idea is just to delay when we need to cut a branch, for plugins with not a lot of changes.

Even with branches in K-NN we still need to go ahead and fix the breaking changes. It mainly about back-porting a change from main to 2.x branches with every PR.

downsrob commented 2 years ago

The main pain point which I have experienced with breaking changes is that a breaking change will be merged into 3.0 on core, and then the plugins will run CI later on during some PR and experience the breaking change. It can be a time consuming process to track down the breaking change in core versus other dependencies and then longer to find out how to fix this breaking change in the plugins. Additionally, we have seen breaking changes go in, plugins make changes, then core modifies the breaking change and then plugins need to revert. On backporting, I am not hugely concerned, though it is more burden. For example with the non inclusive language breaking changes. When I raise a PR for 3.0 using anything related to master then I need to modify my PR for backports. Not a big deal for the non inclusive language changes but there are breaking changes which could be more burdensome, and if I am trying to get a fix into a plugin repo on code freeze date then I don't want a small change to take a day of backporting or have CI on the main branch be compromised because of some tangent breaking change which blocks my PR from getting merged and backported.

Something to help with the slow feedback loop to help plugins quickly know what kind of changes to make is if core could have another github action which builds all of the plugins. Then we can open a github issue for failing plugins explaining the breaking change and required changes. Running all tests is flaky and slow so we can skip that for now, but the build alone can catch many issues.

dblock commented 2 years ago

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

The idea was simple, before the time of release cut the branch, as there were no changes required the cutting a branch will be simple.

That means having to catch up with a lot of changes in core "in the end" or "last minute". You're only looking at it from the perspective of a plugin (e.g. index management). Right now there are tons of changes on core 3.0 and there's no index-management 3.0 AFAIK. You should have incremented main of index-management to 3.0 a long, long time ago in sync with core 3.0. So, which branch of index-management should be consumed right now by the 3.0 distribution?

dblock commented 2 years ago

The main branch of K-NN and if that breaks some how then reaching out to the maintainers for fixing it and fix could be cutting a specific branch. The idea is just to delay when we need to cut a branch, for plugins with not a lot of changes.

This is backwards. Core cannot "communicate" downstream about how it's going to possibly break its dependencies all the time, it can only try via campaigns for planned changes. If you adopted the same branching as core and did 3.0 development on main rn, you would be building against 3.0 core all the time, constantly fixing any breakage caused by core breaking changes, and backporting to 2.x ensuring changes work on the next 2.x.

navneet1v commented 2 years ago

Let's say there are no changes in the plugin ever at all. How do you propose we build a 3.0, 2.2 and 2.1 distribution that includes the plugin today?

The idea was simple, before the time of release cut the branch, as there were no changes required the cutting a branch will be simple.

That means having to catch up with a lot of changes in core "in the end" or "last minute". You're only looking at it from the perspective of a plugin (e.g. index management). Right now there are tons of changes on core 3.0 and there's no index-management 3.0 AFAIK. You should have incremented main of index-management to 3.0 a long, long time ago in sync with core 3.0. So, which branch of index-management should be consumed right now by the 3.0 distribution?

Yes, I looked from a perspective of a plugin owner who wants to maintain the plugin with minimal effort or at-least move towards doing minimal upgrades. Keep fixing the plugins as when core introduce a breaking change thats too much expectation from plugins standpoint. I would really like to see how this churn can be reduced.

Nevertheless, I am not against a common branching strategy, but I feel there are many sharp edges to this too and we should try to find those first and see how we can reduce it.

Suggestion: At the time when Core creates a new 3.0 branch why cannot we create version file as we have for lucene which can provide which branch of plugins to use and let plugin owner provide the exact branch name to use there. This will let them decide how they want to develop. Pros:

This will solve the problem keeping branch names consistent and allow plugin owners to keep whatever names they want.
When build breaks due to breaking changes cut automated issues on the repos.

dblock commented 2 years ago

The main pain point which I have experienced with breaking changes is that a breaking change will be merged into 3.0 on core, and then the plugins will run CI later on during some PR and experience the breaking change.

I think we may be able to do something about it, please add your comments to https://github.com/opensearch-project/OpenSearch/issues/3740. That said, breaking changes are breaking changes, and they have a "cost" and will require "work" either way.

dblock commented 2 years ago

@navneet1v Yes! That's what we want. And this is exactly what's asked in this proposal.

First note that plugins must have versions that match core. This is a limitation of the product. Plugin 2.1 cannot start on core 3.0. So at a minimum plugins need to provide a branch where the version is 3.0. The distribution builds everything and runs integration, bcw and performance tests, so the software needs to work.

We do what you suggest with manifests today (read this). When the first 2.0 release start development, core does it on a 2.x branch, and increments main to 3.0. A manifest is created for 3.0 that now includes core 3.0 from main. Plugins are then asked to do the same and add themselves to the manifest pointing to a branch that has 3.0. When the build breaks, plugin tickets are cut.

Instead, today, plugins are saying "we want to reduce backports, so we plan to increment to 3.0 when the last 2.x version, e.g. 2.5 ships, i.e. in 6 months to a year". We're not going to work on 3.0 at all, and not increment main to 3.0 at the same time as core. And the branching strategy proposed doesn't need to match because we don't work on 3.0 anyway, we just need main = 2.0 right now, and tomorrow main = 2.1 and we cut 2.0.

prudhvigodithi commented 2 years ago

Suggestion: At the time when Core creates a new 3.0 branch why cannot we create version file as we have for lucene which can provide which branch of plugins to use and let plugin owner provide the exact branch name to use there. This will let them decide how they want to develop. Pros:

This will solve the problem keeping branch names consistent and allow plugin owners to keep whatever names they want.

When build breaks due to breaking changes cut automated issues on the repos.

Hey @navneet1v two problems I see with this: One, the distribution builds cannot run automatically until the plugin owners feed in the details of the branch for a specific release and to automate, it might not be the same branch pattern for next release. Two, the version increment is still manual here, which right now takes lot of human effort to raise a version increment PR's across multiple plugin repos (Related GH issue: https://github.com/opensearch-project/opensearch-build/issues/1375), since the plugin owner provide the exact branch name to use for each release its hard to the automation the workflows.

peternied commented 2 years ago

since the plugin owner provide the exact branch name to use for each release its hard to the automation the workflows

We have branching guidance published, can we can we automate workflows and raise issues with repositories that cannot be supported?

tianleh commented 2 years ago

Shall https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching be deprecated since it is for 2 releases? @prudhvigodithi

prudhvigodithi commented 2 years ago

Shall https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching be deprecated since it is for 2 releases? @prudhvigodithi

Hey @tianleh the idea is to incline with Core and have 3 branches at given time, main -> 2.x -> 2.0. Maintain working CI for 3 releases at any given time., @dblock can you add you thoughts on the document please?

prudhvigodithi commented 2 years ago

We have branching guidance published, can we can we automate workflows and raise issues with repositories that cannot be supported?

Hey @peternied can you add some more details please?

peternied commented 2 years ago

It was mentioned that automation is difficult because the branch naming patterns are not consistent. Rather than having some branches name out of standard to block automation efforts, instead make aligning to the standard patterns of value for processes like automated version incrementing. Then you can file bugs on the plugins/repositories that need to update the naming standard to get the value from the automating version incrementing work.

dblock commented 2 years ago

Shall https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching be deprecated since it is for 2 releases? @prudhvigodithi

Hey @tianleh the idea is to incline with Core and have 3 branches at given time, main -> 2.x -> 2.0. Maintain working CI for 3 releases at any given time., @dblock can you add you thoughts on the document please?

That doc needs to be updated to say that plugins align on core.

amitgalitz commented 2 years ago

Moving main branch on plugins to point to 3.0.0-SNAPSHOT makes sense to me however I had a few questions on the standard for development with this setup:

If main is pointing to 3.0 and we are developing features for 2.2 for example that might use methods that have name changes in 3.0 or utilizes dependencies that have changed in 3.0.

Do we develop this feature from the 2.x branch and merge into 2.x only? What is the standard then for bringing that feature to main? A separate PR that addresses the breaking changes + new feature or we forward port and then address potential breaking changes there?

Or should the standard be to develop to main and for every feature that we develop for a 2.x release we should already be mindful that it should pass the 3.0 CI which sometimes isn't possible on things like core name changes in 3.x vs 2.x?

peternied commented 2 years ago

@prudhvigodithi Thanks for filing issue to track this - could you do a pass on them to mark them as untriaged to make sure they get the attention of the maintainers?

prudhvigodithi commented 2 years ago

Sure @peternied doing that now. Thank you

prudhvigodithi commented 2 years ago

Hey @amitgalitz, If a specific PR is an enhancement that can be used across system, the PR should be merged to main and then backported to desired .x branch. If a specific PR is limited to only specific version, then it’s allowed to directly merge to respective .x branch and backport to its own release branches, example if you have merged a feature to 2.x from 2.x backport to 2.1 and 2.2. Does this answer your question ?

@dblock

amitgalitz commented 2 years ago

Hey @amitgalitz, If a specific PR is an enhancement that can be used across system, the PR should be merged to main and then backported to desired .x branch. If a specific PR is limited to only specific version, then it’s allowed to directly merge to respective .x branch and backport to its own release branches, example if you have merged a feature to 2.x from 2.x backport to 2.1 and 2.2. Does this answer your question ?

@dblock

That makes a little more sense but for example if the only reason the feature isn't across the system cause it needs some code changes to accommadate for breaking changes in next major version. So its now merged in 2.x. Should there just be a new PR to main with the feature but accommodating for the breaking changes?

prudhvigodithi commented 2 years ago

That makes a little more sense but for example if the only reason the feature isn't across the system cause it needs some code changes to accommadate for breaking changes in next major version. So its now merged in 2.x. Should there just be a new PR to main with the feature but accommodating for the breaking changes?

Hey @amitgalitz breaking changes are always for the upcoming major versions, in this case main would be the upcoming major version, anything that is merged to 2.x should be for next minor version and ideally it should not be a breaking change. As per me in this case, a PR is raised to just main and not backported :)

dblock commented 2 years ago

@amitgalitz It seems like one of your biggest concerns is that you're more actively working on 2.x while your main is 3.x. I don't see a way to have it both ways: you have to either code in 2.x and merge back to main or the other way around. Hopefully it's a clean merge either way if the feature is independent.

amitgalitz commented 2 years ago

@amitgalitz It seems like one of your biggest concerns is that you're more actively working on 2.x while your main is 3.x. I don't see a way to have it both ways: you have to either code in 2.x and merge back to main or the other way around. Hopefully it's a clean merge either way if the feature is independent.

I guess that's fair, I guess we will decide as a team which way we do it and potentially add fixes in case we need as we merge from 2.x to main or vice versa.

navneet1v commented 2 years ago

@amitgalitz It seems like one of your biggest concerns is that you're more actively working on 2.x while your main is 3.x. I don't see a way to have it both ways: you have to either code in 2.x and merge back to main or the other way around. Hopefully it's a clean merge either way if the feature is independent.

Hi, If we start to work on main which points to 3.0 we will have snapshot available for OpenSearch for 3.0? I am hoping the ans to this question is yes. Because if is not present then we cannot work on main which points to 3.0 version of OpenSearch.

prudhvigodithi commented 2 years ago

@navneet1v this should cover as part of daily builds https://artifacts.opensearch.org/snapshots/core/opensearch/3.0.0-SNAPSHOT/opensearch-min-3.0.0-SNAPSHOT-linux-arm64-latest.tar.gz https://artifacts.opensearch.org/snapshots/core/opensearch/3.0.0-SNAPSHOT/opensearch-min-3.0.0-SNAPSHOT-linux-x64-latest.tar.gz is this your concern ?

navneet1v commented 2 years ago

@navneet1v this should cover as part of daily builds https://artifacts.opensearch.org/snapshots/core/opensearch/3.0.0-SNAPSHOT/opensearch-min-3.0.0-SNAPSHOT-linux-arm64-latest.tar.gz https://artifacts.opensearch.org/snapshots/core/opensearch/3.0.0-SNAPSHOT/opensearch-min-3.0.0-SNAPSHOT-linux-x64-latest.tar.gz is this your concern ?

Great as long as we have builds then there is no concern. Thanks for providing this info.

joshuali925 commented 2 years ago

Hi @dblock, I agree with current proposal, it solves the mentioned issues, but meanwhile it brings a few pain points for plugins development for us (SQL team). I would like to propose an alternative solution that solves the same issues but is easier for us to follow.

Currently the next major release is 3.0.0 and the next minor release is 2.3.0, I'll use them as examples.

Proposal

Visible to infra

We will provide a 3.0 next major branch and 2.3 next minor release branches for running integrations and future releases. But other than that, each team should have the ability to decide what branching strategy works the best, because development strategy can vary by each team and plugin.

Visible within our team

main branch: a development branch on latest released major version, most of PRs go here
3.0 branch: integration branch with core, contains any SQL breaking changes (very rare)
2.2 branch: latest release branch, bug fixes will be backported here for next patch release
2.3 branch: optional until preparation of 2.3.0 release starts, features planned for 2.3 will go here
2.4 branch: optional until preparation of 2.4.0 release starts, features planned for 2.4 will go here
feature/<name> branch: for managing collaborative work on large features but avoids interfering other developers and core breaking changes
...

Note that only 3.0, 2.3, and historically released branches are visible to infra and core for integration builds, other branches are for development purposes only.

The current proposal tightly couples branching strategy for all plugins with core, which creates some downsides for us:

Plugin pain points with the current proposal

We do not always have features that will go into next major but not minor release, so enforcing core strategy to plugins forces us to maintain unnecessary branches. Doing so has a few side effects:

Assume no plugin breaking changes. To maintain 2.x, developers need to make sure for every PR to main, a backport PR to 2.x is reviewed, approved, and merged. This is because main and 2.x would be the same besides one is on 3.0 and one is on 2.3. This doubles the work to merge a PR for contributor and reviewers.
If breaking changes from core affect the PR to main, the backport PR to 2.x will fail, and contributor will need to adapt the change for 2.x. This doubles the work of creating PRs.
Core can introduce arbitrary number of breaking changes in main before 3.0 release is ready. Having continuous integration on our development branch with an unstable upstream branch will cause troubleshooting and developing overhead.
- An example would be when Observability was upgrading to 2.0, core first removed the API to create index using YAML as mapping source, then added it back before 2.0.0 release. With current proposal, Observability will first investigate an alternative for creating the index, then find out that work is not necessary.
There are no straight forward ways to plan a feature for a future minor release (e.g. 2.6).
- An example would be SQL recently released relevancy based search, but the PRs were distributed across 2.1 and 2.2, thus can be confusing for users. The development started when latest release was 2.0. We could not plan for 2.2 at that time because 2.1 (which will be cut from 2.x) haven't released (unless we remember to only backport to 2.2/2.x after 2.1 release, which is additional overhead).
- With the new proposal, we can have a 2.6 branch beforehand or isolate the change with a feature branch to solve this issue.

Does the new proposal address existing issues

Copying from issue description, I'll comment on each point raised up.

There's no OpenSearch 3.0 with all plugins continuously integrating, causing late integration, and bugs discovered during integration instead continuously.

This is solved by the new proposal with 3.0 branch. Features and fixes (e.g. CVEs) from main branches may need to be selectively backported into multiple 1.x and 2.x releases. Without an intermediate 1.x or 2.x branch there are as many merge conflicts as branches.

I did not fully understand this one. Using 1.x as example, the release branches are 1.3, 1.2, 1.1, 1.0. If there is a CVE fix, how does backporting to 1.x help patch releases of 1.3, 1.2, 1.1, 1.0?

If you are referring to resolve conflicts in 1.x then backport to release branches, that's also true if you resolve conflicts in 1.3 then backport, which additionally makes it one PR less. I would actually prefer using existing tools like git rerere to do this. Without a 2.x branch, developers don't have an easy way to express "backport to any next 2.x release" and must be aware of the state of the next 2.x (e.g. does it accept new features?).

With the new proposal, we can target a future minor branch that accepts new features and have it created beforehand, or isolate the change with a feature branch. Workflows are different between OpenSearch core and plugins, which is confusing.

While it's true that workflow is different. Confusing is subjective because having external contributors to work on main, which targets an unstable upstream version that can break, can also be confusing. Authors of breaking changes in OpenSearch for the next major release don't have a place to fix downstream plugins, even if they wanted to.

This is solved by the new proposal with 3.0 branch. https://github.com/opensearch-project/.github/blob/main/RELEASING.md#plugin-branching is incorrect.

We can update it.

dblock commented 2 years ago

Thanks @joshuali925, what you propose would work for the problems I outlined above from the point of view of the sql developers (you). However, you make assumptions that I disagree with.

is easier for us to follow creates some downsides for us

Prioritizing custom needs of a team of developers that work on 1/10th of the product (SQL) at the expense of another team (infra), is not the goal of this open-source project.

each team should have the ability to decide what branching strategy works the best, because development strategy can vary by each team and plugin

Disagree. This comes at the expense of another team. Custom development strategy makes it more difficult to ship 1 product called OpenSearch, which includes a set of components, one of which is SQL.

Core can introduce arbitrary number of breaking changes in main before 3.0 release is ready. Having continuous integration on our development branch with an unstable upstream branch will cause troubleshooting and developing overhead.

This assumes the same product as ODFE where you got a stable core first from ES, then had to build plugins on top of it. This assumption is incorrect, as both develop on the same schedule.

I agree that the team behind the distribution build (@bbarani's) can work around every one of your customizations. But they have bigger priorities (e.g. Windows distribution), so you'll have to start by contributing work in the build tooling if you want to be more convincing. Try fixing the newly added auto-version increments process in https://github.com/opensearch-project/opensearch-build/issues/1375? Make some PRs?

penghuo commented 2 years ago

Hi @dblock

Disagree. This comes at the expense of another team. Custom development strategy makes it more difficult to ship 1 product called OpenSearch, which includes a set of components, one of which is SQL.

The only difference in @joshuali925's proposal is use 3.0 branch instead of main for next 3.0 release. It is required a manifest as today, right? Core team and plugin team could have continuous integration then.

Plugin pain points with the current proposal

What do you think the plugin pain points? Does it make sense?

joshuali925 commented 2 years ago

@dblock Thanks for taking a look and replying. I also want to emphasize that the new proposal is essentially "using explicit version branch instead of main branch for next major version integration", so from infra's perspective the change should be trivial.

Prioritizing custom needs of a team of developers that work on 1/10th of the product (SQL) at the expense of another team (infra), is not the goal of this open-source project.

The only custom need I see would be that manifest will target to 3.0 and future major release branches instead of always main. If this is causing issues for infra I can definitely create PRs to handle/automate it. If you see any additional work the new proposal will cause for infra team, please let me know.

While I can only represent myself as a developer in the SQL team, I do see similar concerns in previous conversations and I believe the pain points will exist for other plugins as well. Let me know if I misunderstood something, thanks.

dblock commented 2 years ago

from infra's perspective the change should be trivial.

This is just not the case. We've repeatedly heard from @bbarani's team how it is not trivial. For example, https://github.com/opensearch-project/opensearch-plugins/issues/142#issuecomment-1171645148 says that automation is difficult when repos follow different branching standards. As an example, I offered https://github.com/opensearch-project/opensearch-build/issues/1375 - please automate the version increments on the sql repo following your proposed branching strategy and we can evaluate the trivial nature of that change.

joshuali925 commented 2 years ago

please automate the version increments on the sql repo following your proposed branching strategy

Please see here

workflow file: https://github.com/joshuali925/sql/blob/cccf0cbfab9fb344224de218902a9635c8c6e716/.github/workflows/increment-plugin-versions.yml
workflow run: https://github.com/joshuali925/sql/actions/runs/2885478526
PR created: https://github.com/joshuali925/sql/pull/1

This run used the version in main branch of core (next major release version, which is 3.0), created a 3.0 branch in my forked SQL repo from main, then sent a PR to bump code in that branch to 3.0.0.

Given that there is already a workflow to bump non-main branches, I would think it is trivial to remove main from existing workflow and append this version to create a separate 3.0 branch, instead of bumping main to 3.0. Note that I only spent ~10 minutes on this, so this workflow is only meant to show my idea and does not mean the workflow is perfect, but please do raise any concerns.

Edit: fix links

dblock commented 2 years ago

@joshuali925 Thanks. My biggest concerns are around who will run this workflow or how it's going to be auto-triggered based on release tags as part of the full distribution process, or how it will triggered from the workflow being developed centrally. I'll park the specifics for after https://github.com/prudhvigodithi/opensearch-build/blob/de56c6ef3ca4953fdb0bc07f81f2b2980c1fd3ec/.github/workflows/increment-plugin-versions.yml is put to production for all the repos that will just work (TM).

joshuali925 commented 2 years ago

@dblock Sure. I would like to point out that the new proposal changes the branch name used during workflow executation. It does not change or affect when should the workflow be triggered, so this concern will exist regardless.

for all the repos that will just work

A side note/question, I asked initially why do we use gradle instead of other generic solution (https://github.com/opensearch-project/dashboards-reports/pull/391#pullrequestreview-1034850386), and the response from @prudhvigodithi was

gradle is better as the solution would be generic across for all repos

But I do see the non-Java repos are left out in increment-plugin-versions.yml, I felt this would be additional work to enforce a Java specific solution to those repos?

Additionally I see gradle paths for mono-repos are hard coded in the workflow, then cd'ed into before running gradlew.

          - {repo: dashboards-reports, path: reports-scheduler}
          - {repo: notifications, path: notifications}
          - {repo: observability, path: opensearch-observability}
...
          if [ ${{ matrix.entry.path }} ]; then
            echo "The gradle path is ${{ matrix.entry.path }}"
            cd ${{ matrix.entry.path }}
          fi
          ./gradlew updateVersion -DnewVersion=${{ env.OPENSEARCH_VERSION }}

In my opinion a better way would be something like cd "$(dirname "$(grep -rl --include 'build.gradle' 'task updateVersion' | head -n 1)")" so there is no need to hardcode a path by each mono-repo (and worry if directories are renamed) as long as the gradle task is the same. Maybe issues like this are also making it harder for infra to manage?

Anyways my point is that I'm definitely willing to help if we are creating additional work or pain points for infra, but we would also like our pain points to be considered. Thanks.

dblock commented 2 years ago

@dblock Sure. I would like to point out that the new proposal changes the branch name used during workflow executation. It does not change or affect when should the workflow be triggered, so this concern will exist regardless.

With matching branches there's only 1 single workflow in opensearch-build for all products, so this problem doesn't exist.

A side note/question, I asked initially why do we use gradle instead of other generic solution (opensearch-project/dashboards-reports#391 (review)), and the response from @prudhvigodithi was

gradle is better as the solution would be generic across for all repos

But I do see the non-Java repos are left out in increment-plugin-versions.yml, I felt this would be additional work to enforce a Java specific solution to those repos?

OpenSearch is all gradle (even security moved to gradle). OpenSearch Dashboards is all npm/javascript-based. I imagine @prudhvigodithi is planning to make a second workflow for those.

Additionally I see gradle paths for mono-repos are hard coded in the workflow, then cd'ed into before running gradlew.

In my opinion a better way would be something like cd "$(dirname "$(grep -rl --include 'build.gradle' 'task updateVersion' | head -n 1)")" so there is no need to hardcode a path by each mono-repo (and worry if directories are renamed) as long as the gradle task is the same. Maybe issues like this are also making it harder for infra to manage?

Yes they do, all the time. Monorepos is another ~pain point~ dead horse being beaten in https://github.com/opensearch-project/opensearch-build/issues/2188.

Anyways my point is that I'm definitely willing to help if we are creating additional work or pain points for infra, but we would also like our pain points to be considered. Thanks.

You can see by the lengthy discussions in this issue, and in the monorepo issue that your pain points are considered and being worked around when possible, as with the path example above.

prudhvigodithi commented 2 years ago

Hey @joshuali925 for dashboard there is discussion already in progress to come up with a solution to have an auto increment, similar to OpenSearch, still a solution is not finalized though.

In my opinion a better way would be something like cd "$(dirname "$(grep -rl --include 'build.gradle' 'task updateVersion' | head -n 1)")" so there is no need to hardcode a path by each mono-repo (and worry if directories are renamed) as long as the gradle task is the same. Maybe issues like this are also making it harder for infra to manage?

There is a problem there there is .gradlew wrapper inside the sub-folder, to triggering the updateVersion should happen using gradlew, instead of adding lot of shell code for the workflow, I have added the matrix way.

Thank you

joshuali925 commented 2 years ago

With matching branches there's only 1 single workflow in opensearch-build for all products, so this problem doesn't exist.

@dblock I didn't fully get this, the current workflow follows matching branches proposal but it is not bumping core. I see there can be a gap between my understanding and how infra actually works, I'll try to understand the issue better and see how to make it easier for infra. Thanks.

already in progress to come up with a solution

@prudhvigodithi My question is for a generic task like version bump, why use two language specific solutions instead of one generic solution? The concern is, if they need to be different, then reports-scheduler should not bump the version of dashboards-reports. Maybe we can continue this discussion there.

there is .gradlew wrapper inside the sub-folder

Project root directory contains both gradlew and build.gradle, so this will work as long as task updateVersion is defined in the top level gradle project (rather than defined in a sub-project's build.gradle, which shouldn't be the case anyways).

The case for updateVersion is an example and is trivial, but for more complex tasks, creating a solution per each mono-repo component can be a lot of extra work. This work may or may not be necessary since I don't have a complete understanding, but if what we are doing is causing issues for infra team, feel free to let me know and I'll help. Thanks.

dblock commented 2 years ago

already in progress to come up with a solution

@prudhvigodithi My question is for a generic task like version bump, why use two language specific solutions instead of one generic solution?

Because OpenSearch and OpenSearch Dashboards are different products and use different tooling to do similar things. Adding a third generic solution to each repo (e.g. Python) creates spaghetti.

dblock commented 1 year ago

Let's close this when main in all components is building 3.0, and 2.x is building next 2.x. I copy-pasted this into all related tickets:

[ ] main is building 3.0
[ ] 2.x is building next 2.x, which is currently 2.4.0
[ ] 2.3 is building next 2.3, which is 2.3.1
[ ] manifests in https://github.com/opensearch-project/opensearch-build/tree/main/manifests include all 3

lezzago commented 1 year ago

The Notification plugin is facing constant blockers to bump the main branch to 3.0 due to its dependencies. The Notification plugin has a hard dependency on the OpenSearch core and Common-utils packages. If they do not build successfully, Notification fails. Additionally, we have security tests we need to pass that is executed by running a docker image with the security plugin. To bring up the docker image, we then need OpenSearch core, Common-utils, Job Scheduler, Security, and Performance-Analyzer packages to build successfully in 3.0.

This means for Notification to have a PR to bump the version to 3.0 in main, we need OpenSearch core, Common-utils, Job Scheduler, Security, and Performance-Analyzer packages to build successfully.

However it is hard to get all these packages to build successfully, in 3.0 as there are consistent breaking changes coming from OpenSearch core and the other packages dont have enough time to fix the issues to finally have a successful build in all of those packages before another breaking change occurs.

I suggest that the build manifest for 3.0 points to a specific commit in the OpenSearch core package and gets updated to the latest commit every 2 weeks. By doing this, it will give the owners of the other packages enough time to fix the issues in their packages without another issue popping up later again.

@dblock @bbarani @CEHENKLE @peterzhuamazon Please let me know your thoughts on the above.

cliu123 commented 1 year ago

Build on 3.0 fails pretty frequently, which blocks development on main branch. It is failing right now. That actually blocks all development when it happens as all the development is done on main and backported to other branches. This actually slows down everyone.

I like @lezzago's idea to point 3.0 to a stable commit in OpenSearch core package to stablize main branch build.

seraphjiang commented 1 year ago

+1 @lezzago 's stable commit

there should be golden build from nightly build we identify as stable build.

bbarani commented 1 year ago

@lezzago I understand your concern but running your CI against the stale core code commit defeats the purpose of the continuous CI. The whole idea of aligning with core branching strategy is to fail fast but I do see it impacting the developer velocity when using a new major branch (3.0 in this case) for development. We are already discussing possible options to reduce the blast radius on this issue and I would recommend you to add your comments there as well.

opensearch-project / opensearch-plugins