Closed saikaranam-amazon closed 2 years ago
:white_check_mark: DCO Check Passed c5ff5261b102d8cf6fe83568daf6ab86711c9cde
:white_check_mark: Gradle Wrapper Validation success c5ff5261b102d8cf6fe83568daf6ab86711c9cde
:white_check_mark: Gradle Precommit success c5ff5261b102d8cf6fe83568daf6ab86711c9cde
start gradle check
:x: Gradle Check failure c5ff5261b102d8cf6fe83568daf6ab86711c9cde Log 504
I've seen these issues before:
* What went wrong:
Execution failed for task ':distribution:bwc:minor:buildBwcLinuxTar'.
> Building 1.0.0 didn't generate expected file /var/CITOOL/workflow/OpenSearch_CI/PR_Checks/Gradle_Check/search/distribution/bwc/minor/build/bwc/checkout-1.0/distribution/archives/linux-tar/build/distributions/opensearch-1.0.0-SNAPSHOT-linux-x64.tar.gz
It usually means the checked out codebase that's being built for bwc testing generated a version artifact that did not match the expected version. In this case, the build expected a 1.0.0 build but got something else (my money is on 1.0.1 since 1.0 branch is now at a staged bugfix release).
I haven't had a chance to circle back on this so I'm speculating that the bwc builder is either looking for (or needs to look for) the release tag to check out and build the appropriate bwc version. We inherited that code and release tags upstream always looked like vM.m.p
until OpenSearch decided to drop the github recommended v
in favor of just the release number M.m.p
. Perhaps the Bwc distribution builder is looking for v1.0.0 tag and not finding it so falling back to the 1.0 branch (which is now a staged 1.0.1 and :boom: ).
@adnapibar is familiar with the BWC builder so perhaps he can take a look?
We should re-add the v
in the tag if this is the case, see https://github.com/opensearch-project/.github/issues/35.
Is this feature a blocker for the 1.1 release or can it hold and bake a while for 1.2?
Is this feature a blocker for the 1.1 release or can it hold and bake a while for 1.2?
This is blocker for 1.1 release
Can you rebase your branch onto https://github.com/opensearch-project/OpenSearch/tree/1.x
This is blocker for 1.1 release
What is it blocking?
Can you rebase your branch onto https://github.com/opensearch-project/OpenSearch/tree/1.x
Done. Pushed the change
What is it blocking?
The changes are needed for the replication plugin releasing with 1.1
The changes are needed for the replication plugin releasing with 1.1
Do you have a merged PR on the replication plugin repo that depends on this logic?
:white_check_mark: Gradle Wrapper Validation success 81f64c4d5271944b354d5698becc87579fdfbc5e
:white_check_mark: DCO Check Passed 81f64c4d5271944b354d5698becc87579fdfbc5e
Do you have a merged PR on the replication plugin repo that depends on this logic?
Yes, changes are merged here: link
:white_check_mark: Gradle Precommit success 81f64c4d5271944b354d5698becc87579fdfbc5e
Yes, changes are merged here:
That change was merged 2 months ago; and it depends on a setting that has not been in OpenSearch core until today? How is this possible? Release is scheduled for less than a week and we're trying to backport a new setting that has no time to bake.
Huge -1 for this.
That change was merged 2 months ago; and it depends on a setting that has not been in OpenSearch core until today? How is this possible? Release is scheduled for less than a week and we're trying to backport a new setting that has no time to bake.
The changes related to this PR on OpenSearch was raised more than 1 month back after thorough testing from our side. As Opensearch team was tracking multiple releases in between 1.1 and 1.0, changes to previous version were prioritized and reviewed before this.
The changes related to this PR on OpenSearch was raised more than 1 month back...
The issue was opened 22 days ago. The PR was opened before the issue. The changes were only recently merged to main, and a backport just now opened (2 hrs ago) against 1.x less than a week before the planned release. This is too late. I'm a fan of the proposed change but this didn't have enough time to bake before the 1.1 release. Huge recommendation to bump to 1.2.
after thorough testing from our side.
Unfortunately that's not the primary core CI; and we're now playing gradle check
roulette in hopes that bwc picks a seed that passes all gates in order to wedge it in time for a release. This isn't a good approach as it only leads to knee jerk bugfix releases (which are not the easiest to do atm).
If we don't want to hold the feature for a 1.2 release then I recommend we hold the 1.1 release for at least a week to let this settle. I like the change, it's just too late to be wedging these in for a 1.1 release in less than a week. I'm going to mark this as a blocker for 1.1 so we have traceability. Let's hold on removing that blocker label until we're comfortable the changes have had time to bake.
start gradle check
@saikaranam-amazon Hi Sai! Can you tell us more about how your testing was done? If we're still seeing changes coming in against 1.1, how have you been able to validate the change against the full set?
Thanks! /C
:white_check_mark: Gradle Check success 81f64c4d5271944b354d5698becc87579fdfbc5e Log 510
Can you tell us more about how your testing was done? If we're still seeing changes coming in against 1.1, how have you been able to validate the change against the full set?
These changes are tested on test clusters from our side. I understand that the 1.x branch is active and seeing changes, but the current changes are contained addressing some of the issues identified during replication plugin testing. These shouldn't affect existing OS 1.x code base.
I think we can continue to proactively work and merge these incremental improvements in main and let them bake while figuring out a best path to release. That's the nice thing about our branch strategy, no reason to perfect everything before merging, and no reason to jump the gun releasing a feature that needs a little more time and due diligence.
I'm wondering if we can leverage some of the OverrideablePlugin
mechanisms that @setiah has been working on and implement these tlog settings through an extension in the CCR plugin? Alternatively we could look at adding a feature flag that only enables this when the flag is set? This would give more guardrails from users unknowingly setting something without fully understanding consequences.
Instead of implementing this IndexSetting in the core and baking it into the default TranslogDeletionPolicy can we refactor this PR to make the TranslogDeletionPolicy extendable (protected methods and variables) and implement the retention lease behavior in something like a CCRTranslogDeletionPolicy extends TranslogDeletionPolicy
within the CCR plugin? I think this would help core not introduce a new IndexSetting that has to be supported long term (along w/ BWC) when we're trying to make translog logic become an extendable feature anyway to support segment level replication.
This extended DeletionPolicy can be produced by the TranslogConfig
which has the IndexSettings available to it. The setting can then be implemented by the CCR plugin which implements a ReplicationEngine anyway (which can provide the custom DeletionPolicy). This totally decouples this logic into the one and only plugin that requires this setting allowing the plugin to implement its own data integrity tests (can you link that test plan here? maybe open a test plan issue in the cross-cluster-replication repo so it lives w/ the plugin code).
Would we ever wire up CCR without this enabled? I'm wondering if we need to even bother with making this a setting. We can wire up a subclass of TranslogDeletionPolicy
inside of ReplicationEngine
, meaning it would be default behavior if CCR is installed.
Would we ever wire up CCR without this enabled?
That's a great question. I'm not all that familiar with the cross-cluster-replication plugin mechanics. I think you spin up a follower cluster as opposed to a follower index? If that's the case then logically it seems all indexes in the leader cluster are replicated to the follower cluster? And if so what's the logic in adding index level fidelity to change the deletion policy? It seems a reasonable assertion that retaining uncompressed tlog ops based on retention lease to avoid decompressing lucene stored fields significantly increases translog overhead? So maybe it makes sense to retain this index setting on high churn indexes?
@nknize Are we looking at extracting changes from https://github.com/opensearch-project/OpenSearch/pull/1227 as well?
:white_check_mark: Gradle Wrapper Validation success 22ad35f997f955c09b5fabe9bb0ac776b7113ac7
:x: DCO Check Failed 22ad35f997f955c09b5fabe9bb0ac776b7113ac7
Run ./dev-tools/signoff-check.sh remotes/origin/1.x 22ad35f997f955c09b5fabe9bb0ac776b7113ac7
to check locally
Use git commit with -s
to add 'Signed-of-by: {EMAIL}' on impacted commits
:white_check_mark: Gradle Wrapper Validation success 3cb3577b2e552e812333c79ec5f66d2f0f0c407b
:white_check_mark: DCO Check Passed 3cb3577b2e552e812333c79ec5f66d2f0f0c407b
:white_check_mark: Gradle Precommit success 22ad35f997f955c09b5fabe9bb0ac776b7113ac7
:white_check_mark: Gradle Precommit success 3cb3577b2e552e812333c79ec5f66d2f0f0c407b
:white_check_mark: Gradle Wrapper Validation success 73235f85bca1ac9664eea5aeed142a1437a0bdd1
:white_check_mark: DCO Check Passed 73235f85bca1ac9664eea5aeed142a1437a0bdd1
:white_check_mark: Gradle Precommit success 73235f85bca1ac9664eea5aeed142a1437a0bdd1
:white_check_mark: Gradle Wrapper Validation success 18138a88e0eb26d4beaea38c32c5244ef7051916
:white_check_mark: DCO Check Passed 18138a88e0eb26d4beaea38c32c5244ef7051916
:white_check_mark: Gradle Precommit success 18138a88e0eb26d4beaea38c32c5244ef7051916
:white_check_mark: DCO Check Passed ed48381bfae62e0104617fcf0e0f79f1d5ef1c73
:white_check_mark: Gradle Wrapper Validation success ed48381bfae62e0104617fcf0e0f79f1d5ef1c73
:white_check_mark: Gradle Precommit success ed48381bfae62e0104617fcf0e0f79f1d5ef1c73
:white_check_mark: Gradle Wrapper Validation success 91be63ce4b67a1a3a2de9183d46f93273aea4803
:x: DCO Check Failed 91be63ce4b67a1a3a2de9183d46f93273aea4803
Run ./dev-tools/signoff-check.sh remotes/origin/1.x 91be63ce4b67a1a3a2de9183d46f93273aea4803
to check locally
Use git commit with -s
to add 'Signed-of-by: {EMAIL}' on impacted commits
:white_check_mark: DCO Check Passed 9d4b719a90091134b199b01246e53412d2c4b4fc
:white_check_mark: Gradle Wrapper Validation success 9d4b719a90091134b199b01246e53412d2c4b4fc
:white_check_mark: Gradle Precommit success 91be63ce4b67a1a3a2de9183d46f93273aea4803
Description
Backport of https://github.com/opensearch-project/OpenSearch/pull/1038
Issues Resolved
N/A
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.