openshift / cluster-etcd-operator

Operator to manage the lifecycle of the etcd members of an OpenShift cluster
Apache License 2.0

ETCD-636: add automated backup sidecar #1287

Closed: Elbehery closed this pull request 2 months ago

Elbehery commented 2 months ago

This PR adds an etcd backup sidecar container to the etcd pod manifest.

The container copies the snapshot state from the etcd data dir into the backup dir whenever it changes.

fixes https://issues.redhat.com/browse/ETCD-636

cc @openshift/openshift-team-etcd
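
The copy-on-change behavior described above can be illustrated with a minimal sketch. This is not the code from this PR: the function names, the polling approach, and the `(mtime, size)` change check are all assumptions made for illustration, and Python is used only for brevity. The sketch copies a source file into a backup directory whenever its signature changes, writing to a temporary file first so that a reader of the backup dir does not see a partial copy.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def file_signature(path: Path):
    """Return (mtime_ns, size) for path, or None if the file does not exist."""
    try:
        st = path.stat()
    except FileNotFoundError:
        return None
    return (st.st_mtime_ns, st.st_size)


def copy_if_changed(src: Path, backup_dir: Path, last_sig):
    """Copy src into backup_dir if its signature differs from last_sig.

    Returns the signature to remember for the next call. Hypothetical helper,
    not part of cluster-etcd-operator.
    """
    sig = file_signature(src)
    if sig is None or sig == last_sig:
        return last_sig  # nothing to copy
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Copy to a temporary file in the same directory, then rename over the
    # final name so the backup file is replaced in one step.
    fd, tmp = tempfile.mkstemp(dir=str(backup_dir), prefix=".tmp-")
    os.close(fd)
    shutil.copy2(src, tmp)
    os.replace(tmp, backup_dir / src.name)
    return sig


def watch(src: Path, backup_dir: Path, interval_s: float = 5.0, max_iterations=None):
    """Poll src every interval_s seconds and back it up when it changes."""
    last_sig = None
    done = 0
    while max_iterations is None or done < max_iterations:
        last_sig = copy_if_changed(src, backup_dir, last_sig)
        time.sleep(interval_s)
        done += 1
```

A sidecar built this way would call `watch` with the data-dir file and the backup dir as arguments; both paths are deployment-specific and are not stated in this thread. Note that copying a live database file is not guaranteed to yield a consistent snapshot, which is why etcd provides `etcdctl snapshot save` for that purpose.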

openshift-ci-robot commented 2 months ago

@Elbehery: This pull request references ETCD-636 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

openshift-ci[bot] commented 2 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Elbehery

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/openshift/cluster-etcd-operator/blob/master/OWNERS)~~ [Elbehery]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.

Elbehery commented 2 months ago

/retest

Elbehery commented 2 months ago

/label tide/merge-method-squash

Elbehery commented 2 months ago

/test e2e-operator

openshift-ci[bot] commented 2 months ago

@Elbehery: The following commands are available to trigger required jobs:

The following commands are available to trigger optional jobs:

Use /test all to run the following jobs that were automatically triggered:

Elbehery commented 2 months ago

failures are due to authentication to the image registry

{  release "release-latest" failed: could not watch pod: the pod ci-op-0zcbppy9/release-latest failed after 1m13s (failed containers: release): ContainerFailed one or more containers exited

Container release exited with code 1, reason Error
---
29a0873cc59feb1f2f00fc818420330b4210e10021601e3f3c63ba87cc790e5e oc-mirror
info: Loading sha256:fcad3a8a4a17cbc13e8eaaa8503b2ec3c5d4c006ddc39cc798daaba62d8691d9 operator-lifecycle-manager
info: Loading sha256:2dd0da125e0e23b5530863690b8a629d524d1a393b63b6c3ef27cb46f7a0f961 openstack-cluster-api-controllers
info: Loading sha256:fd1e3fb553482d31c4831b845aaf2ef68ea4976cb595b40f14e3dc014e74b2d1 operator-marketplace
info: Loading sha256:88e955c3aaf50a4c75f61ed34542354389386463c61a1931dabac61bacb6d056 operator-framework-tools
info: Loading sha256:dd5d0691c5135ebc8173c5b3486c4fd7b187b65b0375e557bfa35115c7660e74 service-ca-operator
info: Loading sha256:85d166c140588335aa7712be55565931d2f4eccbb62e0031e43f8f4976e3633b tests
info: Loading sha256:5a3b73c0132212ad41a9e40d3c09df29e67f56b1757bdc018d1fe60df7493c1a vsphere-cluster-api-controllers
info: Loading sha256:0f036c80d66513d8125850a11f4d3117888af17b6f2037eed07148a83875a91f vsphere-problem-detector
info: Included 190 images from 72 input operators into the release
error: failed to push image registry.build03.ci.openshift.org/ci-op-0zcbppy9/release:latest: uploading the source layer sha256:7a4643f5f2a50088993f8d8f43a8f86bc0c497a96e1323a5a5eaf051bfa8dcc8 failed: Patch "https://registry.build03.ci.openshift.org/v2/ci-op-0zcbppy9/release/blobs/uploads/175ca850-94e9-4f12-96bd-2215443abd65?_state=OK6IMXDu0oEqW_1kApl-CScd7gplm-bE0vJwspCQk7t7Ik5hbWUiOiJjaS1vcC0wemNicHB5OS9yZWxlYXNlIiwiVVVJRCI6IjE3NWNhODUwLTk0ZTktNGYxMi05NmJkLTIyMTU0NDNhYmQ2NSIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyNC0wNy0wOVQxMjoyOTo1MC45ODg4MTkyODhaIn0%3D": http2: Transport: cannot retry err [stream error: stream ID 2349; REFUSED_STREAM; received from peer] after Request.Body was written; define Request.GetBody to avoid this error
{"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"sigs.k8s.io/prow/pkg/entrypoint/run.go:84","func":"sigs.k8s.io/prow/pkg/entrypoint.Options.internalRun","level":"error","msg":"Error executing test process","severity":"error","time":"2024-07-09T12:30:21Z"}
---}

Elbehery commented 2 months ago

/retest-required

Elbehery commented 2 months ago

/retest-required

Elbehery commented 2 months ago

/retest-required

Elbehery commented 2 months ago

/retest-required

Elbehery commented 2 months ago

/test e2e-operator

deepsm007 commented 2 months ago

/test e2e-operator

Elbehery commented 2 months ago

/retest-required

jupierce commented 2 months ago

/test all

openshift-ci[bot] commented 2 months ago

@Elbehery: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-gcp-qe-no-capabilities | 803e77a8c9c9f26bda362fc8948fccac37c8be08 | link | false | `/test e2e-gcp-qe-no-capabilities` |
| ci/prow/e2e-aws-ovn-single-node | b4f991927bcc652948db38f8640d19d47372307b | link | true | `/test e2e-aws-ovn-single-node` |
| ci/prow/e2e-aws-etcd-recovery | b4f991927bcc652948db38f8640d19d47372307b | link | false | `/test e2e-aws-etcd-recovery` |
| ci/prow/e2e-operator | b4f991927bcc652948db38f8640d19d47372307b | link | true | `/test e2e-operator` |
| ci/prow/e2e-aws-etcd-certrotation | b4f991927bcc652948db38f8640d19d47372307b | link | false | `/test e2e-aws-etcd-certrotation` |
| ci/prow/e2e-aws-ovn-etcd-scaling | b4f991927bcc652948db38f8640d19d47372307b | link | true | `/test e2e-aws-ovn-etcd-scaling` |
| ci/prow/e2e-metal-ovn-sno-cert-rotation-shutdown | b4f991927bcc652948db38f8640d19d47372307b | link | false | `/test e2e-metal-ovn-sno-cert-rotation-shutdown` |
| ci/prow/e2e-metal-ovn-ha-cert-rotation-shutdown | b4f991927bcc652948db38f8640d19d47372307b | link | false | `/test e2e-metal-ovn-ha-cert-rotation-shutdown` |
| ci/prow/e2e-operator-fips | b4f991927bcc652948db38f8640d19d47372307b | link | false | `/test e2e-operator-fips` |

Elbehery commented 2 months ago

/hold

Elbehery commented 2 months ago

closing this in favor of https://github.com/openshift/cluster-etcd-operator/pull/1301