openshift / cluster-etcd-operator

Operator to manage the lifecycle of the etcd members of an OpenShift cluster
Apache License 2.0

ETCD-681: Add etcd-backup-server container within separate daemonset #1354

Open Elbehery opened 1 month ago

Elbehery commented 1 month ago

resolves https://issues.redhat.com/browse/ETCD-681

openshift-ci-robot commented 1 month ago

@Elbehery: This pull request references ETCD-681 which is a valid jira issue.

In response to [this](https://github.com/openshift/cluster-etcd-operator/pull/1354):

> resolves https://issues.redhat.com/browse/ETCD-681

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-etcd-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

Elbehery commented 1 month ago

/hold

still WIP

openshift-ci[bot] commented 1 month ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Elbehery

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/openshift/cluster-etcd-operator/blob/master/OWNERS)~~ [Elbehery]

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

Elbehery commented 1 month ago

/label tide/merge-method-squash

Elbehery commented 1 month ago

/retest-required

Elbehery commented 1 month ago

/retest-required

Elbehery commented 1 month ago

Tested this PR atop a 4.18.0-0.ci-2024-10-11-065556 OCP cluster.

CR used:

apiVersion: config.openshift.io/v1alpha1
kind: Backup
metadata:
  name: default
spec:
  etcd:
    schedule: "* * * * *"
    timeZone: "UTC"
    retentionPolicy:
      retentionType: RetentionNumber
      retentionNumber:
        maxNumberOfBackups: 3

Backups are being taken on each master node:

backup-server-daemon-set-nzx9p

melbeher@melbeher-mac Downloads % oc rsh -n openshift-etcd  pod/backup-server-daemon-set-nzx9p
Defaulted container "etcd-backup-server" out of: etcd-backup-server, init-env (init)
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:36 2024-10-11_173600
sh-5.1# 
sh-5.1# 
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:36 2024-10-11_173600
sh-5.1# 
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:36 2024-10-11_173600
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:36 2024-10-11_173600
drwxr-xr-x. 2 root root 96 Oct 11 17:37 2024-10-11_173700
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:37 2024-10-11_173700
drwxr-xr-x. 2 root root 96 Oct 11 17:38 2024-10-11_173800
drwxr-xr-x. 2 root root 96 Oct 11 17:39 2024-10-11_173900
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:40 2024-10-11_174000
drwxr-xr-x. 2 root root 96 Oct 11 17:41 2024-10-11_174100
drwxr-xr-x. 2 root root 96 Oct 11 17:42 2024-10-11_174200
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:45 2024-10-11_174500
drwxr-xr-x. 2 root root 96 Oct 11 17:46 2024-10-11_174600
drwxr-xr-x. 2 root root 96 Oct 11 17:47 2024-10-11_174700

pod/backup-server-daemon-set-f65np

melbeher@melbeher-mac Downloads % oc rsh -n openshift-etcd  pod/backup-server-daemon-set-f65np
Defaulted container "etcd-backup-server" out of: etcd-backup-server, init-env (init)
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:41 2024-10-11_174100
drwxr-xr-x. 2 root root 96 Oct 11 17:42 2024-10-11_174200
drwxr-xr-x. 2 root root 96 Oct 11 17:43 2024-10-11_174300
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:45 2024-10-11_174500
drwxr-xr-x. 2 root root 96 Oct 11 17:46 2024-10-11_174600
drwxr-xr-x. 2 root root 96 Oct 11 17:47 2024-10-11_174700

pod/backup-server-daemon-set-cbmvm

melbeher@melbeher-mac Downloads % oc rsh -n openshift-etcd  pod/backup-server-daemon-set-cbmvm
Defaulted container "etcd-backup-server" out of: etcd-backup-server, init-env (init)
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:41 2024-10-11_174100
drwxr-xr-x. 2 root root 96 Oct 11 17:42 2024-10-11_174200
drwxr-xr-x. 2 root root 96 Oct 11 17:43 2024-10-11_174300
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:42 2024-10-11_174200
drwxr-xr-x. 2 root root 96 Oct 11 17:43 2024-10-11_174300
drwxr-xr-x. 2 root root 96 Oct 11 17:44 2024-10-11_174400
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 11 17:45 2024-10-11_174500
drwxr-xr-x. 2 root root 96 Oct 11 17:46 2024-10-11_174600
drwxr-xr-x. 2 root root 96 Oct 11 17:47 2024-10-11_174700
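
The listings above are consistent with `maxNumberOfBackups: 3`: each node keeps at most three timestamped directories, and the oldest one disappears as a new one is written. As a rough illustration only (this is not the operator's own implementation), the same retention rule can be sketched in shell. The function name `prune_backups` is hypothetical, and the sketch assumes directory names in the `YYYY-MM-DD_HHMMSS` form shown above, which sort chronologically when sorted as strings.

```shell
# Hypothetical helper, not part of the operator:
# keep only the newest $2 backup directories under $1.
prune_backups() {
  dir=$1
  max=$2
  total=$(find "$dir" -mindepth 1 -maxdepth 1 -type d | wc -l)
  excess=$((total - max))
  # Nothing to do while the directory count is within the limit.
  [ "$excess" -gt 0 ] || return 0
  # Timestamped names sort oldest first, so the first $excess entries are the ones to drop.
  find "$dir" -mindepth 1 -maxdepth 1 -type d | sort | head -n "$excess" |
    while IFS= read -r path; do
      rm -rf -- "$path"
    done
}

# Usage (assumed path, taken from the listings above):
#   prune_backups /var/lib/etcd-auto-backup 3
```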
Elbehery commented 1 month ago

During testing, I had issues creating the correct ETCDCTL_KEY and ETCDCTL_CERT names.

Therefore, I chose to use an init container that builds the correct names as environment variables and exposes them to the etcd-backup-server container.

The manifest used for testing:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: etcd-auto-backup
  name: backup-server-daemon-set
  namespace: openshift-etcd
spec:
  selector:
    matchLabels:
      app: etcd-auto-backup
  template:
    metadata:
      labels:
        app: etcd-auto-backup
    spec:
      initContainers:
      - name: init-env
        image: stakater/base-alpine
        command:
        - /bin/bash
        - -c
        - |
          #!/bin/bash
          ETCDCTL_KEY="/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-peer-NODE_NAME.key"
          ETCDCTL_CERT="/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-peer-NODE_NAME.crt"
          currentNodeName=$NODE_NAME
          subStringToReplace="NODE_NAME"
          new_ETCDCTL_KEY=${ETCDCTL_KEY/$subStringToReplace/$currentNodeName}
          new_ETCDCTL_CERT=${ETCDCTL_CERT/$subStringToReplace/$currentNodeName}
          echo "ETCDCTL_KEY=$new_ETCDCTL_KEY" >> /shared/env-vars.sh
          echo "ETCDCTL_CERT=$new_ETCDCTL_CERT" >> /shared/env-vars.sh
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        volumeMounts:
        - name: shared-data
          mountPath: /shared
      containers:
      - command:
        - /bin/bash
        - -c
        - |
          #!/bin/bash
          set -o allexport
          if [[ -f /shared/env-vars.sh ]]; then
            source /shared/env-vars.sh
          fi
          exec cluster-etcd-operator backup-server \
          --enabled=true \
          --timezone=UTC \
          --schedule="* * * * *" \
          --type=RetentionNumber \
          --maxNumberOfBackups=3 \
          --endpoints=10.0.125.23:2379,10.0.76.40:2379,10.0.48.236:2379 \
          --backupPath=/var/lib/etcd-auto-backup
        env:
        - name: NODE_ip_10_0_76_40_us_west_1_compute_internal_ETCD_NAME
          value: ip-10-0-76-40.us-west-1.compute.internal
        - name: NODE_ip_10_0_48_236_us_west_1_compute_internal_IP
          value: 10.0.48.236
        - name: ETCD_CIPHER_SUITES
          value: TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256
        - name: ETCD_DATA_DIR
          value: /var/lib/etcd
        - name: ETCD_SOCKET_REUSE_ADDRESS
          value: "true"
        - name: ETCD_IMAGE
          value: registry.build09.ci.openshift.org/ci-ln-7m2pl6t/stable@sha256:a5ffe3489a5c049cb2bae31ba55fa7e3a7654d93d833a78f6c0506d2d7c1b272
        - name: ETCDCTL_API
          value: "3"
        - name: NODE_ip_10_0_76_40_us_west_1_compute_internal_IP
          value: 10.0.76.40
        - name: ALL_ETCD_ENDPOINTS
          value: https://10.0.125.23:2379,https://10.0.48.236:2379,https://10.0.76.40:2379
        - name: ETCD_INITIAL_CLUSTER_STATE
          value: existing
        - name: ETCD_QUOTA_BACKEND_BYTES
          value: "8589934592"
        - name: NODE_ip_10_0_125_23_us_west_1_compute_internal_ETCD_URL_HOST
          value: 10.0.125.23
        - name: NODE_ip_10_0_76_40_us_west_1_compute_internal_ETCD_URL_HOST
          value: 10.0.76.40
        - name: ETCD_ENABLE_PPROF
          value: "true"
        - name: ETCD_EXPERIMENTAL_MAX_LEARNERS
          value: "3"
        - name: ETCD_EXPERIMENTAL_WATCH_PROGRESS_NOTIFY_INTERVAL
          value: 5s
        - name: ETCDCTL_CACERT
          value: /etc/kubernetes/static-pod-certs/configmaps/etcd-all-bundles/server-ca-bundle.crt
        - name: ETCD_EXPERIMENTAL_WARNING_APPLY_DURATION
          value: 200ms
        - name: NODE_ip_10_0_48_236_us_west_1_compute_internal_ETCD_URL_HOST
          value: 10.0.48.236
        - name: ETCD_HEARTBEAT_INTERVAL
          value: "100"
        - name: ETCDCTL_ENDPOINTS
          value: https://10.0.125.23:2379,https://10.0.48.236:2379,https://10.0.76.40:2379
        - name: ETCD_ELECTION_TIMEOUT
          value: "1000"
        - name: NODE_ip_10_0_48_236_us_west_1_compute_internal_ETCD_NAME
          value: ip-10-0-48-236.us-west-1.compute.internal
        - name: NODE_ip_10_0_125_23_us_west_1_compute_internal_IP
          value: 10.0.125.23
        - name: NODE_ip_10_0_125_23_us_west_1_compute_internal_ETCD_NAME
          value: ip-10-0-125-23.us-west-1.compute.internal
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        image: registry.build09.ci.openshift.org/ci-ln-7m2pl6t/stable@sha256:07725b1d583f4bd4afcbe13121b57c857db961e377cad4f5345b864f7ba4f08e
        imagePullPolicy: IfNotPresent
        name: etcd-backup-server
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: FallbackToLogsOnError
        volumeMounts:
        - mountPath: /var/lib/etcd
          name: data-dir
        - mountPath: /etc/kubernetes
          name: config-dir
        - mountPath: /var/lib/etcd-auto-backup
          name: etcd-auto-backup-dir
        - mountPath: /etc/kubernetes/static-pod-certs
          name: cert-dir
        - name: shared-data
          mountPath: /shared
      dnsPolicy: ClusterFirst
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      volumes:
      - hostPath:
          path: /var/lib/etcd
          type: ""
        name: data-dir
      - hostPath:
          path: /etc/kubernetes
          type: ""
        name: config-dir
      - hostPath:
          path: /var/lib/etcd-auto-backup
          type: ""
        name: etcd-auto-backup-dir
      - hostPath:
          path: /etc/kubernetes/static-pod-resources/etcd-certs
          type: ""
        name: cert-dir
      - name: shared-data
        emptyDir: {}
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

Also, none of these env vars are needed; the init container alone is enough.
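
The path substitution done by the init container in the manifest above can also be written as direct interpolation of the node name. The following is a minimal sketch under that assumption, not what the PR ships; `write_etcdctl_env` is a hypothetical name. For a given node name it produces the same two paths as the `${VAR/pattern/replacement}` form, but it truncates the output file instead of appending, so a rerun of the init container against the same `emptyDir` does not leave duplicate lines.

```shell
# Hypothetical helper: write the ETCDCTL_KEY / ETCDCTL_CERT lines for node $1 to file $2.
write_etcdctl_env() {
  node=$1
  out=$2
  certs=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs
  # '>' truncates the file, so repeated runs always leave exactly two lines.
  {
    printf 'ETCDCTL_KEY=%s\n' "$certs/etcd-peer-$node.key"
    printf 'ETCDCTL_CERT=%s\n' "$certs/etcd-peer-$node.crt"
  } > "$out"
}

# Usage inside the init container (NODE_NAME comes from the downward API, as in the manifest):
#   write_etcdctl_env "$NODE_NAME" /shared/env-vars.sh
```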

Elbehery commented 1 month ago

/test unit

Elbehery commented 1 month ago

Result of the final test

CR Used

apiVersion: config.openshift.io/v1alpha1
kind: Backup
metadata:
  name: default
spec:
  etcd:
    schedule: "* * * * *"
    timeZone: "UTC"
    retentionPolicy:
      retentionType: RetentionNumber
      retentionNumber:
        maxNumberOfBackups: 3

Result from openshift-etcd namespace

oc get all -n openshift-etcd                                                              
Warning: apps.openshift.io/v1 DeploymentConfig is deprecated in v4.14+, unavailable in v4.10000+
NAME                                                               READY   STATUS      RESTARTS   AGE
pod/backup-server-daemon-set-555xg                                 1/1     Running     1          45m
pod/backup-server-daemon-set-85ptk                                 1/1     Running     1          45m
pod/backup-server-daemon-set-mcbqc                                 1/1     Running     1          45m

NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/etcd   ClusterIP   172.30.154.25   <none>        2379/TCP,9979/TCP   86m

NAME                                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                     AGE
daemonset.apps/backup-server-daemon-set   3         3         3       3            3           node-role.kubernetes.io/master=   45m

Comment


oc describe pod/backup-server-daemon-set-555xg -n openshift-etcd

Events:
  Type     Reason           Age                 From               Message
  ----     ------           ----                ----               -------
  Normal   Scheduled        48m                 default-scheduler  Successfully assigned openshift-etcd/backup-server-daemon-set-555xg to ip-10-0-103-219.us-west-2.compute.internal
  Normal   AddedInterface   48m                 multus             Add eth0 [10.128.0.89/23] from ovn-kubernetes
  Normal   Pulled           48m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          48m                 kubelet            Created container init-env
  Normal   Started          48m                 kubelet            Started container init-env
  Normal   Pulled           48m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          48m                 kubelet            Created container etcd-backup-server
  Normal   Started          48m                 kubelet            Started container etcd-backup-server
  Warning  NodeNotReady     32m                 node-controller    Node is not ready
  Warning  FailedMount      30m (x6 over 31m)   kubelet            MountVolume.SetUp failed for volume "kube-api-access-ctrmh" : [object "openshift-etcd"/"kube-root-ca.crt" not registered, object "openshift-etcd"/"openshift-service-ca.crt" not registered]
  Warning  NetworkNotReady  30m (x10 over 31m)  kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?
  Normal   AddedInterface   30m                 multus             Add eth0 [10.128.0.89/23] from ovn-kubernetes
  Normal   Pulled           30m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          30m                 kubelet            Created container init-env
  Normal   Started          30m                 kubelet            Started container init-env
  Normal   Pulled           30m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          30m                 kubelet            Created container etcd-backup-server
  Normal   Started          30m                 kubelet            Started container etcd-backup-server

oc describe pod/backup-server-daemon-set-85ptk -n openshift-etcd

Events:
  Type     Reason           Age                 From               Message
  ----     ------           ----                ----               -------
  Normal   Scheduled        50m                 default-scheduler  Successfully assigned openshift-etcd/backup-server-daemon-set-85ptk to ip-10-0-121-90.us-west-2.compute.internal
  Normal   AddedInterface   50m                 multus             Add eth0 [10.129.0.65/23] from ovn-kubernetes
  Normal   Pulled           50m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          50m                 kubelet            Created container init-env
  Normal   Started          50m                 kubelet            Started container init-env
  Normal   Pulled           50m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          50m                 kubelet            Created container etcd-backup-server
  Normal   Started          50m                 kubelet            Started container etcd-backup-server
  Warning  NodeNotReady     41m                 node-controller    Node is not ready
  Warning  FailedMount      38m (x6 over 39m)   kubelet            MountVolume.SetUp failed for volume "kube-api-access-dttcb" : [object "openshift-etcd"/"kube-root-ca.crt" not registered, object "openshift-etcd"/"openshift-service-ca.crt" not registered]
  Warning  NetworkNotReady  38m (x10 over 39m)  kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?
  Normal   AddedInterface   38m                 multus             Add eth0 [10.129.0.65/23] from ovn-kubernetes
  Normal   Pulled           38m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          38m                 kubelet            Created container init-env
  Normal   Started          38m                 kubelet            Started container init-env
  Normal   Pulled           38m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          38m                 kubelet            Created container etcd-backup-server
  Normal   Started          38m                 kubelet            Started container etcd-backup-server

oc describe pod/backup-server-daemon-set-mcbqc -n openshift-etcd

Events:
  Type     Reason           Age                 From               Message
  ----     ------           ----                ----               -------
  Normal   Scheduled        52m                 default-scheduler  Successfully assigned openshift-etcd/backup-server-daemon-set-mcbqc to ip-10-0-20-183.us-west-2.compute.internal
  Normal   AddedInterface   52m                 multus             Add eth0 [10.130.0.77/23] from ovn-kubernetes
  Normal   Pulled           52m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          52m                 kubelet            Created container init-env
  Normal   Started          52m                 kubelet            Started container init-env
  Normal   Pulled           52m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          52m                 kubelet            Created container etcd-backup-server
  Normal   Started          52m                 kubelet            Started container etcd-backup-server
  Warning  NodeNotReady     47m                 node-controller    Node is not ready
  Warning  FailedMount      45m (x6 over 45m)   kubelet            MountVolume.SetUp failed for volume "kube-api-access-pbcx4" : [object "openshift-etcd"/"kube-root-ca.crt" not registered, object "openshift-etcd"/"openshift-service-ca.crt" not registered]
  Warning  NetworkNotReady  45m (x10 over 45m)  kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: no CNI configuration file in /etc/kubernetes/cni/net.d/. Has your network provider started?
  Normal   AddedInterface   45m                 multus             Add eth0 [10.130.0.77/23] from ovn-kubernetes
  Normal   Pulled           45m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          45m                 kubelet            Created container init-env
  Normal   Started          45m                 kubelet            Started container init-env
  Normal   Pulled           45m                 kubelet            Container image "registry.build09.ci.openshift.org/ci-ln-8x2i3jk/stable@sha256:38755ec1b503a120c70a314e947a7080dd936830ba4d5ae972005fea03e3858e" already present on machine
  Normal   Created          45m                 kubelet            Created container etcd-backup-server
  Normal   Started          45m                 kubelet            Started container etcd-backup-server
Elbehery commented 1 month ago

oc rsh -n openshift-etcd pod/backup-server-daemon-set-555xg

Defaulted container "etcd-backup-server" out of: etcd-backup-server, init-env (init)
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 12 18:40 2024-10-12_184000
drwxr-xr-x. 2 root root 96 Oct 12 18:41 2024-10-12_184100
drwxr-xr-x. 2 root root 96 Oct 12 18:42 2024-10-12_184200
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 12 18:56 2024-10-12_185600
drwxr-xr-x. 2 root root 96 Oct 12 18:57 2024-10-12_185700
drwxr-xr-x. 2 root root 96 Oct 12 18:58 2024-10-12_185800

oc rsh -n openshift-etcd pod/backup-server-daemon-set-85ptk

Defaulted container "etcd-backup-server" out of: etcd-backup-server, init-env (init)
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 12 18:41 2024-10-12_184100
drwxr-xr-x. 2 root root 96 Oct 12 18:42 2024-10-12_184200
drwxr-xr-x. 2 root root 96 Oct 12 18:43 2024-10-12_184300
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 12 18:57 2024-10-12_185700
drwxr-xr-x. 2 root root 96 Oct 12 18:58 2024-10-12_185800
drwxr-xr-x. 2 root root 96 Oct 12 18:59 2024-10-12_185900

oc rsh -n openshift-etcd pod/backup-server-daemon-set-mcbqc

Defaulted container "etcd-backup-server" out of: etcd-backup-server, init-env (init)
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 12 18:59 2024-10-12_185900
drwxr-xr-x. 2 root root 96 Oct 12 19:00 2024-10-12_190000
drwxr-xr-x. 2 root root 96 Oct 12 19:01 2024-10-12_190100
sh-5.1# 
sh-5.1# ls -l /var/lib/etcd-auto-backup/
total 0
drwxr-xr-x. 2 root root 96 Oct 12 19:00 2024-10-12_190000
drwxr-xr-x. 2 root root 96 Oct 12 19:01 2024-10-12_190100
drwxr-xr-x. 2 root root 96 Oct 12 19:02 2024-10-12_190200
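
The per-pod checks above were done by hand with `oc rsh`. A small loop can collect the same listing from every backup pod in one go. This is only a sketch: the function name `list_backups_on_all_pods` is hypothetical, and the label `app=etcd-auto-backup` and container name `etcd-backup-server` are taken from the test manifest earlier in this thread, so they are assumptions about what the operator-managed DaemonSet uses.

```shell
# Hypothetical helper: print the backup directories present on every backup-server pod.
list_backups_on_all_pods() {
  ns=openshift-etcd
  # '-o name' yields entries such as pod/backup-server-daemon-set-555xg.
  for pod in $(oc get pods -n "$ns" -l app=etcd-auto-backup -o name); do
    echo "== $pod"
    oc exec -n "$ns" "$pod" -c etcd-backup-server -- ls -1 /var/lib/etcd-auto-backup/
  done
}

# Usage:
#   list_backups_on_all_pods
```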
Elbehery commented 1 month ago

/hold cancel

Elbehery commented 1 month ago

/assign @JoelSpeed
/assign @vrutkovs
/assign @p0lyn0mial

Elbehery commented 1 month ago

/retest-required

Elbehery commented 1 month ago

/retest-required

openshift-ci-robot commented 1 month ago

@Elbehery: This pull request references ETCD-681 which is a valid jira issue.

In response to [this](https://github.com/openshift/cluster-etcd-operator/pull/1354):

> resolves https://issues.redhat.com/browse/ETCD-681
>
> - [ ] Reacts to Deltas in CR
> - [ ] add the error handling back and a test for it

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-etcd-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

openshift-ci-robot commented 1 month ago

@Elbehery: This pull request references ETCD-681 which is a valid jira issue.

In response to [this](https://github.com/openshift/cluster-etcd-operator/pull/1354):

> resolves https://issues.redhat.com/browse/ETCD-681
>
> - [ ] Reacts to Deltas in CR
> - [x] add the error handling back and a test for it

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-etcd-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

openshift-ci-robot commented 1 month ago

@Elbehery: This pull request references ETCD-681 which is a valid jira issue.

In response to [this](https://github.com/openshift/cluster-etcd-operator/pull/1354):

> resolves https://issues.redhat.com/browse/ETCD-681
>
> - [x] Reacts to Deltas in CR
> - [x] add the error handling back and a test for it

Instructions for interacting with me using PR comments are available [here](https://prow.ci.openshift.org/command-help?repo=openshift%2Fcluster-etcd-operator). If you have questions or suggestions related to my behavior, please file an issue against the [openshift-eng/jira-lifecycle-plugin](https://github.com/openshift-eng/jira-lifecycle-plugin/issues/new) repository.

Elbehery commented 1 month ago

/retest-required

Elbehery commented 1 month ago

/retest-required

Elbehery commented 4 weeks ago

/retest-required

Elbehery commented 4 weeks ago

/retest

Elbehery commented 4 weeks ago

/test ci/prow/e2e-operator-fips

openshift-ci[bot] commented 4 weeks ago

@Elbehery: The specified target(s) for /test were not found. The following commands are available to trigger required jobs:

The following commands are available to trigger optional jobs:

Use /test all to run the following jobs that were automatically triggered:

In response to [this](https://github.com/openshift/cluster-etcd-operator/pull/1354#issuecomment-2416938493):

> /test ci/prow/e2e-operator-fips

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

Elbehery commented 4 weeks ago

/test e2e-operator

Elbehery commented 3 weeks ago

/retest-required

Elbehery commented 3 weeks ago

/test e2e-aws-ovn-single-node

Elbehery commented 3 weeks ago

/retest-required

Elbehery commented 3 weeks ago

/retest-required

openshift-ci[bot] commented 3 weeks ago

@Elbehery: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-etcd-recovery | 564eb9ac3667ae9801c9276b3ac251f1d243be2b | link | false | `/test e2e-aws-etcd-recovery` |
| ci/prow/e2e-metal-ovn-sno-cert-rotation-shutdown | 564eb9ac3667ae9801c9276b3ac251f1d243be2b | link | false | `/test e2e-metal-ovn-sno-cert-rotation-shutdown` |
| ci/prow/e2e-metal-ovn-ha-cert-rotation-shutdown | 564eb9ac3667ae9801c9276b3ac251f1d243be2b | link | false | `/test e2e-metal-ovn-ha-cert-rotation-shutdown` |
| ci/prow/e2e-aws-ovn-etcd-scaling | 564eb9ac3667ae9801c9276b3ac251f1d243be2b | link | true | `/test e2e-aws-ovn-etcd-scaling` |
| ci/prow/e2e-aws-ovn-single-node | 564eb9ac3667ae9801c9276b3ac251f1d243be2b | link | true | `/test e2e-aws-ovn-single-node` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository. I understand the commands that are listed [here](https://go.k8s.io/bot-commands).