CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0

The pgBackRest pod is not scheduled on another node? #2972

Open khalifaould opened 2 years ago

khalifaould commented 2 years ago

If the node on which the pgBackRest pod is running is lost, the pod is not rescheduled on one of the other available nodes, and it is no longer possible to make a backup or restore. Is this normal? Is it possible to have several replicas of this pod, as is the case for the database pods?

jkatz commented 2 years ago

There is not enough information here to begin troubleshooting. Information that is helpful for this can be found here: https://github.com/CrunchyData/postgres-operator/issues/new?template=bug_report.md

khalifaould commented 2 years ago

As part of our resilience tests, we have a cluster with 3 master nodes on which we launched a Crunchy Data cluster with 3 replicas. The list of pods: (screenshot of the pod list)

We simulated the loss of the node on which the pgBackRest pod (dai-prod-db-repo-host-0) was running.

We observed that this pod is not rescheduled on one of the other available nodes:

(screenshot of the pod list after the node loss)

In this case, can we no longer make a backup or restore?

Is it possible to have several replicas of this pod, as is the case for the database pods?

khalifaould commented 2 years ago
Cluster Config 

```yaml
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: db-cluster
  namespace: namespace
spec:
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.4-1
  postgresVersion: 13
  instances:
    - name: instance1
      replicas: 3
      dataVolumeClaimSpec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: 3Gi
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - topologyKey: kubernetes.io/hostname
            labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/cluster: db-cluster
                postgres-operator.crunchydata.com/instance-set: instance1
  users:
    - name: user
      databases:
        - db_name
      options: "SUPERUSER"
  databaseInitSQL:
    key: create-schema.sql
    name: create-schema
  backups:
    pgbackrest:
      global:
        repo1-retention-full: "5"
      manual:
        repoName: repo1
        options:
          - --type=full
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.35-0
      repos:
        - name: repo1
          schedules:
            full: "0 0 * * *"
          volume:      
            volumeClaimSpec:
              accessModes:
                - "ReadWriteOnce"
              resources:
                requests:
                  storage: 3Gi
```

andrewlecuyer commented 2 years ago

@khalifaould does the repo Pod that cannot be rescheduled get stuck in a "Pending" status? If so, can you provide the output of describing the unschedulable repo Pod via `kubectl describe`?

khalifaould commented 2 years ago

The pod is not rescheduled on one of the other nodes, and that's the problem! It looks like there is an affinity with the node it was first launched on.

andrewlecuyer commented 2 years ago

Even when Kubernetes is unable to schedule a Pod, it is still possible to describe/view/etc. the Pod (which should be in a "Pending" status). Therefore, please provide the output of describing the repo Pod, e.g. `kubectl describe pod dai-prod-db-repo-host-0`.

Additionally, please provide the details requested in the bug report template so that we can properly troubleshoot the issue for your specific deployment.

My thinking here is that the volume being utilized by the repo is in a specific zone where it is only available to the node that is being terminated (i.e., the other two nodes are in other zones and are therefore unable to mount that same PVC).
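One way to check that (a sketch; the PVC name assumes PGO's default `<clusterName>-<repoName>` naming and the namespace used in this thread) is to look at the node affinity on the PersistentVolume backing the repo PVC:

```shell
# Find the PV bound to the repo PVC, then inspect its node affinity.
# PVC name (dai-prod-db-repo1) and namespace are assumptions based on this thread.
PV=$(kubectl get pvc dai-prod-db-repo1 -n namespace -o jsonpath='{.spec.volumeName}')
kubectl get pv "$PV" -o jsonpath='{.spec.nodeAffinity}'
```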

khalifaould commented 2 years ago

- Platform: Rancher K3S
- Platform Version: v1.19.7+k3s1
- PGO Image Tag: ubi8-5.0.3-0
- Postgres Version: 13
- Storage: Default (local-path)

The dai-prod-db-repo-host-0 pod is not in a Pending status but Terminating. Output of `kubectl describe pod dai-prod-db-repo-host-0`:

```
Name:                      dai-prod-db-repo-host-0
Namespace:                 namespace
Priority:                  0
Node:                      ip-172-16-60-231.eu-west-1.compute.internal/172.16.60.231
Start Time:                Thu, 27 Jan 2022 16:32:26 +0000
Labels:                    controller-revision-hash=dai-prod-db-repo-host-768c6b8559
                           postgres-operator.crunchydata.com/cluster=dai-prod-db
                           postgres-operator.crunchydata.com/data=pgbackrest
                           postgres-operator.crunchydata.com/pgbackrest=
                           postgres-operator.crunchydata.com/pgbackrest-dedicated=
                           statefulset.kubernetes.io/pod-name=dai-prod-db-repo-host-0
Annotations:
Status:                    Terminating (lasts 141m)
Termination Grace Period:  30s
IP:                        10.42.1.247
IPs:
  IP:           10.42.1.247
Controlled By:  StatefulSet/dai-prod-db-repo-host
Init Containers:
  nss-wrapper-init:
    Container ID:  docker://f633347a36fa16b4380ab037e8f15178df998c2d573f50b6412bdd6e8b5d8c2a
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.35-0
    Image ID:      docker-pullable://registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest@sha256:e70691aa7c4913e2c353bfe7e3779e1774cc6320be10fb1d6f32de8a10510669
    Port:
    Host Port:
    Command:
      bash
      -c
      NSS_WRAPPER_SUBDIR=postgres CRUNCHY_NSS_USERNAME=postgres CRUNCHY_NSS_USER_DESC="postgres" /opt/crunchy/bin/nss_wrapper.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 27 Jan 2022 16:32:27 +0000
      Finished:     Thu, 27 Jan 2022 16:32:27 +0000
    Ready:          True
    Restart Count:  0
    Environment:
    Mounts:
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dswpn (ro)
Containers:
  pgbackrest:
    Container ID:  docker://fbbaeb11373840b2a0f93d02749d37ef0bae1db5253ece4f8794215d65536588
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.35-0
    Image ID:      docker-pullable://registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest@sha256:e70691aa7c4913e2c353bfe7e3779e1774cc6320be10fb1d6f32de8a10510669
    Port:
    Host Port:
    Command:
      /usr/sbin/sshd
      -D
      -e
    State:          Running
      Started:      Thu, 27 Jan 2022 16:32:28 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       tcp-socket :2022 delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:
      LD_PRELOAD:          /usr/lib64/libnss_wrapper.so
      NSS_WRAPPER_PASSWD:  /tmp/nss_wrapper/postgres/passwd
      NSS_WRAPPER_GROUP:   /tmp/nss_wrapper/postgres/group
    Mounts:
      /etc/pgbackrest/conf.d from pgbackrest-config (rw)
      /etc/ssh from ssh (ro)
      /pgbackrest/repo1 from repo1 (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dswpn (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   True
  PodScheduled      True
Volumes:
  ssh:
    Type:                Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:       dai-prod-db-ssh-config
    ConfigMapOptional:
    SecretName:          dai-prod-db-ssh
    SecretOptionalName:
  repo1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  dai-prod-db-repo1
    ReadOnly:   false
  pgbackrest-config:
    Type:                Projected (a volume that contains injected data from multiple sources)
    ConfigMapName:       dai-prod-db-pgbackrest-config
    ConfigMapOptional:
  tmp:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  16Mi
  default-token-dswpn:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-dswpn
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
```
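For reference, a sketch (not an official recommendation): a Pod stuck in Terminating on a lost node can usually be removed by hand so that the StatefulSet controller recreates it, although with node-local storage the replacement may still not start.

```shell
# Force-remove the stuck Pod object so the StatefulSet creates a replacement.
# With node-local storage (e.g. the local-path provisioner) the replacement can
# still stay Pending, since the repo PVC is only mountable on the lost node.
kubectl delete pod dai-prod-db-repo-host-0 -n namespace --grace-period=0 --force
```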

jansmets commented 5 months ago

My issue might be related. We had a (planned) power outage (executed ahead of schedule) which brought down all nodes. For some reason the repo-host-0 pod remained in a Terminating state once everything came back online.

At that point the WAL kept on filling up until the disk ran full...
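That is consistent with WAL archiving failing while the repo host Pod is unavailable: pgBackRest's archive-push cannot reach the repository, so WAL segments accumulate on the primary until the volume fills. A rough way to confirm this (a sketch; the label selector, container name, and namespace assume PGO v5 defaults and this thread's setup):

```shell
# Check archiver stats on the primary; a growing failed_count indicates WAL is
# piling up because archiving to the pgBackRest repo keeps failing.
PRIMARY=$(kubectl get pod -n namespace \
  -l postgres-operator.crunchydata.com/role=master -o name | head -n 1)
kubectl exec -n namespace "$PRIMARY" -c database -- \
  psql -c "SELECT failed_count, last_failed_wal, last_failed_time FROM pg_stat_archiver;"
```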