CrunchyData / postgres-operator

Production PostgreSQL for Kubernetes, from high availability Postgres clusters to full-scale database-as-a-service.
https://access.crunchydata.com/documentation/postgres-operator/v5/
Apache License 2.0

Allow defining metadata on repoHost that does not propagate to cronjobs, etc #3556

Open Kajot-dev opened 1 year ago

Kajot-dev commented 1 year ago

Overview & why

I'm using Velero to back up the pgBackRest volume created by PGO. Velero does this when an annotation is set on the pod that mounts the volume. Currently, I can only set metadata for all pgBackRest pods at once: cronjobs, repoHosts, etc. Pods produced by the cronjobs do not have this volume, so Velero warns me about backup failures.

Feature

Ability to set metadata separately for each repoHost through the CRD.

Environment

tjmoore4 commented 1 year ago

Hello @Kajot-dev, could you please provide the exact warning you are seeing in the logs, a bit more detail as to what this implementation is solving beyond PGO's current feature set and a copy of your manifest? This will help us to better understand your use case and what feature(s) would be required for both your use case as well as similar implementations that may be helpful.

benjaminjb commented 1 year ago

Hello @Kajot-dev, we have an open issue about passing the labels/annotations from parent job/cronjob to the pods themselves: https://github.com/CrunchyData/postgres-operator/issues/3368

I'm not entirely clear if that would solve your use-case, so I'm curious about this.

Kajot-dev commented 1 year ago

Sorry, but the mentioned issue won't resolve my use case. What I want is an annotation that is present on the pgBackRest repoHost and the repoHost only (so it is explicitly NOT present on cronjobs or the pods produced by them). I need this for Velero backups, where in order to back up a PVC volume I set the following annotation on a pod (or Deployment, StatefulSet, etc.):

backup.velero.io/backup-volumes: repo1

Here repo1 is the name of the volume defined in the volumes field of the PodSpec. In PGO v5.3.0, I can only apply this annotation to both the pgBackRest repoHosts and the backup cronjobs at once (via PGO's CRD). Velero sees the annotation on the repoHosts and correctly backs up the volume mounted there, but it also sees the annotation on the backup cronjobs and the pods produced by them, so it tries to back up a volume that does not exist on those pods.

So, I would like to explicitly set an annotation on repoHost and not on anything else. Hope that clarifies the situation ;)

dsessler7 commented 1 year ago

Hello @Kajot-dev, could you send your PostgresCluster YAML?

Kajot-dev commented 1 year ago

Yes, of course (but I don't think it's going to solve the problem):

apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: harbor-postgresql
  namespace: harbor-prod
spec:
  port: 5432
  standby:
    enabled: false
  openshift: true
  backups:
    pgbackrest:
      metadata:
        annotations:
          backup.velero.io/backup-volumes: repo1
      global:
        repo1-retention-full: "31"
        repo1-retention-full-type: time
        repo1-retention-archive: "3"
        repo1-retention-archive-type: full
        repo1-retention-diff: "3"
      repos:
        - name: repo1
          schedules:
            full: "0 0 * * 0"
            differential: "0 0 * * 3"
            incremental: "0 0 * * 1-2,4-6"
          volume:
            volumeClaimSpec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 50Gi
              storageClassName: standard-working
      restore:
        enabled: false
        repoName: repo1

  service:
    type: ClusterIP
  users:
    - name: harbor
      databases:
      - registry
    - name: postgres
  monitoring:
    pgmonitor:
      exporter:
        resources:
          requests:
            cpu: '200m'
            memory: 200M
          limits:
            cpu: '300m'
            memory: 300M
  patroni:
    leaderLeaseDurationSeconds: 30
    port: 8008
    switchover:
      enabled: true
      type: Switchover
      targetInstance: harbor-postgresql-ssd-nbkh
    syncPeriodSeconds: 10
  instances:
    - dataVolumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
        storageClassName: gold
      resources:
        requests:
          cpu: '2'
          memory: 2Gi
        limits:
          cpu: '2500m'
          memory: 2500Mi
      name: gold
      replicas: 1
    - dataVolumeClaimSpec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 20Gi
        storageClassName: ssd
      resources:
        requests:
          cpu: '2'
          memory: 2Gi
        limits:
          cpu: '2500m'
          memory: 2500Mi
      name: ssd
      replicas: 1
  postgresVersion: 14

I understand that backups.pgbackrest.metadata.annotations is working as intended. This produces the following results (see the annotations):

Pod created by the full backup cronjob

kind: Pod
apiVersion: v1
metadata:
  generateName: harbor-postgresql-repo1-full-28006560-
  annotations:
    backup.velero.io/backup-volumes: repo1 # I DO NOT WANT IT HERE
...

Repohost definition (pod produced by StatefulSet)

kind: Pod
apiVersion: v1
metadata:
  generateName: harbor-postgresql-repo-host-
  annotations:
    backup.velero.io/backup-volumes: repo1 # I WANT THIS TO STAY HERE
...

So again, I would like to define an annotation that is present ONLY on the definition of the repoHost and NOT on the cronjobs.

dsessler7 commented 1 year ago

@Kajot-dev, I will create a story in our dev backlog for having more control over the propagation of labels/annotations in the metadata sections... In the meantime, you might try manually annotating the Pods you want to back up (kubectl annotate ...), as I don't think an annotation added that way will be propagated.
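
A minimal sketch of that manual workaround, under stated assumptions: the namespace and volume name come from the manifest above, and the pod name harbor-postgresql-repo-host-0 is an assumption (the generateName prefix shown above plus StatefulSet ordinal 0). It also presumes the annotation has been removed from backups.pgbackrest.metadata, so the cronjob pods no longer receive it. The script only prints the command so it can be reviewed before it is run.

```shell
#!/bin/sh
# Values taken from the manifest in this thread.
NAMESPACE="harbor-prod"
ANNOTATION_KEY="backup.velero.io/backup-volumes"
VOLUME_NAME="repo1"

# ASSUMPTION: the repo host pod is the StatefulSet pod with ordinal 0.
# Check the real name with `kubectl get pods` before using this.
REPO_HOST_POD="harbor-postgresql-repo-host-0"

# Build the command as a string so it can be inspected first.
annotate_cmd="kubectl annotate pod ${REPO_HOST_POD} --namespace ${NAMESPACE} ${ANNOTATION_KEY}=${VOLUME_NAME} --overwrite"
echo "${annotate_cmd}"

# Uncomment to apply it to the cluster:
# ${annotate_cmd}
```

An annotation set directly on a Pod is not part of the StatefulSet's pod template, so it would need to be re-applied whenever the repo host pod is recreated.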