zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License

Experience/problems running Postgres Operator on GKE with Autopilot #1638

Open RossVertizan opened 3 years ago

RossVertizan commented 3 years ago

This is a logical follow-on to #274. Since that issue is closed and quite old now, I thought it made sense to record my experiences/problems in a new issue - I hope this is OK. Sorry this is so long.

Initial questions

Background

We are developing a cloud-based product that needs highly available PostgreSQL. I found the Postgres Operator and am blown away by what it can do for me. I am using Kubernetes on GKE, so I looked for appropriate guides to walk me through the setup process. I came across #274, which led me to Running PostgreSQL in Google Kubernetes Engine. This is a pretty good guide and I followed it, making appropriate changes as I went.

Installation process

I already had Google cloud utils installed on my machine and a working Kubernetes cluster, so I skipped these steps.

Creating a GCS bucket

Easy enough; I created a GCS bucket:

gsutil mb -b on -c archive -l europe-west1 -p <my-project> gs://<my-bucket-name>

Namespace

Initially I wanted to put all this in a separate namespace, so I added namespace definitions to various YAML files. Later (after I had problems and backed the namespace changes out) I found the correct way to install in a separate namespace.

Create the Service Account

I applied the standard operator-service-account-rbac.yaml file:

kubectl apply -f operator-service-account-rbac.yaml

Create the pod configmap

I modified the default config map as described in the guide, so that my data section looked like this:

data:
  # Enable to turn backups on.
  USE_WALG_BACKUP: "true"
  USE_WALG_RESTORE: "true"
  WALG_GS_PREFIX: gs://<my-bucket-name>/spilo/$(SCOPE)
  CLONE_WALG_GS_PREFIX: gs://<my-bucket-name>/spilo/$(CLONE_SCOPE)
  WALE_BACKUP_THRESHOLD_PERCENTAGE: "100"

Question 1: In the guide it says that "The variables SCOPE and CLONE_SCOPE will be defined in the containers where appropriate below (by spilo)", but they are not mentioned again. When I finally get a running cluster pod, I see it has a SCOPE env var, but there is no CLONE_SCOPE. How do I create this env var?

I created the pod config map:

kubectl apply -f postgres-pod-config.yaml

Completing the postgres-operator and CRD configuration

I changed the postgres-operator.yaml to use the CRD OperatorConfiguration object, as described in the guide.
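For reference, the change boils down to pointing the operator at the configuration object instead of the config map; a rough sketch of the env section of postgres-operator.yaml (the env var name is taken from the operator docs, and the value matches the default configuration file applied later):

env:
  # used instead of the ConfigMap reference when the CRD-based configuration is enabled
  - name: POSTGRES_OPERATOR_CONFIGURATION_OBJECT
    value: postgresql-operator-default-configuration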

I also edited the postgresql-operator-default-configuration.yaml file.

Create the postgres operator

Finally I created the postgres operator:

kubectl apply -f postgres-operator.yaml
Warning: Autopilot increased resource requests for Deployment postgres-op/postgres-operator to meet requirements. See http://g.co/gke/autopilot-resources.
deployment.apps/postgres-operator created

The cluster I am using runs Google's new Autopilot feature, which looks after a number of cluster management tasks such as scaling and security. It imposes minimum CPU, memory and ephemeral-storage requirements on each pod it launches, so if the resource requests defined in a manifest are below those minimums it overrides them, hence the warning shown above.
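One way to stop Autopilot rewriting the operator Deployment would be to set its requests/limits at or above those minimums, so there is nothing to adjust; a sketch of the resources block in postgres-operator.yaml, using the values Autopilot reports later in this issue (treat them as assumptions):

resources:
  # Autopilot's adjustments elsewhere in this issue settle on 250m CPU / 512Mi / 1Gi ephemeral storage
  requests:
    cpu: 250m
    memory: 512Mi
    ephemeral-storage: 1Gi
  limits:
    cpu: 250m
    memory: 512Mi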

The guide suggests waiting before loading the default configuration file. I noticed that the postgres-operator was not running, so I looked at the logs for the postgres operator:

kubectl logs postgres-operator-76895fc894-j9tlq

I noticed there was an error:

time="2021-10-06T09:53:59Z" level=error msg="could not create customResourceDefinition \"operatorconfigurations.acid.zalan.do\": CustomResourceDefinition.apiextensions.k8s.io \"operatorconfigurations.acid.zalan.do\" is invalid: spec.versions[0].schema.openAPIV3
Schema: Required value: schemas are required" pkg=controller

I spent a lot of time on this. To be certain I had not messed anything up with the namespaces, I backed all of the namespace modifications out, deleted the namespace and just used the default namespace.

I also loaded the CRD definition:

kubectl apply -f operatorconfiguration.crd.yaml

This had no effect on the error message shown above. I then realised that there was a second message, marked as fatal rather than error:

time="2021-10-06T09:54:35Z" level=fatal msg="unable to read operator configuration: could not get operator configuration object \"postgresql-operator-default-configuration\": operatorconfigurations.acid.zalan.do \"postgresql-operator-default-configuration\" not
 found" pkg=controller

So I then went ahead and created the default configuration object, as shown below.

Create the CRD

kubectl apply -f postgresql-operator-default-configuration.yaml

After doing this the postgres-operator started up and ran without crashing.

Launch a cluster

kubectl apply -f coverage-postgres-manifest.yaml

The cluster would not launch and I saw this error in the logs:

Warning  Create     46s   postgres-operator  could not create cluster: could not create statefulset: admission webhook "validation.gatekeeper.sh" denied the request: [denied by autogke-pod-limit-constraints] workload <acid-minimal-cluster> cpu requests <250> is lower than the Autopilot minimum required of <500m> for using pod anti affinity. Requesting user: <system:serviceaccount:default:postgres-operator> and groups: <["system:serviceaccounts", "system:serviceaccounts:default", "system:authenticated"]>

Because the cluster would not launch I ended up having to clean up the failed cluster. If you need to do this, you will need to clean up (a sketch of the commands follows the list):

  • services - kubectl get services
  • endpoints - kubectl get endpoints
  • poddisruptionbudgets - kubectl get poddisruptionbudgets
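A sketch of the cleanup, assuming the example cluster name from the error above and the operator's default cluster-name label:

# remove the postgresql resource itself first, if it is still there
kubectl delete postgresql acid-minimal-cluster
# then anything left behind, selected by label
kubectl delete service,endpoints,poddisruptionbudget -l cluster-name=acid-minimal-cluster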

To fix the problem I went back to postgresql-operator-default-configuration.yaml and set:

enable_pod_antiaffinity: false

After this change I could get a cluster to start.
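(The other way to satisfy the Autopilot webhook would presumably be to raise the cluster's CPU request to the 500m minimum it mentions, rather than disabling anti-affinity; a sketch of the resources section of the cluster manifest, with values that are assumptions:)

spec:
  resources:
    requests:
      cpu: 500m       # Autopilot's minimum when pod anti-affinity is used
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 512Mi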

Getting the cluster to work with my app

The next step was to read the passwords for the users I had created from the generated secrets. That was simple enough, but I still couldn't connect to the database, because by default the database only accepts SSL connections, as defined in the pg_hba.conf file.
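(Reading a user's password out of its generated secret is a one-liner; a sketch, with the user and cluster names as placeholders - the naming pattern matches the secrets that show up in the operator logs later in this issue:)

kubectl get secret <username>.<cluster-name>.credentials.postgresql.acid.zalan.do -o 'jsonpath={.data.password}' | base64 -d

To see how pg_hba.conf was configured I exec'd into the pod: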

kubectl exec -it coverage-cluster-0 -- su postgres
cd /home/postgres/pgdata/pgroot/data
more pg_hba.conf

# Do not edit this file manually!
# It will be overwritten by Patroni!
local   all             all                                   trust
hostssl all             +zalandos    127.0.0.1/32       pam
host    all             all                127.0.0.1/32       md5
hostssl all             +zalandos    ::1/128            pam
host    all             all                ::1/128            md5
local   replication     standby                    trust
hostssl replication     standby all                md5
hostnossl all           all                all                reject
hostssl all             +zalandos    all                pam
hostssl all             all                all                md5

Postgres Documentation for pg_hba.conf

There are a number of issues already covering this topic (#482, #868 and #330); I followed the instructions from the last of these to disable SSL.

At this point I was able to connect to the database and things appeared to be working. However, I noticed some remaining issues.

Remaining issues

After all of the above, I essentially have a working database, but there are a couple of issues with it. What has gone before this point is really background, but there may be things that are worth commenting on or that I have done wrong. What follows are real issues that I would like assistance with. I am happy to file separate issues for each of them if that works better.

1. No data in the backup bucket

Despite what the guide says, I am not seeing anything saved in the bucket that I created for backups, even though, as far as I am aware, I followed all of the instructions.

The bucket is defined in the file postgres-pod-config.yaml

data:
  # Enable to turn backups on.
  USE_WALG_BACKUP: "true"
  USE_WALG_RESTORE: "true"
  WALG_GS_PREFIX: gs://vitaqitc-postgres-operator-backup/spilo/$(SCOPE)
  CLONE_WALG_GS_PREFIX: gs://vitaqitc-postgres-operator-backup/spilo/$(CLONE_SCOPE)
  WALE_BACKUP_THRESHOLD_PERCENTAGE: "100"

As mentioned previously, I see $SCOPE get defined in a cluster pod, but not $CLONE_SCOPE. How do I enable/create it?

Do I need to complete these sections in postgresql-operator-default-configuration.yaml? This is not mentioned in the guide, but it would seem to be a logical requirement. Where do I get the GCP credentials from and where do I store them? I'm really lost with this.

aws_or_gcp:
    # additional_secret_mount: "some-secret-name"
    # additional_secret_mount_path: "/some/dir"
    aws_region: eu-central-1
    enable_ebs_gp3_migration: false
    # enable_ebs_gp3_migration_max_size: 1000
    # gcp_credentials: ""
    # kube_iam_role: ""
    # log_s3_bucket: ""
    # wal_az_storage_account: ""
    # wal_gs_bucket: ""
    # wal_s3_bucket: ""

There is also this section:

logical_backup:
    logical_backup_docker_image: "registry.opensource.zalan.do/acid/logical-backup:v1.7.0"
    # logical_backup_google_application_credentials: ""
    logical_backup_job_prefix: "logical-backup-"
    logical_backup_provider: "gcs"
    # logical_backup_s3_access_key_id: ""
    logical_backup_s3_bucket: "my-bucket-url"
    # logical_backup_s3_endpoint: ""
    # logical_backup_s3_region: ""
    # logical_backup_s3_secret_access_key: ""
    logical_backup_s3_sse: "AES256"
    logical_backup_schedule: "30 00 * * *"

What is logical backup? Do I need to complete this? I'm sure this is simple enough, but I could use some clear instructions about what to do. Thank you.

2. Cluster pod restarts with every sync

This is a topic with many related issues: #1449, #1447, #1397, #1478, #1482 and #1485 (and maybe more). This is the output I see in the logs of the postgres-operator pod:

time="2021-10-07T16:03:24Z" level=info msg="SYNC event has been queued" cluster-name=default/vitaq-coverage-cluster pkg=controller worker=1
time="2021-10-07T16:03:24Z" level=info msg="there are 1 clusters running" pkg=controller
time="2021-10-07T16:03:24Z" level=info msg="syncing of the cluster started" cluster-name=default/vitaq-coverage-cluster pkg=controller worker=1
time="2021-10-07T16:03:24Z" level=debug msg="fetching possible additional team members for team \"vitaq\"" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="team API is disabled" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=info msg="syncing secrets" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="secret default/zalando.vitaq-coverage-cluster.credentials.postgresql.acid.zalan.do already exists, fetching its password" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="secret default/dbuser.vitaq-coverage-cluster.credentials.postgresql.acid.zalan.do already exists, fetching its password" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="secret default/standby.vitaq-coverage-cluster.credentials.postgresql.acid.zalan.do already exists, fetching its password" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="secret default/postgres.vitaq-coverage-cluster.credentials.postgresql.acid.zalan.do already exists, fetching its password" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="syncing master service" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=info msg="master service default/vitaq-coverage-cluster is not in the desired state and needs to be updated" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-      protocol: TCP," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-  clusterIP: 10.13.128.90," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-  clusterIPs: [" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-    10.13.128.90" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-  ]," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-  type: ClusterIP," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="-  sessionAffinity: None" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="+  type: ClusterIP" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=info msg="reason: new service's annotations does not match the current one: Removed 'cloud.google.com/neg'." cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=info msg="master service \"default/vitaq-coverage-cluster\" is in the desired state now" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=debug msg="syncing replica service" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="No load balancer created for the replica service" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=info msg="replica service default/vitaq-coverage-cluster-repl is not in the desired state and needs to be updated" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-      protocol: TCP," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-  clusterIP: 10.13.129.209," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-  clusterIPs: [" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-    10.13.129.209" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-  ]," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-  type: ClusterIP," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="-  sessionAffinity: None" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="+  type: ClusterIP" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=info msg="reason: new service's annotations does not match the current one: Removed 'cloud.google.com/neg'." cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=info msg="replica service \"default/vitaq-coverage-cluster-repl\" is in the desired state now" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="syncing volumes using \"pvc\" storage resize mode" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=info msg="volume claims do not require changes" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:25Z" level=debug msg="syncing statefulsets" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="mark rolling update annotation for vitaq-coverage-cluster-0: reason pod changes" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="statefulset default/vitaq-coverage-cluster is not in the desired state and needs to be updated" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              cpu: 250m," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              ephemeral-storage: 1Gi," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              memory: 512Mi" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+              cpu: 1," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+              memory: 500Mi" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              cpu: 250m," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              ephemeral-storage: 1Gi," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              memory: 512Mi" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+              cpu: 100m," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+              memory: 100Mi" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-          terminationMessagePath: /dev/termination-log," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-          terminationMessagePolicy: File," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-            capabilities: {" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              drop: [" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-                NET_RAW" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-              ]" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-            }," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      restartPolicy: Always," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      dnsPolicy: ClusterFirst," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      serviceAccount: postgres-operator," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      securityContext: {" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-        seccompProfile: {" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-          type: RuntimeDefault" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-        }" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      }," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      schedulerName: default-scheduler" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+      securityContext: {}" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      kind: PersistentVolumeClaim," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      apiVersion: v1," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      status: {" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-        phase: Pending" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-      }" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+      status: {}" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-  }," cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-  revisionHistoryLimit: 10" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="+  }" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="metadata.annotation are different" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-{" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-  autopilot.gke.io/resource-adjustment: {\\input\\:{\\containers\\:[{\\limits\\:{\\cpu\\:\\1\\,\\memory\\:\\500Mi\\},\\requests\\:{\\cpu\\:\\100m\\,\\memory\\:\\100Mi\\},\\name\\:\\postgres\\}]},\\output\\:{\\contai
ners\\:[{\\limits\\:{\\cpu\\:\\250m\\,\\ephemeral-storage\\:\\1Gi\\,\\memory\\:\\512Mi\\},\\requests\\:{\\cpu\\:\\250m\\,\\ephemeral-storage\\:\\1Gi\\,\\memory\\:\\512Mi\\},\\name\\:\\postgres\\}]},\\modified\\:true}" cluster-name=default/vitaq-coverage-cluster
 pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="-}" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg=+null cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset's annotations do not match the current one" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset containers's postgres (index 0) resources do not match the current ones" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset containers's postgres (index 0) security context does not match the current one" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset's pod template security context in spec does not match the current one" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="replacing statefulset" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=debug msg="waiting for the statefulset to be deleted" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
W1007 16:03:29.488135       1 warnings.go:70] Autopilot increased resource requests for StatefulSet default/vitaq-coverage-cluster to meet requirements. See http://g.co/gke/autopilot-resources.
time="2021-10-07T16:03:29Z" level=debug msg="making GET http request: http://10.13.1.245:8008/config" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="patching Postgres config via Patroni API on pod default/vitaq-coverage-cluster-0 with following options: {\"pg_hba\":[\"local     all  all  trust\",\"host      all  all  0.0.0.0/0   md5\",\"host      all  all  ::1/12
8     md5\"]}" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="making PATCH http request: http://10.13.1.245:8008/config" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="restarting Postgres server within pods" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="making GET http request: http://10.13.1.245:8008/patroni" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="Postgres server successfuly restarted in master pod default/vitaq-coverage-cluster-0" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="performing rolling update" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=info msg="there are 1 pods in the cluster to recreate" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="name=vitaq-coverage-cluster-0 phase=Running ip=10.13.1.245" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=warning msg="cannot perform switch over before re-creating the pod: no replicas" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=info msg="recreating old master pod \"default/vitaq-coverage-cluster-0\"" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:29Z" level=debug msg="subscribing to pod \"default/vitaq-coverage-cluster-0\"" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
E1007 16:03:56.938836       1 reflector.go:138] pkg/controller/controller.go:477: Failed to watch *v1.PostgresTeam: failed to list *v1.PostgresTeam: the server could not find the requested resource (get postgresteams.acid.zalan.do)
time="2021-10-07T16:04:33Z" level=info msg="pod \"default/vitaq-coverage-cluster-0\" has been recreated" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:33Z" level=debug msg="unsubscribing from pod \"default/vitaq-coverage-cluster-0\" events" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:33Z" level=debug msg="syncing pod disruption budgets" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=debug msg="syncing roles" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=debug msg="closing database connection" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=debug msg="syncing databases" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=debug msg="closing database connection" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=debug msg="syncing prepared databases with schemas" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=debug msg="syncing pooler is not required" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:04:34Z" level=info msg="cluster has been synced" cluster-name=default/vitaq-coverage-cluster pkg=controller worker=1

I believe this is happening because the GKE autopilot is changing the settings on the instance which then leads to this:

time="2021-10-07T16:03:24Z" level=debug msg="syncing master service" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:24Z" level=info msg="master service default/vitaq-coverage-cluster is not in the desired state and needs to be updated" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1

The way to get around this would be to set everything to the same values Autopilot will set, so that there is a match. I have implemented this in the operatorconfiguration.crd.yaml file and re-applied it (is this all I need to do to have it re-read?).
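For the CPU/memory part of the diff, the matching values can be set in the postgres_pod_resources section of the operator configuration; a sketch, mirroring what Autopilot writes back in the diff above (note there is no option for ephemeral-storage, which is the gap mentioned below):

postgres_pod_resources:
    default_cpu_request: 250m
    default_memory_request: 512Mi
    default_cpu_limit: 250m
    default_memory_limit: 512Mi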

The problem with this is that autopilot is setting a value for ephemeral storage, but there is no way for me to set this in the operatorconfiguration.crd.yaml file. Any thoughts?

3. Failed to watch *v1.PostgresTeam

I see many lines in the log of the postgres-operator pod (kubectl logs postgres-operator-76895fc894-j9tlq) that say:

E1007 20:17:16.287174       1 reflector.go:138] pkg/controller/controller.go:477: Failed to watch *v1.PostgresTeam: failed to list *v1.PostgresTeam: the server could not find the requested resource (get postgresteams.acid.zalan.do)

What do I need to do to fix this? Note that I changed the team name from acid - do I need to define a team somewhere? Do I need to load the PostgresTeam CRD?

4. Postgres is down

While I've been writing this, I've noticed that when I check the logs of the cluster instance (kubectl logs vitaq-coverage-cluster-0) I get lots of blocks of output like this:

2021-10-07 20:27:14,189 INFO: Lock owner: None; I am vitaq-coverage-cluster-0
2021-10-07 20:27:14,190 INFO: starting as a secondary
2021-10-07 20:27:15 UTC [13268]: [1-1] 615f5823.33d4 0     FATAL:  could not write lock file "postmaster.pid": No space left on device
2021-10-07 20:27:15,297 INFO: postmaster pid=13268
/var/run/postgresql:5432 - no response
2021-10-07 20:27:15,391 WARNING: Postgresql is not running.
2021-10-07 20:27:15,391 INFO: Lock owner: None; I am vitaq-coverage-cluster-0
2021-10-07 20:27:15,397 INFO: pg_controldata:
  pg_control version number: 1300
  Catalog version number: 202007201
  Database system identifier: 7016234416326516818
  Database cluster state: in archive recovery
  pg_control last modified: Thu Oct  7 19:07:05 2021
  Latest checkpoint location: 0/260001B8
  Latest checkpoint's REDO location: 0/26000180
  Latest checkpoint's REDO WAL file: 000000140000000000000026
  Latest checkpoint's TimeLineID: 20
  Latest checkpoint's PrevTimeLineID: 20
  Latest checkpoint's full_page_writes: on
  Latest checkpoint's NextXID: 0:8719
  Latest checkpoint's NextOID: 17775
  Latest checkpoint's NextMultiXactId: 1
  Latest checkpoint's NextMultiOffset: 0
  Latest checkpoint's oldestXID: 479
  Latest checkpoint's oldestXID's DB: 1
  Latest checkpoint's oldestActiveXID: 8719
  Latest checkpoint's oldestMultiXid: 1
  Latest checkpoint's oldestMulti's DB: 1
  Latest checkpoint's oldestCommitTsXid: 0
  Latest checkpoint's newestCommitTsXid: 0
  Time of latest checkpoint: Thu Oct  7 17:34:33 2021
  Fake LSN counter for unlogged rels: 0/3E8
  Minimum recovery ending location: 0/27000000
  Min recovery ending loc's timeline: 20
  Backup start location: 0/0
  Backup end location: 0/0
  End-of-backup record required: no
  wal_level setting: replica
  wal_log_hints setting: on
  max_connections setting: 100
  max_worker_processes setting: 8
  max_wal_senders setting: 10
  max_prepared_xacts setting: 0
  max_locks_per_xact setting: 64
  track_commit_timestamp setting: off
  Maximum data alignment: 8
  Database block size: 8192
  Blocks per segment of large relation: 131072
  WAL block size: 8192
  Bytes per WAL segment: 16777216
  Maximum length of identifiers: 64
  Maximum columns in an index: 32
  Maximum size of a TOAST chunk: 1996
  Size of a large-object chunk: 2048
  Date/time type storage: 64-bit integers
  Float8 argument passing: by value
  Data page checksum version: 0
  Mock authentication nonce: c89627215aa78da3402172a689e5250d0e9532128656735d9502a278033e0640

What has happened? Which device is full? Is this related to some of the earlier issues?

Any help would be great.

RossVertizan commented 3 years ago

Update

I identified four issues above:

  1. No data in the backup bucket
  2. Cluster pod restarts with every sync
  3. Failed to watch *v1.PostgresTeam
  4. Postgres is down

Since number 4 seemed to be the most serious I decided to look at that first. From the logs above the obvious cause is:

FATAL:  could not write lock file "postmaster.pid": No space left on device

So, let's find where the disk space is full.

kubectl exec --stdin --tty vitaq-coverage-cluster-0 -- /bin/bash
>df -h
root@vitaq-coverage-cluster-0:/home/postgres# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          95G   18G   77G  19% /
tmpfs            64M     0   64M   0% /dev
tmpfs           7.9G     0  7.9G   0% /sys/fs/cgroup
/dev/sda1        95G   18G   77G  19% /etc/hosts
tmpfs           7.9G     0  7.9G   0% /dev/shm
/dev/sdc        976M  960M     0 100% /home/postgres/pgdata
tmpfs           7.9G   12K  7.9G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           7.9G     0  7.9G   0% /proc/acpi
tmpfs           7.9G     0  7.9G   0% /proc/scsi
tmpfs           7.9G     0  7.9G   0% /sys/firmware

>cd /home/postgres/pgdata/pgroot/data
>du -k
4       ./pg_commit_ts
4       ./pg_dynshmem
4       ./pg_stat_tmp
4       ./pg_logical/mappings
4       ./pg_logical/snapshots
16      ./pg_logical
4       ./pg_serial
12      ./pg_xact
4       ./pg_wal/archive_status
933984  ./pg_wal
12      ./pg_multixact/offsets
12      ./pg_multixact/members
28      ./pg_multixact
4       ./pg_notify
4       ./pg_twophase
9380    ./base/16839
7732    ./base/13384
8836    ./base/1
10464   ./base/13385
36416   ./base
48      ./pg_subtrans
576     ./global
4       ./pg_snapshots
4       ./pg_tblspc
16      ./pg_stat
4       ./pg_replslot
971240  .

It is obvious from this, and from looking in the pg_wal directory, that the volume has filled up because the WAL archiver can't ship the WAL segments/backups to Google Cloud Storage, so I need to get that fixed to solve this problem (I hope).
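(A quick way to confirm that stalled archiving, rather than ordinary data growth, is the culprit is to look at the archive status directory from inside the pod; a sketch:)

ls /home/postgres/pgdata/pgroot/data/pg_wal/archive_status | head
# a long list of *.ready files means the segments were written but never successfully archived, so PostgreSQL cannot recycle them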

To work this through, I went to the administrator section of the Zalando Postgres operator docs and then to the section "Google Cloud Platform Setup".

The first thing is to create a GCP service account in the Google Cloud Platform console: navigate to IAM and Admin -> Service Accounts, create the service account by following the steps and give it permissions on Google Cloud Storage. When it is created, create a key and download the JSON file.
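The console steps above can also be scripted; a rough CLI equivalent (the service-account name is made up, the project/bucket placeholders are the ones used earlier, and I simply granted Storage Object Admin on the bucket - adjust to taste):

gcloud iam service-accounts create postgres-backup --project <my-project>
gsutil iam ch serviceAccount:postgres-backup@<my-project>.iam.gserviceaccount.com:roles/storage.objectAdmin gs://<my-bucket-name>
gcloud iam service-accounts keys create key.json --iam-account postgres-backup@<my-project>.iam.gserviceaccount.com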

Now create the secret as described in the documentation, my file ended up looking like this:

apiVersion: v1
kind: Secret
metadata:
  name: psql-wal-creds
  namespace: default
type: Opaque
stringData:
  key.json: |-
    { ... }

Where I have left the { ... } I pasted the entire contents of the file downloaded from Google.

Now update the aws_or_gcp section in the postgresql-operator-default-configuration.yaml. Mine ended up looking like this:

aws_or_gcp:
    additional_secret_mount: "psql-wal-creds"
    additional_secret_mount_path: "/var/secrets/google"
    gcp_credentials: "/var/secrets/google/key.json"
    wal_gs_bucket: "postgres-operator-backup"
    # aws_region: eu-central-1
    # enable_ebs_gp3_migration: false
    # enable_ebs_gp3_migration_max_size: 1000
    # kube_iam_role: ""
    # log_s3_bucket: ""
    # wal_az_storage_account: ""
    # wal_s3_bucket: ""

I also modified the postgres-pod-config.yaml file to add the CLONE_USE_WALG_RESTORE key, which was not mentioned in the guide I followed initially. The data section of my file now looks like this:

data:
  # Enable to turn backups on.
  USE_WALG_BACKUP: "true"
  USE_WALG_RESTORE: "true"
  CLONE_USE_WALG_RESTORE: "true"
  WALG_GS_PREFIX: gs://postgres-operator-backup/spilo/$(SCOPE)
  CLONE_WALG_GS_PREFIX: gs://postgres-operator-backup/spilo/$(CLONE_SCOPE)
  WALE_BACKUP_THRESHOLD_PERCENTAGE: "100"

Now it's time to deploy all of these changes:

kubectl apply -f zalando-postgres-operator\psql-wal-creds.yaml
kubectl describe secret psql-wal-creds
kubectl apply -f zalando-postgres-operator\postgresql-operator-default-configuration.yaml
kubectl apply -f zalando-postgres-operator\postgres-pod-config.yaml
kubectl rollout restart deployment postgres-operator
kubectl rollout status -w deployment postgres-operator

When I had done this I could see data being written to the GCS bucket (navigate to Google Cloud Platform -> Cloud Storage -> Browser and select the bucket).

So that's issues 1 and 4 (hopefully) dealt with.

RossVertizan commented 3 years ago

Update

I noticed this output in the logs of the cluster instance pod:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/patroni/postgresql/__init__.py", line 821, in get_replica_timeline
    with self.get_replication_connection_cursor(**self.config.local_replication_address) as cur:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.6/dist-packages/patroni/postgresql/__init__.py", line 816, in get_replication_connection_cursor
    with get_connection_cursor(**conn_kwargs) as cur:
  File "/usr/lib/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.6/dist-packages/patroni/postgresql/connection.py", line 43, in get_connection_cursor
    conn = psycopg2.connect(**kwargs)
  File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 127, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL:  no pg_hba.conf entry for replication connection from host "::1", user "standby", SSL on
FATAL:  no pg_hba.conf entry for replication connection from host "::1", user "standby", SSL off

This clearly relates to the permissions that I previously changed in the manifest.yaml. To fix this I combined the values I had with the default values to cover all bases (though I am not certain why this is needed, as I thought the settings I had would cover all cases). Now my manifest.yaml contains this:

patroni:
    pg_hba:
      - local     all             all                                   trust
      - host      all             all             0.0.0.0/0             md5
      - host      all             all             ::1/128               md5
      - hostssl   all             +zalandos       127.0.0.1/32          pam
      - host      all             all             127.0.0.1/32          md5
      - hostssl   all             +zalandos       ::1/128               pam
      - local     replication     standby                               trust
      - hostssl   replication     standby         all                   md5
      - hostssl   all             +zalandos       all                   pam
      - hostssl   all             all             all                   md5

After making this change I have not seen these errors again.

RossVertizan commented 3 years ago

Another update

One of the four issues I initially raised (3. Failed to watch *v1.PostgresTeam) was that I was seeing a lot of errors that say:

E1007 20:17:16.287174       1 reflector.go:138] pkg/controller/controller.go:477: Failed to watch *v1.PostgresTeam: failed to list *v1.PostgresTeam: the server could not find the requested resource (get postgresteams.acid.zalan.do)

I simply applied the default postgresteam.crd.yaml file and the messages went away.

kubectl apply -f postgresteam.crd.yaml

So this issue is fixed. The CRD defines additional superuser teams, additional teams and additional members, which are documented here.
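For reference, a PostgresTeam resource would look roughly like this (the team and user names here are made up; you only need such a resource if you actually want to map extra teams or members):

apiVersion: "acid.zalan.do/v1"
kind: PostgresTeam
metadata:
  name: custom-team-membership
spec:
  additionalSuperuserTeams:
    vitaq:
      - admin_team
  additionalTeams:
    vitaq: []
  additionalMembers:
    vitaq:
      - extra_user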

RossVertizan commented 3 years ago

Yet another update

So, the only outstanding issue is that the cluster pod re-starts every 30 minutes.

Looking at the log file for the postgres-operator pod I was able to see that the master service, the replica service and the statefulset were all getting updated every sync interval (30 minutes). For the master and the replica there was a line stating the reason:

time="2021-10-07T16:03:25Z" level=info msg="reason: new service's annotations does not match the current one: Removed 'cloud.google.com/neg'." cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1

This means that the proposed new master/replica service is missing the cloud.google.com/neg annotation. By describing the running master service (commands sketched after the config below) I was able to see the value of that annotation. I was then able to add this to the custom_service_annotations section of the postgresql-operator-default-configuration.yaml file, which now looks like this:

load_balancer:
    custom_service_annotations:
      cloud.google.com/neg: '{"ingress":true}'
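A sketch of pulling the annotation value off the running service, for reference:

kubectl describe service vitaq-coverage-cluster
# or just the annotations:
kubectl get service vitaq-coverage-cluster -o 'jsonpath={.metadata.annotations}'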

My understanding is that the postgres-operator applies these custom annotations to the services it creates, so the new and existing services now match and the master/replica services will no longer get updated.

Applying this change confirmed this understanding, and I no longer see the master/replica services get updated.

However

The statefulset is still getting updated. When it updates it causes its pods to restart, so in fact I am still seeing the master/replica pods get re-created at every sync.

The statefulset also shows reasons for it updating:

time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset's annotations do not match the current one" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset containers's postgres (index 0) resources do not match the current ones" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset containers's postgres (index 0) security context does not match the current one" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1
time="2021-10-07T16:03:26Z" level=info msg="reason: new statefulset's pod template security context in spec does not match the current one" cluster-name=default/vitaq-coverage-cluster pkg=cluster worker=1

So far I have been unable to find a way around this issue. Any comments/suggestions gratefully received.

RossVertizan commented 3 years ago

Update

I have now fixed this last issue and have postgres-operator running on a GKE Autopilot cluster.

The fix for the statefulset updates was very messy/hacky, and I have opened another issue (#1646) for it to be looked into, so that a cleaner/more generic solution for modified settings can be found.

The only error I see now in the postgres-operator logs is:

time="2021-10-13T11:59:01Z" level=error msg="could not create customResourceDefinition \"operatorconfigurations.acid.zalan.do\": CustomResourceDefinition.apiextensions.k8s.io \"operatorconfigurations.acid.zalan.do\" is invalid: spec.versions[0].schema.openAPIV3Schema: Required value: schemas are required" pkg=controller

which is an error I saw right at the start. It doesn't appear to harm anything, but it would be good to understand and clear up.

This is now the only remaining issue for me and the only reason I am leaving this issue open.

Any comments are gratefully received.

mathe-matician commented 2 years ago

@RossVertizan Did you ever run into Config not found. when trying to run on GKE? I continually get this error from the database pods, but only when I specify gcp options in the aws_or_gcp configuration:

aws_or_gcp:
  additional_secret_mount: "credentials"
  additional_secret_mount_path: "/home/postgres/"
  wal_gs_bucket: "my-bucket"
  gcp_credentials: "/home/postgres/credentials.json"

Because the pods fail with this Config not found. error, Patroni is never initialized.

Without changing any other configuration, if I comment this section out or use AWS as the cloud provider, the cluster has no problem spinning up.

RossVertizan commented 2 years ago

@mathe-matician as far as I recall I never saw that problem. I can't find any reference to it in my notes, and I did this some time ago, so I'm a little rusty on the exact details. However, I think the values you have specified may not be quite right - please forgive me if they are and I just don't understand your setup.

In my case I have a file called psql-wal-creds.yaml that ended up looking like this:

apiVersion: v1
kind: Secret
metadata:
  name: psql-wal-creds
  namespace: default
type: Opaque
stringData:
  key.json: |-
    { ... }

Note that the stringData defines key.json. When I apply this on the K8s cluster it generates a secret. To be honest, I don't know why it gets mounted at /var/secrets/google, but it does (the default location for secrets?). What actually gets mounted is a file named after the stringData key. So, if I log into one of my cluster pods:

kubectl exec --stdin --tty vitaq-coverage-cluster-0 -- /bin/bash

and then:

cd /var/secrets/google
ls -al
more key.json

I can see a file called key.json (name of the stringData) and it contains the value of the stringData. This is then reflected in the configuration file as:

aws_or_gcp:
    additional_secret_mount: "psql-wal-creds"
    additional_secret_mount_path: "/var/secrets/google"
    gcp_credentials: "/var/secrets/google/key.json"
    wal_gs_bucket: "postgres-operator-backup"

Where:

  • additional_secret_mount - the name of the secret, from psql-wal-creds.yaml -> metadata -> name
  • additional_secret_mount_path - the magic mount path (I don't know where this is defined)
  • gcp_credentials - the full path to the file with the credentials, where key.json comes from the psql-wal-creds.yaml stringData section

So, applying this to your configuration:

aws_or_gcp:
  additional_secret_mount: "credentials"
  additional_secret_mount_path: "/home/postgres/"
  wal_gs_bucket: "my-bucket"
  gcp_credentials: "/home/postgres/credentials.json"

This would mean that your secret is called credentials, it gets mounted at /home/postgres/, and it must contain a stringData key called credentials.json so that the file /home/postgres/credentials.json actually exists inside the pod - worth checking from inside one of the cluster pods.

I hope that the above is a help to you and that walking through the settings allows you to identify an inconsistency. If not and I'm teaching you to suck eggs, I do apologise. Please let me know how you get on and if I can help further, please don't hesitate to ask.

mathe-matician commented 2 years ago

@RossVertizan I really appreciate the quick response. Walking through the steps was a helpful sanity check for me, so thank you. I did have the aws_or_gcp section set up correctly based on your definitions.

There are two differences in my setup, though:

  1. I base64 encoded the value of the json file in my Secret
  2. And as you point out, my mount path is different.

The solution was changing the mount path to /var/secrets/google:

aws_or_gcp:
  additional_secret_mount: "credentials"
  additional_secret_mount_path: "/var/secrets/google"
  wal_gs_bucket: "my-bucket"
  gcp_credentials: "/var/secrets/google/credentials.json"

RossVertizan commented 2 years ago

@mathe-matician excellent. Good job. Thanks for letting me know.

bjolivot commented 1 year ago

Not sure why, but I had to add GOOGLE_APPLICATION_CREDENTIALS in the pod-env-override.yaml to make it work. Thanks @RossVertizan for your post, it helped me a lot :)

apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-env-overrides
  namespace: mynamespace
data:
  # Any env variable used by spilo can be added
  USE_WALG_BACKUP: "true"
  USE_WALG_RESTORE: "true"
  CLONE_USE_WALG_RESTORE: "true"
  GOOGLE_APPLICATION_CREDENTIALS: "/var/secrets/google/key.json"

chrisalavoinehmm commented 1 year ago

@RossVertizan - great work here. I've been following your notes, which have proved very useful, so thanks for that. I've managed to get postgres-operator up and running along with the UI. I'm using GKE with Autopilot on a brand new cluster; I've created a separate namespace and have everything running in there. I'm hitting a roadblock when I try to create a Postgres cluster (either with kubectl and YAML or via the UI). It always gets stuck on StatefulSet pending with the following in the UI logs:

--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.9/logging/handlers.py", line 978, in emit
    self.socket.send(msg)
  File "/usr/lib/python3.9/site-packages/gevent/_socketcommon.py", line 722, in send
    return self._sock.send(data, flags)
  File "/usr/lib/python3.9/site-packages/gevent/_socket3.py", line 55, in _dummy
    raise OSError(EBADF, 'Bad file descriptor')
OSError: [Errno 9] Bad file descriptor

Have you come across this before? I've tried re-creating with the default namespace and I get the same error. I've also tried an earlier version of the docker image (the one you quoted in your tests) to no avail.

edit - removed most of those logs as the default settings on the manifest seem to at least get things running - I had fiddled too much with things before creating mine, thus the Python errors. Apologies for the spam.

chrisalavoinehmm commented 1 year ago

I did find a similar issue here: #2302. The problem seems to be with 1.10; however, I can't get it to work with any version. I'm guessing I'm doing something wrong in the setup steps. I'll retrace and retry.

chrisalavoinehmm commented 1 year ago

Just to update: I have my PO setup working (up to a point). I can create clusters using either the UI or manually with a manifest. I still can't get my WAL GCS buckets to work, and I'm working through the Autopilot config to stop the cluster restarts. This has been an invaluable resource though, so thank you.