zalando / postgres-operator

Postgres operator creates and manages PostgreSQL clusters running in Kubernetes
https://postgres-operator.readthedocs.io/
MIT License
4.29k stars 974 forks

Multi region setup on EKS - Question #2525

Open vipkam opened 8 months ago

vipkam commented 8 months ago

Please answer some short questions, which should help us understand your problem / question better.

grzegorzgniadek commented 8 months ago

Hi, after some time testing the Zalando operator in production on EKS in 3 regions, the best option is to create a standby cluster (https://github.com/zalando/postgres-operator/blob/master/docs/user.md#setting-up-a-standby-cluster) from an S3 bucket and then promote that cluster to a fully working one (https://github.com/zalando/postgres-operator/blob/master/docs/user.md#promote-the-standby). It is a very simple concept, and it works.

vipkam commented 7 months ago

@grzegorzgniadek Thanks for the suggestion. We successfully moved some of our customers to the Europe region, but the Postgres pods there do not have any cron jobs at all. Could you shed some light on this? Am I making a silly mistake somewhere in the configuration? I can see that it fails to install the crontab; here is the message it throws:

2024-02-28 16:37:43,996 - bootstrapping - INFO - Configuring crontab
"-":6: bad hour

Also, when I run the script manually for debugging, it installs only one cron job.

python3 configure_spilo.py crontab
2024-03-01 19:41:53,027 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-03-01 19:41:53,032 - bootstrapping - INFO - No meta-data available for this provider
2024-03-01 19:41:53,032 - bootstrapping - INFO - Looks like you are running unsupported
2024-03-01 19:41:53,050 - bootstrapping - INFO - Configuring crontab

crontab -l
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/15/bin
*/5 * * * * bash /scripts/renice.sh
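
The "bad hour" message above appears to mean that crontab rejected the hour field of line 6 of the file it was given, which would explain why no jobs were installed at all. One way to narrow this down is to check each schedule value from the configuration before it reaches the pod. The following is a minimal sketch in Python; the helper names are hypothetical (they are not part of Spilo or cron), and it only handles numeric fields, so names such as "mon" or shortcuts such as "@daily" are reported as invalid.

```python
# Hypothetical helper, not part of Spilo or cron: checks the five schedule
# fields of a crontab entry against their allowed numeric ranges.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),
]


def _valid_field(value, low, high):
    """Return True if one cron field (lists, ranges, steps, '*') is in range."""
    for part in value.split(","):
        base, slash, step = part.partition("/")
        # A step, if present, must be a positive integer.
        if slash and not (step.isdigit() and int(step) > 0):
            return False
        if base == "*":
            continue
        start, dash, end = base.partition("-")
        bounds = [start, end] if dash else [start]
        if not all(b.isdigit() and low <= int(b) <= high for b in bounds):
            return False
        if dash and int(start) > int(end):
            return False
    return True


def first_bad_field(schedule):
    """Return the name of the first invalid field, or None if all five pass."""
    parts = schedule.split()
    if len(parts) != 5:
        return "field count"
    for (name, low, high), value in zip(FIELDS, parts):
        if not _valid_field(value, low, high):
            return name
    return None
```

Calling first_bad_field on every schedule value in the operator configuration should point at the one with an out-of-range or missing hour, if that is indeed the cause.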

grzegorzgniadek commented 7 months ago

Can you paste your operator configuration CRD? That message is only an INFO-level log line, so it does not tell us much. What else do you configure in crontab? By default, with the parameter

BACKUP_SCHEDULE: "* * * * *"

the following entry should be added to your cluster pods:

crontab -l -u postgres
* * * * * envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
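
To check a pod against this expectation, the backup entry can be derived from the schedule and compared with the output of crontab -l -u postgres. This is a minimal sketch in Python; the function names are hypothetical, and the paths are simply the ones quoted in this comment, so they may differ in other setups.

```python
def backup_cron_line(schedule,
                     env_dir="/run/etc/wal-e.d/env",
                     pgdata="/home/postgres/pgdata/pgroot/data"):
    """Build the crontab entry expected for a given BACKUP_SCHEDULE value."""
    return f'{schedule} envdir "{env_dir}" /scripts/postgres_backup.sh "{pgdata}"'


def has_backup_job(crontab_text, schedule):
    """Return True if the crontab listing contains the expected backup entry."""
    expected = backup_cron_line(schedule)
    return any(line.strip() == expected for line in crontab_text.splitlines())
```

For example, has_backup_job(listing, "* * * * *") should be true for a pod whose crontab was installed with the schedule shown above, where listing is the text printed by crontab -l -u postgres.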

vipkam commented 7 months ago

operatorconfigurations.acid.zalan.do.txt

Please find attached.

In our US region we have the following crontab:

crontab -l
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/15/bin
*/5 * * * * bash /scripts/renice.sh
*/5 * * * * PGDATA="/home/postgres/pgdata/pgroot/data" SSL_CERTIFICATE_FILE="/tls/tls.crt" SSL_PRIVATE_KEY_FILE="/tls/tls.key" /scripts/test_reload_ssl.sh /run/tmp
30 2 * * * envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
0 1 * * * nice -n 5 envdir "/run/etc/log.d/env" /scripts/upload_pg_log_to_s3.py

grzegorzgniadek commented 7 months ago

I mean the operatorconfiguration.acid.zalan.do resource. But it seems fine, so create a standby cluster that uses the S3 bucket data as its source; the new cluster should then have the databases and tables from the source cluster (US). Remember to use the STANDBY_ prefix on the environment variables so that Spilo can find the backups and the WAL files to stream. It works the same way as in https://github.com/zalando/postgres-operator/blob/master/docs/administrator.md#restoring-physical-backups, but with a different prefix.
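
The prefixing step can be sketched as a small mapping. This is a minimal Python illustration, not operator code: the function name is hypothetical, the unprefixed variable names are assumed to be the prefixed ones without STANDBY_, and the values are placeholders.

```python
def with_standby_prefix(env, prefix="STANDBY_"):
    """Return a copy of env in which every key carries the standby prefix."""
    return {
        (key if key.startswith(prefix) else prefix + key): value
        for key, value in env.items()
    }


# Placeholder values only; real credentials belong in a Kubernetes Secret.
source_backup_env = {
    "AWS_REGION": "us-east-2",
    "USE_WALG_RESTORE": "true",
}
standby_env = with_standby_prefix(source_backup_env)
```

Keys that already start with STANDBY_ are left unchanged, so the function can be applied to a mixed set of variables without doubling the prefix.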

vipkam commented 7 months ago

Please see the attachment for our operatorconfiguration.acid.zalan.do resource. operatorconfigurations.acid.zalan.do.txt

grzegorzgniadek commented 7 months ago

That is still the wrong resource.

kubectl get operatorconfigurations.acid.zalan.do -A

You need to add

STANDBY_AWS_ACCESS_KEY_ID: AKI666666666666  # access key that can read the bucket holding the source cluster's backups
STANDBY_AWS_REGION: us-east-2  # region of the source cluster
STANDBY_AWS_S3_FORCE_PATH_STYLE: "true"
STANDBY_AWS_SECRET_ACCESS_KEY: SECRETACCESSKEYERE  # secret key that can read the bucket holding the source cluster's backups
STANDBY_USE_WALG_RESTORE: "true"
STANDBY_WALG_DISABLE_S3_SSE: "true"

to the Secret or ConfigMap that injects environment variables into the Postgres cluster pods, then add

apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: <new-postgres-cluster-name>
  namespace: <namespace>
spec:
  standby:
    s3_wal_path: "s3://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"

to the postgresqls.acid.zalan.do resource of your standby (the cluster in the other region). After that, remember to clone the secrets from the source database cluster and apply them on the standby, see https://github.com/zalando/postgres-operator/blob/master/docs/user.md#providing-credentials-of-source-cluster. Be aware that a standby cluster is read-only and only becomes a fully operational database cluster after it is promoted.
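
For reference, the s3_wal_path layout from the manifest above can be assembled from its parts. This is a minimal Python sketch; the function names and the example values are hypothetical, and the path pattern is taken as-is from the manifest in this comment.

```python
def standby_wal_path(bucket, source_cluster, uid, pg_version):
    """Build the s3_wal_path value following the layout shown in the manifest."""
    parts = {"bucket": bucket, "source_cluster": source_cluster,
             "uid": uid, "pg_version": pg_version}
    for name, value in parts.items():
        if not str(value).strip():
            raise ValueError(f"{name} must not be empty")
    return f"s3://{bucket}/spilo/{source_cluster}/{uid}/wal/{pg_version}"


def standby_manifest(name, namespace, s3_wal_path):
    """Return the standby section of a postgresql manifest as a plain dict."""
    return {
        "apiVersion": "acid.zalan.do/v1",
        "kind": "postgresql",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"standby": {"s3_wal_path": s3_wal_path}},
    }


# Example values are made up for illustration.
path = standby_wal_path("example-bucket", "source-db", "0000-example-uid", 15)
manifest = standby_manifest("standby-db", "default", path)
```

The dict covers only the fields quoted in this thread; a real manifest also needs the usual cluster settings such as team, volume size and number of instances.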