vipkam opened 8 months ago
Hi, after some time of testing Zalando on prod and EKS in 3 regions, the best option turned out to be creating a standby cluster (https://github.com/zalando/postgres-operator/blob/master/docs/user.md#setting-up-a-standby-cluster) from the S3 bucket and then promoting that cluster to a fully working one (https://github.com/zalando/postgres-operator/blob/master/docs/user.md#promote-the-standby). A very simple concept, and it works.
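Per the promotion docs linked above, a standby becomes a regular cluster once the standby section is removed from its manifest. A minimal sketch of that step (cluster and namespace names are placeholders, not from this thread):

```shell
# Remove the standby section from the postgresql manifest; the operator
# then promotes the replica to a primary. Names are examples only.
kubectl patch postgresql acid-standby-cluster -n my-namespace \
  --type=json -p '[{"op": "remove", "path": "/spec/standby"}]'
```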
@grzegorzgniadek Thanks for the suggestion. We successfully moved some of our customers to the Europe region. But the postgres pods do not have the cronjobs at all. Would you be able to shed some light on this? Am I making a silly mistake somewhere in the configuration? I can see it is failing to install the crontab; here is the message it throws:
2024-02-28 16:37:43,996 - bootstrapping - INFO - Configuring crontab "-":6: bad hour
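The `"-":6: bad hour` message comes from `crontab -` rejecting line 6 of the file Spilo generates, because that line's hour field (the second field) is not valid cron syntax. A hypothetical standalone checker, not part of Spilo, that mimics this diagnostic can help locate the offending line:

```shell
# Hypothetical helper (not shipped with Spilo): scan a crontab file and
# report the first entry whose hour field (field 2) would make cron fail,
# mimicking the '"-":N: bad hour' diagnostic seen in the Spilo log.
check_crontab() {
  awk '
    # skip comments, blank lines, and environment assignments like PATH=...
    /^#/ || /^[[:space:]]*$/ || /^[A-Za-z_]+=/ { next }
    {
      h = $2
      # allow *, */step, numbers, ranges, steps, and comma lists of those
      if (h !~ /^(\*(\/[0-9]+)?|[0-9]+(-[0-9]+)?(\/[0-9]+)?(,[0-9]+(-[0-9]+)?(\/[0-9]+)?)*)$/) {
        printf "\"-\":%d: bad hour\n", NR
        exit 1
      }
    }' "$1"
}
```

Running it over the file Spilo feeds to `crontab -` points at the exact line number reported in the log.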
Also, for debugging, when I try to run the script manually it installs only one cron entry:
python3 configure_spilo.py crontab
2024-03-01 19:41:53,027 - bootstrapping - INFO - Figuring out my environment (Google? AWS? Openstack? Local?)
2024-03-01 19:41:53,032 - bootstrapping - INFO - No meta-data available for this provider
2024-03-01 19:41:53,032 - bootstrapping - INFO - Looks like you are running unsupported
2024-03-01 19:41:53,050 - bootstrapping - INFO - Configuring crontab
crontab -l
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/15/bin
*/5 * * * * bash /scripts/renice.sh
Can you paste your operator configuration CRD? That is only an INFO-level log entry, so nothing interesting there. What else do you configure in crontab? By default, and with
BACKUP_SCHEDULE: "* * * * *"
that parameter should add
crontab -l -u postgres
* * * * * envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
to your cluster pods.
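To verify without a shell session inside the container, the same listing can be taken via kubectl (pod and namespace names below are placeholders, not from this thread):

```shell
# List the postgres user's crontab inside the Spilo pod.
# Pod/namespace names are examples only.
kubectl exec -n my-namespace acid-mycluster-0 -- crontab -l -u postgres
```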
operatorconfigurations.acid.zalan.do.txt
Please find attached.
In our US region we have the following crontab:
crontab -l
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/lib/postgresql/15/bin
*/5 * * * * bash /scripts/renice.sh
*/5 * * * * PGDATA="/home/postgres/pgdata/pgroot/data" SSL_CERTIFICATE_FILE="/tls/tls.crt" SSL_PRIVATE_KEY_FILE="/tls/tls.key" /scripts/test_reload_ssl.sh /run/tmp
30 2 * * * envdir "/run/etc/wal-e.d/env" /scripts/postgres_backup.sh "/home/postgres/pgdata/pgroot/data"
0 1 * * * nice -n 5 envdir "/run/etc/log.d/env" /scripts/upload_pg_log_to_s3.py
I mean the operatorconfiguration.acid.zalan.do resource. But it seems fine, so create a standby cluster with its source set to the S3 bucket data; that new cluster should have the databases/tables from the source (US) cluster. Remember to set the STANDBY_ prefix so Spilo can find the backups and WAL files to stream. It is the same as in https://github.com/zalando/postgres-operator/blob/master/docs/administrator.md#restoring-physical-backups, but with another prefix.
Please see the attachment for our operatorconfiguration.acid.zalan.do resource. operatorconfigurations.acid.zalan.do.txt
That is still the wrong resource. Please run:
kubectl get operatorconfigurations.acid.zalan.do -A
You need to add
STANDBY_AWS_ACCESS_KEY_ID: AKI666666666666 # access key with access to the bucket where the source cluster's backups are stored
STANDBY_AWS_REGION: us-east-2 # region of the source cluster
STANDBY_AWS_S3_FORCE_PATH_STYLE: "true"
STANDBY_AWS_SECRET_ACCESS_KEY: SECRETACCESSKEYERE # secret key with access to the bucket where the source cluster's backups are stored
STANDBY_USE_WALG_RESTORE: "true"
STANDBY_WALG_DISABLE_S3_SSE: "true"
into a Secret/ConfigMap that injects variables into the postgres cluster pods. Then add
apiVersion: acid.zalan.do/v1
kind: postgresql
metadata:
  name: <new-postgres-cluster-name>
  namespace: <namespace>
spec:
  standby:
    s3_wal_path: "s3://<bucketname>/spilo/<source_db_cluster>/<UID>/wal/<PGVERSION>"
to your standby (different-region) postgresqls.acid.zalan.do resource. After that, remember to clone the secrets from the source database cluster and apply them on the standby; see https://github.com/zalando/postgres-operator/blob/master/docs/user.md#providing-credentials-of-source-cluster. Be aware that a standby cluster is read-only and only becomes fully operational after being promoted to a proper database cluster.
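The two manual steps above (injecting the STANDBY_* variables and cloning the source cluster's credential secrets) can be sketched with kubectl. All names, contexts, and the secret name below are assumptions for illustration, not values from this thread:

```shell
# 1) Put the STANDBY_* variables into a Secret, to be referenced via the
#    operator's pod_environment_secret option. Names are placeholders.
kubectl create secret generic standby-pod-env -n my-namespace \
  --from-literal=STANDBY_AWS_ACCESS_KEY_ID='AKI666666666666' \
  --from-literal=STANDBY_AWS_REGION='us-east-2' \
  --from-literal=STANDBY_AWS_S3_FORCE_PATH_STYLE='true' \
  --from-literal=STANDBY_AWS_SECRET_ACCESS_KEY='SECRETACCESSKEYERE' \
  --from-literal=STANDBY_USE_WALG_RESTORE='true' \
  --from-literal=STANDBY_WALG_DISABLE_S3_SSE='true'

# 2) Clone the credential Secrets from the source (US) cluster into the
#    standby cluster's namespace. Kube contexts are hypothetical; the
#    secret name follows the operator's
#    {username}.{clustername}.credentials... pattern.
kubectl --context us-cluster -n my-namespace get secret \
  postgres.acid-mycluster.credentials.postgresql.acid.zalan.do -o yaml \
  | kubectl --context eu-cluster -n my-namespace apply -f -
```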