There is currently a PR open to add TLS support for database pods. At the moment, the TLS volume is not passed to the logical backup pod. Would this be useful for you?
@FxKu my issue is not related to the TLS support for database pods but to the backup cronjobs. It's required for wal-e/wal-g in order to target the S3 endpoint.
I also had this problem today. It happens when using an S3 server that does not have a certificate signed by a trusted CA.
It would be nice if the operator could accept something like a base64-encoded PEM certificate. This would then be passed to the logical backup container as an environment variable, where dump.sh could base64 -d the content and save it to a mktemp file or similar.
Appending --ca-bundle=$CA_FILE to the aws CLI call (or setting the AWS_CA_BUNDLE environment variable) would keep verification in place instead of using --insecure (or both options could be supported).
AWS_CA_BUNDLE Specifies the path to a certificate bundle to use for HTTPS certificate validation.
If defined, this environment variable overrides the value for the profile setting ca_bundle. You can override this environment variable by using the --ca-bundle command line parameter.
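A rough sketch of what that could look like inside dump.sh; the LOGICAL_BACKUP_S3_CA_CERT variable name is made up here and the upload command is a placeholder, this only illustrates the idea:

    # Hypothetical sketch: LOGICAL_BACKUP_S3_CA_CERT is an assumed (not existing) env variable
    # holding the base64-encoded PEM CA certificate.
    if [ -n "$LOGICAL_BACKUP_S3_CA_CERT" ]; then
        CA_FILE=$(mktemp)
        echo "$LOGICAL_BACKUP_S3_CA_CERT" | base64 -d > "$CA_FILE"
        # either export it so every aws call picks it up ...
        export AWS_CA_BUNDLE="$CA_FILE"
        # ... or pass it explicitly on the upload (bucket/path are placeholders):
        # aws s3 cp - "s3://my-bucket/backup.sql.gz" --ca-bundle "$CA_FILE"
    fi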
You can work around this by using Let's Encrypt and putting a valid certificate on your minio deployment. You don't even need to make your endpoint public now that most DNS providers support API-based CNAME validation! That seems like a 'better' solution than supporting --insecure in the operator's own implementation.
Supporting custom TLS or root CAs on the operator side is also a good feature to add though, but then you still have to go through the process of signing, deploying and managing a cert for your minio installation.
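For the Let's Encrypt route, a hedged sketch using cert-manager with a DNS-01 solver (cert-manager isn't mentioned above, and the issuer name, e-mail, DNS provider and hostname are all placeholders):

    apiVersion: cert-manager.io/v1
    kind: ClusterIssuer
    metadata:
      name: letsencrypt-dns
    spec:
      acme:
        server: https://acme-v02.api.letsencrypt.org/directory
        email: admin@example.com              # placeholder
        privateKeySecretRef:
          name: letsencrypt-dns-account-key
        solvers:
          - dns01:
              cloudflare:                     # any supported DNS provider works
                apiTokenSecretRef:
                  name: cloudflare-api-token
                  key: api-token
    ---
    apiVersion: cert-manager.io/v1
    kind: Certificate
    metadata:
      name: minio-tls
    spec:
      secretName: minio-tls                   # mount this secret into minio
      dnsNames:
        - minio.internal.example.com          # placeholder hostname
      issuerRef:
        name: letsencrypt-dns
        kind: ClusterIssuer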
Is it possible to disable SSL altogether for minio logical backups? I can't find the option in the documentation.
Is there any update? We are also facing the issue that backing up to a remote minio instance fails because of self-signed certificates. We do not control the minio, so it is not that easy to change the certificates. I have some workarounds in mind (like adding a cluster-internal reverse proxy), but they would be somewhat hacky.
@FxKu we are really looking for an --insecure or --tls-no-verify kind of flag for logical backup pods, since there is no way to add a CA cert to the pod. Please consider this a needed feature.
You can use additionalVolumes to add the CA certificate to the database pod and set WALG_S3_CA_CERT_FILE pointing to the mounted file.
additionalVolumes:
  - name: minio-ca-certificate
    mountPath: /certs/minio
    targetContainers: []
    volumeSource:
      secret:
        secretName: minio-ca-certificate
env:
  - name: WALG_S3_CA_CERT_FILE
    value: "/certs/minio/ca.crt"
+1
I followed the same steps but still failed to upload backups to minio S3. Can I get some help here?
k get pods -n zalando
NAME                                        READY   STATUS    RESTARTS   AGE
abc-time-0                                  1/1     Running   0          43h
abc-time-1                                  1/1     Running   0          43h
abc-time-2                                  1/1     Running   0          43h
logical-backup-abc-time-27964140--1-8x5qm   0/1     Error     0          32m
logical-backup-abc-time-27964140--1-hf4cf   0/1     Error     0          23m
logical-backup-abc-time-27964140--1-ncfkg   0/1     Error     0          33m
logical-backup-abc-time-27964140--1-nwfhw   0/1     Error     0          32m
logical-backup-abc-time-27964140--1-t5rd7   0/1     Error     0          31m
logical-backup-abc-time-27964140--1-vrc5j   0/1     Error     0          29m
logical-backup-abc-time-27964140--1-xszth   0/1     Error     0          17m
postgres-operator-7486f85b89-qndjv          1/1     Running   0          6d17h

k logs logical-backup-abc-time-27964140--1-xszth -n zalando
IPv4 API Endpoint: https://x.x.x.x:443/api/v1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14170    0 14170    0     0  1383k      0 --:--:-- --:--:-- --:--:-- 1383k
100   167  100   167    0     0  15181      0 --:--:-- --:--:-- --:--:-- 15181
100   167  100   167    0     0  20875      0 --:--:-- --:--:-- --:--:-- 20875
100 19762    0 19762    0     0  2412k      0 --:--:-- --:--:-- --:--:-- 2412k
But the operator keeps on recreating the pods due to a difference in the pod spec (the additional volume in the postgres spec).
Running into the exact same issue. Would be nice if a custom CA could be mounted.
@FxKu : Can you guys fix this as a priority?
If the CA used for the Postgres TLS certificate is the same as your S3 backend's CA, then you can reuse the ca.crt mounted into the postgres container, for example as an environment variable:
env:
  - name: WALG_S3_CA_CERT_FILE
    value: "/tls/ca.crt"
or add it to the ConfigMap for the Postgres Operator if using pod_environment_configmap in the Helm chart:
WALG_S3_CA_CERT_FILE: "/tls/ca.crt"
I have been stuck on this for 2 days after following all the suggestions here. Can you please help?
I brought up minio with: helm install minio oci://registry-1.docker.io/bitnamicharts/minio --set tls.enabled=true --set tls.autoGenerated=true
I created a postgresql cluster with the following:
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
name: test41
spec:
env:
- name: AWS_ACCESS_KEY_ID
value: admin
- name: AWS_ENDPOINT
value: https://10.9.20.45:31734
- name: AWS_REGION
value: us-east-1
- name: AWS_S3_FORCE_PATH_STYLE
value: "true"
- name: AWS_SECRET_ACCESS_KEY
value: jK501Iv3tt
- name: BACKUP_NUM_TO_RETAIN
value: "5"
- name: BACKUP_SCHEDULE
value: 00 10 * * *
- name: CLONE_USE_WALG_RESTORE
value: "true"
- name: USE_WALG_BACKUP
value: "true"
- name: USE_WALG_RESTORE
value: "true"
- name: WAL_BUCKET_SCOPE_PREFIX
value: ""
- name: WAL_BUCKET_SCOPE_SUFFIX
value: ""
- name: WAL_S3_BUCKET
value: test40
- name: WALG_DISABLE_S3_SSE
value: "true"
- name: WALG_S3_CA_CERT_FILE
value: "/certs/minio/ca.crt"
- name: WALE_LOG_DESTINATION
value: "/tmp/walg.log"
teamId: "test41"
additionalVolumes:
- name: minio-tls
mountPath: /certs/minio
targetContainers: []
volumeSource:
secret:
secretName: minio-crt
volume:
size: 1Gi
numberOfInstances: 1
users:
zalando: # database owner
- superuser
- createdb
foo_user: [] # role for application foo
databases:
foo: zalando # dbname: owner
preparedDatabases:
bar: {}
postgresql:
version: "15"
parameters:
huge_pages: "off"
wal-g is getting stuck
root@test41-0:/run/etc/wal-e.d/env# echo $WALG_S3_CA_CERT_FILE
/certs/minio/ca.crt
root@test41-0:/run/etc/wal-e.d/env# envdir "/run/etc/wal-e.d/env" wal-g backup-list
^C
root@test41-0:/run/etc/wal-e.d/env# ps -aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2636 500 ? Ss 16:43 0:00 /usr/bin/dumb-init -c --rewrite 1:0 -- /bin/sh /launch.sh
root 6 0.0 0.0 2884 784 ? S 16:43 0:00 /bin/sh /launch.sh
root 24 0.0 0.0 2804 508 ? S 16:43 0:00 /usr/bin/runsvdir -P /etc/service
root 30 0.0 0.0 2652 516 ? Ss 16:43 0:00 runsv cron
root 31 0.0 0.0 2652 512 ? Ss 16:43 0:00 runsv patroni
root 32 0.0 0.0 2652 512 ? Ss 16:43 0:00 runsv pgqd
postgres 33 0.0 0.0 490716 29504 ? Sl 16:43 0:04 /usr/bin/python3 /usr/local/bin/patroni /home/postgres/postgres.yml
root 34 0.0 0.0 3868 1148 ? S 16:43 0:00 /usr/sbin/cron -f
postgres 35 0.0 0.0 17040 4220 ? S 16:43 0:00 /usr/bin/pgqd /home/postgres/pgq_ticker.ini
postgres 117 0.0 0.0 194176 24616 ? S 16:43 0:00 /usr/lib/postgresql/15/bin/postgres -D /home/postgres/pgdata/pgroot/data --config-file=/home/postgres/pgdata/pgroot/data/po
postgres 119 0.0 0.0 75168 3672 ? Ss 16:43 0:00 postgres: test41: logger
postgres 120 0.0 0.0 194308 12684 ? Ss 16:43 0:00 postgres: test41: checkpointer
postgres 121 0.0 0.0 194316 4540 ? Ss 16:43 0:00 postgres: test41: background writer
postgres 123 0.2 0.0 296608 16568 ? Ssl 16:43 0:21 postgres: test41: bg_mon
postgres 126 0.0 0.0 194176 6912 ? Ss 16:43 0:00 postgres: test41: walwriter
postgres 127 0.0 0.0 195788 4772 ? Ss 16:43 0:00 postgres: test41: autovacuum launcher
postgres 128 0.0 0.0 194284 4016 ? Ss 16:43 0:00 postgres: test41: archiver archiving 000000010000000000000001
postgres 129 0.0 0.0 196756 7396 ? Ss 16:43 0:00 postgres: test41: pg_cron launcher
postgres 130 0.0 0.0 195764 4520 ? Ss 16:43 0:00 postgres: test41: logical replication launcher
postgres 135 0.0 0.0 197500 12096 ? Ss 16:43 0:00 postgres: test41: postgres postgres [local] idle
root 187 0.0 0.0 8156 2856 pts/0 Ss 16:43 0:00 bash
postgres 1269 0.0 0.0 2872 504 ? S 18:31 0:00 sh -c envdir "/run/etc/wal-e.d/env" wal-g wal-push "pg_wal/000000010000000000000001"
postgres 1270 0.3 0.0 1637912 34912 ? Sl 18:31 0:02 wal-g wal-push pg_wal/000000010000000000000001
root 1442 0.0 0.0 10456 1716 pts/0 R+ 18:44 0:00 ps -aux
I don't see the same issue with minio TLS disabled.
I figured this out: the problem is which hosts/IPs the ca.crt is valid for. The Bitnami minio Helm chart's auto-generated certs only allow connections via the k8s service address, and I was using the host IP with a NodePort, which won't work. I changed AWS_ENDPOINT from https://10.9.20.45:31734 to https://minio.default.svc.cluster.local:9000 and wal-g is not stuck anymore; things are flowing.
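A quick way to check which names/IPs a served certificate actually covers (host and port here are just the example values from above, and this assumes openssl is available in the pod):

    openssl s_client -connect minio.default.svc.cluster.local:9000 -showcerts </dev/null 2>/dev/null \
      | openssl x509 -noout -text \
      | grep -A1 "Subject Alternative Name"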
I think the issue is not resolved. The problem here is that the S3 SSL cert is self-signed, so the backup job pods don't have the S3 CA cert and the backup job fails.
Adding a CA volume gets removed by postgres-operator due to a difference in the spec, so that solution doesn't work either.
But @FxKu has closed the issue now. Any reason why?
I am currently testing the operator and running Minio for S3 storage. I managed to do the logical backup with TLS disabled on Minio and it works fine.
However, I have this issue when I enable TLS on Minio.
I've built my own docker image for backups, adding
args+=("--no-verify-ssl")
in the dump.sh. As a result, I was able to back up my db with TLS enabled on Minio. The idea here is to add a parameter on the operatorconfiguration in order to either be able to specify a certificate, or disable the ssl verification.
Zalando postgres-operator version : v1.3.1
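For reference, a minimal sketch of such a custom image, assuming dump.sh from the upstream logical-backup image is copied locally and patched with the extra argument; the base image tag and the /dump.sh path are assumptions, and the operator would then need to point logical_backup_docker_image at the pushed image:

    # Hypothetical sketch: pick the base tag matching your operator version.
    FROM registry.opensource.zalan.do/acid/logical-backup:<tag-matching-your-operator>
    # dump.sh is a locally patched copy that adds: args+=("--no-verify-ssl")
    COPY dump.sh /dump.sh
    RUN chmod +x /dump.sh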