Open ranchodeluxe opened 1 week ago
Well, that was anticlimactic.
The install executable's default error when it hits permission issues gives us the dreaded "No such file or directory", which is VERY misleading.
The postgres-startup pod's fsGroup pod security setting doesn't help us here b/c of https://github.com/kubernetes/examples/issues/260, since the NFS mount is mounted as root.
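One way to see past the misleading error is to list the mode and ownership of every component of the target path, much as the permissions() helper in the startup script does. The following is only a sketch: it assumes GNU stat, and the function name show_path_permissions is hypothetical.

```shell
#!/usr/bin/env bash
# Print mode, uid, gid and name for a path and each of its parent directories.
# Components that do not exist are reported instead of aborting the walk.
show_path_permissions() {
  local path="$1"
  while [[ -n "${path}" ]]; do
    if [[ -e "${path}" ]]; then
      stat -Lc '%A %u %g %n' "${path}"
    else
      echo "missing ${path}"
    fi
    # Strip the last component; stop once there is no slash left to strip.
    [[ "${path}" == */* ]] || break
    path="${path%/*}"
  done
}

# Example: inspect the data directory the init container tries to create.
show_path_permissions /pgdata/pg16
```

Run inside a container that mounts the volume, this shows whether /pgdata is owned by root and which mode bits it carries, which is the information the "No such file or directory" message hides.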
I see I'm reliving past debugging with this issue. Is there no workaround offered by PGO for this? Meaning, is there nowhere in our values to plumb through something that can set the gid for that NFS mount?
Hi @ranchodeluxe, have you tried setting PostgresCluster.spec.supplementalGroups? You can find this setting in the following section of the API reference.
As described in the docs, this setting is often used to access shared file systems, such as NFS:
A list of group IDs applied to the process of a container. These can be useful when accessing shared file systems with constrained permissions. More info: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context
Thanks @andrewlecuyer. Yeah, I saw that, but if the NFS is mounted as root it doesn't really help b/c we can't use 0:
spec.supplementalGroups[0] in body should be greater than or equal to 1
I'm not seeing options, but they probably exist, for me to tell PGO where the PGDATA dir should be? It seems spec.dataSource is only used to create jobs for "moving" data. If I can set PGDATA, then I can spin up a pod that mounts the NFS and bootstraps a subdir with the proper perms for USER 26, like I do on other projects that do not use PGO.
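The bootstrap step described here could look roughly like the following when run as root in a helper pod that mounts the same NFS volume. This is a sketch under stated assumptions: the function name bootstrap_pgdata is hypothetical, the directory names pg16 and pg16_wal and the 26:26 owner are the values mentioned in this thread, and the chown is skipped when the script is not running as root.

```shell
#!/usr/bin/env bash
set -eu

# Create the data and WAL directories under a mount point and hand them to
# the postgres user. The default owner 26:26 is the value used in this thread.
bootstrap_pgdata() {
  local mount_point="$1" owner="${2:-26:26}"
  local dir
  for dir in pg16 pg16_wal; do
    mkdir -p "${mount_point}/${dir}"
    chmod 0700 "${mount_point}/${dir}"
    # Changing ownership needs root (and no_root_squash on the NFS export).
    if [[ "$(id -u)" -eq 0 ]]; then
      chown "${owner}" "${mount_point}/${dir}"
    fi
  done
}

# Example (not executed here): bootstrap_pgdata /pgdata
```

Whether PGO then picks up the pre-created directories depends on getting the instance to use that same volume, which is the open question in this thread.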
I'm not seeing options but they probably exist for me to tell PGO where the PGDATA dir should be?
Given that I know the containers by default mount the NFS PVC to /pgdata and look for the dirs pg16 and pg16_wal, I assumed I could create an existing PVC and pod where I've bootstrapped those dirs to have the right perms, and then force my instances to use that PVC via dataVolumeClaimSpec.dataSource, but even that seems not to work, as it tries to create a new claim.
As @andrewlecuyer suggested, on AWS EFS the only solution right now (which isn't a solution for me, for reasons I will mention later) is to use EFS Access Points and give them an explicit uid: 26 and gid: 26, and then everything works.
Unfortunately, I'm trying to provision PGO into an RKE cluster that has been set up for me and where I don't have control over which uid:gid the NFS mounts have.
If anyone has a workaround for the above, let me know. dataVolumeClaimSpec.dataSource should be the easiest way to make this happen, but I feel like there's a bug there that I will grok later on when I have time.
@ranchodeluxe, in an effort to reproduce/better understand, can you provide a copy (e.g. via kubectl get sc -o yaml) of the exact storage class you are using in your RKE environment? I am especially curious about any settings/parameters for uid and/or gid (e.g. I'm assuming parameters such as uid and gid simply aren't set with the storage class you're testing with in EKS?).
Also, what version of the EFS CSI storage driver are you using?
Sorry for the late reply @andrewlecuyer
the exact storage class you are using in your RKE environment
Unfortunately I can't b/c I don't have access to the cluster. I only deploy things via ArgoCD into RKE. I do know the NFS export has *(rw,no_root_squash,no_subtree_check), so it "should" work, but there are other problems preventing me from trying it out yet.
I'm assuming parameter such as uid and gid simply aren't set with the storage class you're testing with in EKS
Yes, it is linked above but here it is again. I don't like creating Access Points in EFS b/c they cause all sorts of issues. So the goal in EKS would be to not set any uid:gid and use the static option talked about here (as this ticket was trying to do).
Overview
This postgres-startup container install script line is reporting that no such file or directory exists when trying to create the /pg16 subdir.
However, the resulting pod spec after install mounts the NFS to /pgdata correctly, and AFAICT the install script should handle the rest:
creationTimestamp: "2024-07-08T01:20:33Z" generateName: example-example-h42f- labels: apps.kubernetes.io/pod-index: "0" controller-revision-hash: example-example-h42f-59955fdd4c postgres-operator.crunchydata.com/cluster: example postgres-operator.crunchydata.com/data: postgres postgres-operator.crunchydata.com/instance: example-example-h42f postgres-operator.crunchydata.com/instance-set: example postgres-operator.crunchydata.com/patroni: example-ha statefulset.kubernetes.io/pod-name: example-example-h42f-0 name: example-example-h42f-0 namespace: default ownerReferences: - apiVersion: apps/v1 blockOwnerDeletion: true controller: true kind: StatefulSet name: example-example-h42f uid: b8e14fdb-562e-4f1c-a9de-7db9ea4c965c resourceVersion: "2425768" uid: 22d5cce2-fa01-4fa1-8dfe-cbac22f041b2 spec: containers: - command: - patroni - /etc/patroni env: - name: PGDATA value: /pgdata/pg16 - name: PGHOST value: /tmp/postgres - name: PGPORT value: "5432" - name: KRB5_CONFIG value: /etc/postgres/krb5.conf - name: KRB5RCACHEDIR value: /tmp - name: PATRONI_NAME valueFrom: fieldRef: apiVersion: v1 fieldPath: metadata.name - name: PATRONI_KUBERNETES_POD_IP valueFrom: fieldRef: apiVersion: v1 fieldPath: status.podIP - name: PATRONI_KUBERNETES_PORTS value: | - name: postgres port: 5432 protocol: TCP - name: PATRONI_POSTGRESQL_CONNECT_ADDRESS value: $(PATRONI_NAME).example-pods:5432 - name: PATRONI_POSTGRESQL_LISTEN value: '*:5432' - name: PATRONI_POSTGRESQL_CONFIG_DIR value: /pgdata/pg16 - name: PATRONI_POSTGRESQL_DATA_DIR value: /pgdata/pg16 - name: PATRONI_RESTAPI_CONNECT_ADDRESS value: $(PATRONI_NAME).example-pods:8008 - name: PATRONI_RESTAPI_LISTEN value: '*:8008' - name: PATRONICTL_CONFIG_FILE value: /etc/patroni - name: LD_PRELOAD value: /usr/lib64/libnss_wrapper.so - name: NSS_WRAPPER_PASSWD value: /tmp/nss_wrapper/postgres/passwd - name: NSS_WRAPPER_GROUP value: /tmp/nss_wrapper/postgres/group image: 
registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-16.3-3.4-0 imagePullPolicy: IfNotPresent livenessProbe: failureThreshold: 3 httpGet: path: /liveness port: 8008 scheme: HTTPS initialDelaySeconds: 3 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 5 name: database ports: - containerPort: 5432 name: postgres protocol: TCP readinessProbe: failureThreshold: 3 httpGet: path: /readiness port: 8008 scheme: HTTPS initialDelaySeconds: 3 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 5 resources: {} securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL privileged: false readOnlyRootFilesystem: true runAsNonRoot: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /pgconf/tls name: cert-volume readOnly: true - mountPath: /pgdata name: postgres-data - mountPath: /etc/database-containerinfo name: database-containerinfo readOnly: true - mountPath: /etc/pgbackrest/conf.d name: pgbackrest-config readOnly: true - mountPath: /etc/patroni name: patroni-config readOnly: true - mountPath: /tmp name: tmp - mountPath: /dev/shm name: dshm - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-xvwgp readOnly: true - command: - bash - -ceu - -- - |- monitor() { declare -r directory="/pgconf/tls" exec {fd}<> <(:) while read -r -t 5 -u "${fd}" || true; do if [ "${directory}" -nt "/proc/self/fd/${fd}" ] && install -D --mode=0600 -t "/tmp/replication" "${directory}"/{replication/tls.crt,replication/tls.key,replication/ca.crt} && pkill -HUP --exact --parent=1 postgres then exec {fd}>&- && exec {fd}<> <(:) stat --format='Loaded certificates dated %y' "${directory}" fi done }; export -f monitor; exec -a "$0" bash -ceu monitor - replication-cert-copy image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-16.3-3.4-0 imagePullPolicy: IfNotPresent name: replication-cert-copy resources: {} securityContext: allowPrivilegeEscalation: false 
capabilities: drop: - ALL privileged: false readOnlyRootFilesystem: true runAsNonRoot: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /pgconf/tls name: cert-volume readOnly: true - mountPath: /tmp name: tmp - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-xvwgp readOnly: true - command: - pgbackrest - server env: - name: LD_PRELOAD value: /usr/lib64/libnss_wrapper.so - name: NSS_WRAPPER_PASSWD value: /tmp/nss_wrapper/postgres/passwd - name: NSS_WRAPPER_GROUP value: /tmp/nss_wrapper/postgres/group image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.51-0 imagePullPolicy: IfNotPresent livenessProbe: exec: command: - pgbackrest - server-ping failureThreshold: 3 periodSeconds: 10 successThreshold: 1 timeoutSeconds: 1 name: pgbackrest resources: {} securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL privileged: false readOnlyRootFilesystem: true runAsNonRoot: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/pgbackrest/server name: pgbackrest-server readOnly: true - mountPath: /pgdata name: postgres-data - mountPath: /etc/pgbackrest/conf.d name: pgbackrest-config readOnly: true - mountPath: /tmp name: tmp - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-xvwgp readOnly: true - command: - bash - -ceu - -- - |- monitor() { exec {fd}<> <(:) until read -r -t 5 -u "${fd}"; do if [ "${filename}" -nt "/proc/self/fd/${fd}" ] && pkill -HUP --exact --parent=0 pgbackrest then exec {fd}>&- && exec {fd}<> <(:) stat --dereference --format='Loaded configuration dated %y' "${filename}" elif { [ "${directory}" -nt "/proc/self/fd/${fd}" ] || [ "${authority}" -nt "/proc/self/fd/${fd}" ] } && pkill -HUP --exact --parent=0 pgbackrest then exec {fd}>&- && exec {fd}<> <(:) stat --format='Loaded certificates dated %y' "${directory}" fi done }; export 
directory="$1" authority="$2" filename="$3"; export -f monitor; exec -a "$0" bash -ceu monitor - pgbackrest-config - /etc/pgbackrest/server - /etc/pgbackrest/conf.d/~postgres-operator/tls-ca.crt - /etc/pgbackrest/conf.d/~postgres-operator_server.conf image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.51-0 imagePullPolicy: IfNotPresent name: pgbackrest-config resources: {} securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL privileged: false readOnlyRootFilesystem: true runAsNonRoot: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /etc/pgbackrest/server name: pgbackrest-server readOnly: true - mountPath: /etc/pgbackrest/conf.d name: pgbackrest-config readOnly: true - mountPath: /tmp name: tmp - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-xvwgp readOnly: true dnsPolicy: ClusterFirst enableServiceLinks: false hostname: example-example-h42f-0 initContainers: - command: - bash - -ceu - -- - |- declare -r expected_major_version="$1" pgwal_directory="$2" pgbrLog_directory="$3" permissions() { while [[ -n "$1" ]]; do set "${1%/*}" "$@"; done; shift; stat -Lc '%A %4u %4g %n' "$@"; } halt() { local rc=$?; >&2 echo "$@"; exit "${rc/#0/1}"; } results() { printf '::postgres-operator: %s::%s\n' "$@"; } recreate() ( local tmp; tmp=$(mktemp -d -p "${1%/*}"); GLOBIGNORE='.:..'; set -x chmod "$2" "${tmp}"; mv "$1"/* "${tmp}"; rmdir "$1"; mv "${tmp}" "$1" ) safelink() ( local desired="$1" name="$2" current current=$(realpath "${name}") if [ "${current}" = "${desired}" ]; then return; fi set -x; mv --no-target-directory "${current}" "${desired}" ln --no-dereference --force --symbolic "${desired}" "${name}" ) echo Initializing ... 
results 'uid' "$(id -u)" 'gid' "$(id -G)" results 'postgres path' "$(command -v postgres)" results 'postgres version' "${postgres_version:=$(postgres --version)}" [[ "${postgres_version}" =~ ") ${expected_major_version}"($|[^0-9]) ]] || halt Expected PostgreSQL version "${expected_major_version}" results 'config directory' "${PGDATA:?}" postgres_data_directory=$([ -d "${PGDATA}" ] && postgres -C data_directory || echo "${PGDATA}") results 'data directory' "${postgres_data_directory}" [[ "${postgres_data_directory}" == "${PGDATA}" ]] || halt Expected matching config and data directories bootstrap_dir="${postgres_data_directory}_bootstrap" [ -d "${bootstrap_dir}" ] && results 'bootstrap directory' "${bootstrap_dir}" [ -d "${bootstrap_dir}" ] && postgres_data_directory="${bootstrap_dir}" if [[ ! -e "${postgres_data_directory}" || -O "${postgres_data_directory}" ]]; then install --directory --mode=0700 "${postgres_data_directory}" elif [[ -w "${postgres_data_directory}" && -g "${postgres_data_directory}" ]]; then recreate "${postgres_data_directory}" '0700' else (halt Permissions!); fi || halt "$(permissions "${postgres_data_directory}" ||:)" results 'pgBackRest log directory' "${pgbrLog_directory}" install --directory --mode=0775 "${pgbrLog_directory}" || halt "$(permissions "${pgbrLog_directory}" ||:)" install -D --mode=0600 -t "/tmp/replication" "/pgconf/tls/replication"/{tls.crt,tls.key,ca.crt} [ -f "${postgres_data_directory}/PG_VERSION" ] || exit 0 results 'data version' "${postgres_data_version:=$(< "${postgres_data_directory}/PG_VERSION")}" [[ "${postgres_data_version}" == "${expected_major_version}" ]] || halt Expected PostgreSQL data version "${expected_major_version}" [[ ! 
-f "${postgres_data_directory}/postgresql.conf" ]] && touch "${postgres_data_directory}/postgresql.conf" safelink "${pgwal_directory}" "${postgres_data_directory}/pg_wal" results 'wal directory' "$(realpath "${postgres_data_directory}/pg_wal")" rm -f "${postgres_data_directory}/recovery.signal" - startup - "16" - /pgdata/pg16_wal - /pgdata/pgbackrest/log env: - name: PGDATA value: /pgdata/pg16 - name: PGHOST value: /tmp/postgres - name: PGPORT value: "5432" - name: KRB5_CONFIG value: /etc/postgres/krb5.conf - name: KRB5RCACHEDIR value: /tmp image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-16.3-3.4-0 imagePullPolicy: IfNotPresent name: postgres-startup resources: {} securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL privileged: false readOnlyRootFilesystem: true runAsNonRoot: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /pgconf/tls name: cert-volume readOnly: true - mountPath: /pgdata name: postgres-data - mountPath: /tmp name: tmp - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-xvwgp readOnly: true - command: - bash - -c - "export NSS_WRAPPER_SUBDIR=postgres CRUNCHY_NSS_USERNAME=postgres CRUNCHY_NSS_USER_DESC=\"postgres\" \n# Define nss_wrapper directory and passwd & group files that will be utilized by nss_wrapper. 
The\n# nss_wrapper_env.sh script (which also sets these vars) isn't sourced here since the nss_wrapper\n# has not yet been setup, and we therefore don't yet want the nss_wrapper vars in the environment.\nmkdir -p /tmp/nss_wrapper\nchmod g+rwx /tmp/nss_wrapper\n\nNSS_WRAPPER_DIR=\"/tmp/nss_wrapper/${NSS_WRAPPER_SUBDIR}\"\nNSS_WRAPPER_PASSWD=\"${NSS_WRAPPER_DIR}/passwd\"\nNSS_WRAPPER_GROUP=\"${NSS_WRAPPER_DIR}/group\"\n\n# create the nss_wrapper directory\nmkdir -p \"${NSS_WRAPPER_DIR}\"\n\n# grab the current user ID and group ID\nUSER_ID=$(id -u)\nexport USER_ID\nGROUP_ID=$(id -g)\nexport GROUP_ID\n\n# get copies of the passwd and group files\n[[ -f \"${NSS_WRAPPER_PASSWD}\" ]] || cp \"/etc/passwd\" \"${NSS_WRAPPER_PASSWD}\"\n[[ -f \"${NSS_WRAPPER_GROUP}\" ]] || cp \"/etc/group\" \"${NSS_WRAPPER_GROUP}\"\n\n# if the username is missing from the passwd file, then add it\nif [[ ! $(cat \"${NSS_WRAPPER_PASSWD}\") =~ ${CRUNCHY_NSS_USERNAME}:x:${USER_ID} ]]; then\n echo \"nss_wrapper: adding user\"\n passwd_tmp=\"${NSS_WRAPPER_DIR}/passwd_tmp\"\n cp \"${NSS_WRAPPER_PASSWD}\" \"${passwd_tmp}\"\n sed -i \"/${CRUNCHY_NSS_USERNAME}:x:/d\" \"${passwd_tmp}\"\n \ # needed for OCP 4.x because crio updates /etc/passwd with an entry for USER_ID\n sed -i \"/${USER_ID}:x:/d\" \"${passwd_tmp}\"\n printf '${CRUNCHY_NSS_USERNAME}:x:${USER_ID}:${GROUP_ID}:${CRUNCHY_NSS_USER_DESC}:${HOME}:/bin/bash\\n' >> \"${passwd_tmp}\"\n envsubst < \"${passwd_tmp}\" > \"${NSS_WRAPPER_PASSWD}\"\n \ rm \"${passwd_tmp}\"\nelse\n echo \"nss_wrapper: user exists\"\nfi\n\n# if the username (which will be the same as the group name) is missing from group file, then add it\nif [[ ! 
$(cat \"${NSS_WRAPPER_GROUP}\") =~ ${CRUNCHY_NSS_USERNAME}:x:${USER_ID} ]]; then\n echo \"nss_wrapper: adding group\"\n group_tmp=\"${NSS_WRAPPER_DIR}/group_tmp\"\n \ cp \"${NSS_WRAPPER_GROUP}\" \"${group_tmp}\"\n sed -i \"/${CRUNCHY_NSS_USERNAME}:x:/d\" \"${group_tmp}\"\n printf '${CRUNCHY_NSS_USERNAME}:x:${USER_ID}:${CRUNCHY_NSS_USERNAME}\\n' >> \"${group_tmp}\"\n envsubst < \"${group_tmp}\" > \"${NSS_WRAPPER_GROUP}\"\n \ rm \"${group_tmp}\"\nelse\n echo \"nss_wrapper: group exists\"\nfi\n\n# export the nss_wrapper env vars\n# define nss_wrapper directory and passwd & group files that will be utilized by nss_wrapper\nNSS_WRAPPER_DIR=\"/tmp/nss_wrapper/${NSS_WRAPPER_SUBDIR}\"\nNSS_WRAPPER_PASSWD=\"${NSS_WRAPPER_DIR}/passwd\"\nNSS_WRAPPER_GROUP=\"${NSS_WRAPPER_DIR}/group\"\n\nexport LD_PRELOAD=/usr/lib64/libnss_wrapper.so\nexport NSS_WRAPPER_PASSWD=\"${NSS_WRAPPER_PASSWD}\"\nexport NSS_WRAPPER_GROUP=\"${NSS_WRAPPER_GROUP}\"\n\necho \"nss_wrapper: environment configured\"\n" image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres-gis:ubi8-16.3-3.4-0 imagePullPolicy: IfNotPresent name: nss-wrapper-init resources: {} securityContext: allowPrivilegeEscalation: false capabilities: drop: - ALL privileged: false readOnlyRootFilesystem: true runAsNonRoot: true terminationMessagePath: /dev/termination-log terminationMessagePolicy: File volumeMounts: - mountPath: /tmp name: tmp - mountPath: /var/run/secrets/kubernetes.io/serviceaccount name: kube-api-access-xvwgp readOnly: true nodeName: ip-172-31-59-120.us-west-2.compute.internal preemptionPolicy: PreemptLowerPriority priority: 0 restartPolicy: Always schedulerName: default-scheduler securityContext: fsGroup: 26 fsGroupChangePolicy: OnRootMismatch serviceAccount: example-instance serviceAccountName: example-instance shareProcessNamespace: true subdomain: example-pods terminationGracePeriodSeconds: 30 tolerations: - effect: NoExecute key: node.kubernetes.io/not-ready operator: Exists tolerationSeconds: 300 - 
effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 topologySpreadConstraints: - labelSelector: matchExpressions: - key: postgres-operator.crunchydata.com/data operator: In values: - postgres - pgbackrest matchLabels: postgres-operator.crunchydata.com/cluster: example maxSkew: 1 topologyKey: kubernetes.io/hostname whenUnsatisfiable: ScheduleAnyway - labelSelector: matchExpressions: - key: postgres-operator.crunchydata.com/data operator: In values: - postgres - pgbackrest matchLabels: postgres-operator.crunchydata.com/cluster: example maxSkew: 1 topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: ScheduleAnyway volumes: - name: cert-volume projected: defaultMode: 384 sources: - secret: items: - key: tls.crt path: tls.crt - key: tls.key path: tls.key - key: ca.crt path: ca.crt name: example-cluster-cert - secret: items: - key: tls.crt path: replication/tls.crt - key: tls.key path: replication/tls.key - key: ca.crt path: replication/ca.crt name: example-replication-cert - name: postgres-data persistentVolumeClaim: claimName: example-example-h42f-pgdata - downwardAPI: defaultMode: 420 items: - path: cpu_limit "/tmp/example.yaml" 727L, 25055B
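The part of the postgres-startup script that decides how to prepare the data directory is hard to read in the flattened pod spec above. The sketch below restates that decision as a standalone function. The conditions are copied from the script; the function name classify_data_dir and the words it prints are hypothetical.

```shell
#!/usr/bin/env bash
# Report which branch the startup script would take for a data directory:
#   install  - the directory is missing or owned by the current user
#   recreate - the directory is writable and has the setgid bit set
#   halt     - anything else (the script exits with "Permissions!")
classify_data_dir() {
  local dir="$1"
  if [[ ! -e "${dir}" || -O "${dir}" ]]; then
    echo "install"
  elif [[ -w "${dir}" && -g "${dir}" ]]; then
    echo "recreate"
  else
    echo "halt"
  fi
}

# Example: classify_data_dir /pgdata/pg16
```

When the directory does not exist yet, the first branch runs and install --directory has to create it inside /pgdata, which requires write access to that mount for the container's uid or one of its groups.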
Environment
EKS
v1.27.13
ubi8-16.3-3.4-0
16
NFS
Steps to Reproduce
EXPECTED
The postgres-startup container completes without error and installs everything into /pgdata/pg16 correctly.
ACTUAL
The postgres-startup container shows that it's not finding the bootstrap directory.
The pod definition shown under Overview has a volumeMount entry for /pgdata, so that directory should definitely be created.
effect: NoExecute key: node.kubernetes.io/unreachable operator: Exists tolerationSeconds: 300 topologySpreadConstraints: - labelSelector: matchExpressions: - key: postgres-operator.crunchydata.com/data operator: In values: - postgres - pgbackrest matchLabels: postgres-operator.crunchydata.com/cluster: example maxSkew: 1 topologyKey: kubernetes.io/hostname whenUnsatisfiable: ScheduleAnyway - labelSelector: matchExpressions: - key: postgres-operator.crunchydata.com/data operator: In values: - postgres - pgbackrest matchLabels: postgres-operator.crunchydata.com/cluster: example maxSkew: 1 topologyKey: topology.kubernetes.io/zone whenUnsatisfiable: ScheduleAnyway volumes: - name: cert-volume projected: defaultMode: 384 sources: - secret: items: - key: tls.crt path: tls.crt - key: tls.key path: tls.key - key: ca.crt path: ca.crt name: example-cluster-cert - secret: items: - key: tls.crt path: replication/tls.crt - key: tls.key path: replication/tls.key - key: ca.crt path: replication/ca.crt name: example-replication-cert - name: postgres-data persistentVolumeClaim: claimName: example-example-h42f-pgdata - downwardAPI: defaultMode: 420 items: - path: cpu_limit "/tmp/example.yaml" 727L, 25055B
Logs
See above in "Actual"
Additional Information
The EKS EFS CSI Driver by default creates Access Points that restrict reads/writes to a specific UID and GID. But I'm not creating access points, so containers can chown|chmod the EFS mounts to their heart's content
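For reference, this is roughly the bootstrap I'd run from a throwaway root pod that mounts the same PVC, since nothing on the EFS side stops a container from changing ownership. It's only a sketch: the `bootstrap_pgdata` function name is made up for illustration, the directory names and UID/GID 26 come from the pod spec above, and whether PGO will accept pre-created directories is an assumption I haven't confirmed.

```shell
#!/bin/sh
# Sketch only: pre-create the directories the postgres-startup container
# looks for and hand them to the postgres user (UID/GID 26 in the pod spec).
# bootstrap_pgdata is a hypothetical helper, not something PGO ships.
set -eu

bootstrap_pgdata() {
  root="$1"   # where the NFS/EFS volume is mounted, e.g. /pgdata
  uid="$2"    # owner UID for the data directories
  gid="$3"    # owner GID for the data directories
  for dir in "${root}/pg16" "${root}/pg16_wal"; do
    mkdir -p "${dir}"
    # Changing ownership to another user needs root; skip it otherwise.
    if [ "$(id -u)" -eq 0 ]; then
      chown "${uid}:${gid}" "${dir}"
    fi
    # PostgreSQL rejects data directories that are group/world accessible
    # beyond 0700 or 0750.
    chmod 0700 "${dir}"
    echo "bootstrapped ${dir}"
  done
}

# Example invocation from a root pod with the PVC mounted at /pgdata:
# bootstrap_pgdata /pgdata 26 26
```

The idea would be to run this once against the volume before the instance pods start, so the non-root startup container finds directories it already owns.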