Open nnachefski opened 1 year ago
Which version of PGO and OpenShift are you using?
We are experiencing the same on several of our OKD clusters (OKD 4.9, 4.10, 4.11) with PGO 5.x.x. We can narrow the scope, though: it only happens when you run multiple services that use the "default" namespace. If PGO is installed in an otherwise empty namespace, the default serviceAccount works.
The (manual) workaround for running it alongside other services in the same namespace is to set the serviceAccount in the "repo-host" StatefulSet to the one generated by PGO.
Hi!
I'm experiencing the same issue. The postgresCluster CR has no property for setting the serviceAccount for pgbackrest, so I have to assign SCCs to the default serviceAccount. Running OKD 4.11, operator 5.3.0.
This issue is still happening on OKD 4.12 with CrunchyData 5.3.0
The problem manifests itself in the pgbackrest-log-dir initContainer.
Here is the workaround for now (change the sts and serviceAccount names to whatever yours are called):
oc patch sts airflow-repo-host --type=merge -p '{"spec":{"template":{"spec":{"initContainers":[{"name":"pgbackrest-log-dir"}],"serviceAccountName":"airflow-pgbackrest"}}}}'
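A variation on the patch above, as a minimal sketch rather than a confirmed fix: it sets only serviceAccountName and leaves initContainers out, since with --type=merge a list in the patch replaces the existing list as a whole. The names airflow-repo-host and airflow-pgbackrest come from the command above; the variable names STS_NAME, SA_NAME and PATCH are made up for illustration. Whether the operator later reverts the change is not something this thread confirms.

```shell
#!/bin/sh
# Names copied from the command above; change them to match your cluster.
STS_NAME="airflow-repo-host"
SA_NAME="airflow-pgbackrest"

# Build a merge patch that touches only serviceAccountName.
PATCH=$(printf '{"spec":{"template":{"spec":{"serviceAccountName":"%s"}}}}' "$SA_NAME")

# Print the command for review instead of running it right away.
echo "oc patch sts $STS_NAME --type=merge -p '$PATCH'"
```

Printing the command first makes it easy to check the names before anything is applied to the cluster.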
Thank you @nnachefski, it helped me a lot. The pod runs and backups are fine, but it cannot write logs, because the OpenShift uid doesn't have write access to the log dir:
sh-4.4$ ls -la /pgbackrest/repo1/log/
total 0
drwxr-xr-x. 2 26 26 0 Jun 5 10:06 .
I think the uid of the postgres user is 26 in the image, but we use the OpenShift uid here.
I just wanted to highlight this in case someone else finds this issue and the workaround, as I did. I hope Crunchy will fix this soon.
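To confirm the uid mismatch described above from inside the repo-host pod, something like the following could be used. It is only a sketch: check_log_dir is a made-up helper name, and it assumes the image provides a POSIX sh together with id and ls.

```shell
#!/bin/sh
# Report who we run as, who owns the directory, and whether we can write to it.
# Returns 0 when the directory is writable by the current user, 1 otherwise.
check_log_dir() {
    dir="$1"
    echo "running as uid=$(id -u) gid=$(id -g)"
    # ls -ldn prints the numeric owner and group, like the listing above.
    ls -ldn "$dir"
    if [ -w "$dir" ]; then
        echo "writable: $dir"
        return 0
    fi
    echo "not writable: $dir"
    return 1
}
```

Inside the pod it would be called as check_log_dir /pgbackrest/repo1/log. If the reported uid differs from the numeric owner and the directory mode is drwxr-xr-x, the write failure is expected.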
I have a question about the use of this service account. In my case the repo-host pod is using the default service account and I am getting this error:
option 'repo1-s3-key-type' is 'web-id' but 'AWS_ROLE_ARN' and 'AWS_WEB_IDENTITY_TOKEN_FILE' are not set
I believe it has to do with the pod using the default service account, whilst the other pod is using the helix-instance service account (my Postgres cluster is called helix), which does have AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. Is this pod meant to use the default service account, or the helix-instance service account like the other pod? I am using AWS EKS and trying to back up to AWS S3. Please help.
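One way to see which pod actually has the two variables named in the error is to run a small check in each container. This is a sketch only: check_web_identity_env is a hypothetical helper, and it assumes a POSIX sh is available in the image.

```shell
#!/bin/sh
# Returns 0 only when both variables named in the error are set
# and the token file they point to is readable.
check_web_identity_env() {
    missing=0
    if [ -z "${AWS_ROLE_ARN:-}" ]; then
        echo "AWS_ROLE_ARN is not set"
        missing=1
    fi
    if [ -z "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ]; then
        echo "AWS_WEB_IDENTITY_TOKEN_FILE is not set"
        missing=1
    elif [ ! -r "$AWS_WEB_IDENTITY_TOKEN_FILE" ]; then
        echo "token file is not readable: $AWS_WEB_IDENTITY_TOKEN_FILE"
        missing=1
    fi
    return "$missing"
}
```

Running it in both the instance pod and the repo-host pod should show whether only the repo-host pod is missing the variables, which would match the error above.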
I'm experiencing the same issue. I tested an awscli pod with the service account I created for use with the operator. I've used plenty of IRSA pods in other places and I know it works.
I can't get it to work with pgbackrest. I can override the repo-host-0 pod with the serviceAccountName, but I still get the error:
option 'repo2-s3-key-type' is 'web-id' but 'AWS_ROLE_ARN' and 'AWS_WEB_IDENTITY_TOKEN_FILE' are not set
If I apply the metadata to all the service accounts so that the database pods are now using a service account with the AWS_ROLE_ARN set, I get this error:
ERROR: [029]: unable to find child 'AssumeRoleWithWebIdentityResult':0
I tried to fiddle with the Trust Relationship configuration (see: #3135), but that doesn't seem to fix it.
I created a DB using CrunchyData (named "tracking"), but I also have the "anyuid" policy set for the project's 'default' ServiceAccount. The initContainer ("pgbackrest-log-dir") in the "tracking-repo-host" StatefulSet failed to deploy, citing:
mkdir: cannot create directory ‘/pgbackrest/repo1/log’: Permission denied
If I remove the 'anyuid' ClusterRoleBinding from the 'default' serviceAccount and try again, it works fine.
-Nick