Open jeffgus opened 1 month ago
Hi, we have the same (or a similar) issue. We are running OKD on AWS. OKD version: 4.15.0-0.okd-2024-03-10-010116
Log:
time="2024-10-29T13:20:06Z" level=info msg="crunchy-pgbackrest starts"
time="2024-10-29T13:20:06Z" level=info msg="debug flag set to false"
time="2024-10-29T13:20:06Z" level=info msg="backrest backup command requested"
time="2024-10-29T13:20:06Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1]"
time="2024-10-29T13:20:07Z" level=info msg="output=[]"
time="2024-10-29T13:20:07Z" level=info msg="stderr=[ERROR: [031]: option 'repo1-s3-key-type' is 'web-id' but 'AWS_ROLE_ARN' and 'AWS_WEB_IDENTITY_TOKEN_FILE' are not set\n]"
time="2024-10-29T13:20:07Z" level=fatal msg="command terminated with exit code 31"
But these environment variables are in place; I checked in a debug pod. I also checked the web identity token and role with the AWS CLI, and I was able to upload files to the bucket.
Can somebody help? It seems that the error message is misleading and that there are other issues behind the scenes. Without a proper log message we cannot continue debugging.
Thanks, Jvincze84
I think the issue is how the backup runs. When I set the annotation, the cronjob runs with AWS_ROLE_ARN, etc. set. When I remove the "volume" from the S3 repo definition, the operator complains:
Stanza not created for \"repo2\" as specified for a scheduled backup
I don't think S3 repos should have a volume section, but I can't make the operator write out the config without one. When the repo has a volume, the backup goes through the repo host, which does NOT have AWS_ROLE_ARN set.
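To confirm which container is actually missing the variables, a small check like the one below could be run in both the backup job pod and the repo host. This is only a sketch: it assumes Python is available in the container, and it only inspects the environment of the process that runs it, which may differ from the environment pgBackRest sees. The function name `check_web_identity_env` is made up for illustration.

```python
import os

# The two variables named in pgBackRest's error message for key type 'web-id'.
REQUIRED_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def check_web_identity_env(env=None):
    """Return a list of problems found; an empty list means the basics are in place."""
    env = os.environ if env is None else env
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    token_path = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if token_path:
        try:
            # The projected token must be readable by the user running the backup.
            with open(token_path, "r", encoding="utf-8") as handle:
                if not handle.read().strip():
                    problems.append(f"token file {token_path} is empty")
        except OSError as exc:
            problems.append(f"token file {token_path} is not readable: {exc}")
    return problems


if __name__ == "__main__":
    issues = check_web_identity_env()
    print("\n".join(issues) if issues else "web identity environment looks complete")
```

If the job pod reports nothing missing but the repo host does, that would match the behavior described above.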
Overview
I'm unable to get the backup to S3 to work with a service account and IAM role (IRSA).
Environment
Steps to Reproduce
1. Create an IAM role in AWS with a Trust Relationship.
2. Make sure that the ServiceAccounts are annotated.
3. Set repo2-s3-key-type = web-id.
4. Set the bucket name, region, and endpoint.
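For reference, the trust relationship from the steps above usually has the shape sketched below. This is a generic web-identity trust policy, not the one used in this report: the account ID, OIDC provider host, namespace, and ServiceAccount name are placeholders, and `build_trust_policy` is a hypothetical helper. The `sub` condition has to match the ServiceAccount of whichever pod actually calls S3.

```python
import json


def build_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build an IAM trust policy that lets one ServiceAccount assume the role."""
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # The token's subject claim identifies the ServiceAccount.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    policy = build_trust_policy(
        "123456789012", "oidc.example.com/id/EXAMPLE", "postgres", "example-backup-sa"
    )
    print(json.dumps(policy, indent=2))
```

If the backup runs through the repo host, the ServiceAccount used by that pod would need to be covered by the condition as well.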
I set s3.conf to be:
I'm not sure if these settings belong in the s3.conf file or the main config file. I've tried both.
EXPECTED
pgBackRest should be able to find the token to communicate with the S3 bucket.
ACTUAL
I get one of two errors. The first says that the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE env vars are missing. If I override the metadata for all ServiceAccounts and edit the repo-host StatefulSet to set the serviceAccountName, that error goes away. It is replaced with:
command terminated with exit code 29: ERROR: [029]: unable to find child 'AssumeRoleWithWebIdentityResult':0 in node 'ErrorResponse'
Logs
command terminated with exit code 31: ERROR: [031]: option 'repo2-s3-key-type' is 'web-id' but 'AWS_ROLE_ARN' and 'AWS_WEB_IDENTITY_TOKEN_FILE' are not set
or
command terminated with exit code 29: ERROR: [029]: unable to find child 'AssumeRoleWithWebIdentityResult':0 in node 'ErrorResponse'
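The 029 error reads as if the token service answered with an XML `ErrorResponse` document instead of the expected `AssumeRoleWithWebIdentityResult`, so the real reason is probably inside that response. If the raw body can be captured by some other means, a sketch like this would pull out the error code and message. The sample XML is invented for illustration and is not taken from these logs; `summarize_sts_response` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample of an STS-style error body; NOT taken from the logs in this issue.
SAMPLE_ERROR = """<ErrorResponse xmlns="https://sts.amazonaws.com/doc/2011-06-15/">
  <Error>
    <Type>Sender</Type>
    <Code>AccessDenied</Code>
    <Message>Not authorized to perform sts:AssumeRoleWithWebIdentity</Message>
  </Error>
  <RequestId>00000000-0000-0000-0000-000000000000</RequestId>
</ErrorResponse>"""


def _local(tag):
    """Strip the XML namespace prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def summarize_sts_response(xml_text):
    """Report whether the body is an ErrorResponse and, if so, its code and message."""
    root = ET.fromstring(xml_text)
    if _local(root.tag) != "ErrorResponse":
        return {"ok": True, "root": _local(root.tag)}
    fields = {
        _local(el.tag): (el.text or "").strip()
        for el in root.iter()
        if _local(el.tag) in ("Code", "Message")
    }
    return {"ok": False, "code": fields.get("Code"), "message": fields.get("Message")}


if __name__ == "__main__":
    print(summarize_sts_response(SAMPLE_ERROR))
```

A code such as `AccessDenied` would point at the trust relationship, while a token-related code would point at the ServiceAccount token being presented.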
Additional Information
This is similar to #3135 and #3472, but these issues are old and things have changed.
I tried to tweak the role trust relationship rule and it doesn't seem to make a difference. I can run a container with awscli with the same serviceAccount and it works fine.
I can also try to run pgbackrest on the repo-node manually. It fails to back up properly (which is expected), but it DOES communicate with S3 and creates the backup.info file.
What is the correct configuration for this to work?