planetscale / vitess-operator

Kubernetes Operator for Vitess
Apache License 2.0

AWS IAM Role for Service Account support #383

Open klagroix opened 1 year ago

klagroix commented 1 year ago

We're trying to use vitess-operator with an S3 backup spec defined in our VitessCluster manifest. Example as follows:

spec:  
  backup:
    engine: xtrabackup
    locations:
      - s3:
          bucket: my-s3-bucket
          region: us-east-1

From the logs, it appears that vttablet attempts to read the backup S3 bucket to see if it needs to restore from the latest backup. As such, we need to provide AWS credentials so the pod can read the S3 bucket.

Typically, we use IAM Roles for Service Accounts (IRSA), which allows annotating a Service Account with an IAM Role ARN: https://docs.aws.amazon.com/eks/latest/userguide/associate-service-account-role.html
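
For reference, a minimal sketch of what an IRSA-annotated Service Account looks like. The name, account ID, and role name are placeholders rather than values from our cluster; the annotation key is the one described in the EKS documentation linked above.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vitess-backup            # placeholder name
  namespace: dev
  annotations:
    # Placeholder account ID and role name; the role would need
    # read access to the backup bucket.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/vitess-backup-role
```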

The issue here is that vitess-operator seems to see these AWS_* environment variables on the pod as a difference from the spec it generated, and attempts to remove them with $patch: delete.

Example log taken from vitess-operator:

time="2023-02-14T22:12:16Z" level=info msg="Updating object in place" diff="metadata:\n  annotations:\n    planetscale.com/observed-shard-generation: \"1\"\n    rollout.planetscale.com/scheduled: |\n      spec:\n        containers:\n        - $setElementOrder/env:\n          - name: VTROOT\n          - name: VTDATAROOT\n          - name: VT_MYSQL_ROOT\n          - name: MYSQL_FLAVOR\n          - name: EXTRA_MY_CNF\n          - name: POD_IP\n          env:\n          - $patch: delete\n            name: AWS_DEFAULT_REGION\n          - $patch: delete\n            name: AWS_REGION\n          - $patch: delete\n            name: AWS_ROLE_ARN\n          - $patch: delete\n            name: AWS_STS_REGIONAL_ENDPOINTS\n          - $patch: delete\n            name: AWS_WEB_IDENTITY_TOKEN_FILE\n          name: vttablet\n        - $setElementOrder/env:\n          - name: VTROOT\n          - name: VTDATAROOT\n          - name: VT_MYSQL_ROOT\n          - name: MYSQL_FLAVOR\n          - name: EXTRA_MY_CNF\n          - name: POD_IP\n          env:\n          - $patch: delete\n            name: AWS_DEFAULT_REGION\n          - $patch: delete\n            name: AWS_REGION\n          - $patch: delete\n            name: AWS_ROLE_ARN\n          - $patch: delete\n            name: AWS_STS_REGIONAL_ENDPOINTS\n          - $patch: delete\n            name: AWS_WEB_IDENTITY_TOKEN_FILE\n          name: mysqld\n        - $setElementOrder/env:\n          - name: DATA_SOURCE_NAME\n          env:\n          - $patch: delete\n            name: AWS_DEFAULT_REGION\n          - $patch: delete\n            name: AWS_REGION\n          - $patch: delete\n            name: AWS_ROLE_ARN\n          - $patch: delete\n            name: AWS_STS_REGIONAL_ENDPOINTS\n          - $patch: delete\n            name: AWS_WEB_IDENTITY_TOKEN_FILE\n          name: mysqld-exporter\n        initContainers:\n        - env: null\n          name: init-vt-root\n        - env: null\n          name: init-mysql-socket\n" gvk="/v1, Kind=Pod" key=dev/dev-vitess-vttablet-useast1a-OMITTED

Is there any way to force vitess-operator to ignore the AWS_* environment variable discrepancies?

For context, these tests were performed on an AWS-hosted EKS cluster using vitess-operator v2.9.0-rc1

GuptaManan100 commented 1 year ago

How are you adding the environment variables? Are you adding them to the pod directly? Did you try adding them to the tabletPools config as extraEnv? You can specify extra environment variables there, and they are added to all the mysqld and vttablet pods.
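
As a rough sketch of that suggestion, extraEnv on a tablet pool might look like the fragment below. The keyspace name, cell name, and variable are placeholders, other required tablet pool fields are left out, and the exact nesting should be checked against the VitessCluster API reference for your operator version.

```yaml
spec:
  keyspaces:
    - name: commerce                 # placeholder keyspace name
      partitionings:
        - equal:
            parts: 1
            shardTemplate:
              tabletPools:
                - cell: useast1a     # placeholder cell name
                  type: replica
                  replicas: 2
                  # Other required tablet pool fields omitted for brevity.
                  extraEnv:
                    - name: EXAMPLE_VAR        # placeholder variable
                      value: "example-value"
```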

klagroix commented 1 year ago

Hello, when using AWS EKS and IAM Roles for Service Accounts (IRSA), service accounts are annotated with an IAM role ARN.

EKS automatically injects AWS_* environment variables into pods that are using the service account.
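
To illustrate, the env section of a container in a pod that uses such a service account ends up looking roughly like the sketch below. The variable names match the ones in the operator log above; the account ID and role name are placeholders, and the token path and the "regional" value are what I'd expect from EKS defaults rather than values copied from our pods.

```yaml
# Fragment of a container spec after EKS has mutated the pod (sketch).
env:
  - name: AWS_STS_REGIONAL_ENDPOINTS
    value: regional
  - name: AWS_DEFAULT_REGION
    value: us-east-1
  - name: AWS_REGION
    value: us-east-1
  - name: AWS_ROLE_ARN
    # Placeholder account ID and role name.
    value: arn:aws:iam::111122223333:role/vitess-backup-role
  - name: AWS_WEB_IDENTITY_TOKEN_FILE
    # Assumed default path of the projected service account token.
    value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
```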

As these AWS_* values are dynamic, I cannot add them as extraEnv.

I did try adding fake/placeholder values for these environment variables in extraEnv; however, vitess-operator still saw a difference and attempted to re-create the pods.

It would be nice if we could tell vitess-operator to ignore specific environment variables from the diff.