kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes
https://keda.sh
Apache License 2.0

Support IRSA for SQS Scalar #837

Closed NasAmin closed 3 years ago

NasAmin commented 4 years ago


We run our Kubernetes cluster on EKS. To let our pods authenticate against AWS and access AWS services, we use IAM Roles for Service Accounts (IRSA). We'd like to use the same approach for the KEDA operator, so that the scaler can obtain its AWS credentials from the operator.

Specification

Warning  FailedGetExternalMetric       81s (x60 over 16m)  horizontal-pod-autoscaler  unable to get external metric default/AWS-SQS-Queue-ApproximateNumberOfMessages-cluster-AuditEventsQueue/&LabelSelector{MatchLabels:map[string]string{deploymentName: my-deployment,},MatchExpressions:[],}: unable to fetch metrics from external metrics API: No matching metrics found for aws-sqs-queue-approximatenumberofmessages-cluster-auditeventsqueue
{"level":"debug","ts":1589916310.2131512,"logger":"scalehandler","msg":"Error getting scale decision","ScaledObject.Namespace":"default","ScaledObject.Name":"my-sqs-queue-scaledobject","ScaledObject.ScaleType":"deployment","Error":"WebIdentityErr: unable to read file at /var/run/secrets/eks.amazonaws.com/serviceaccount/token\ncaused by: open /var/run/secrets/eks.amazonaws.com/serviceaccount/token: permission denied"}

I suspect this may be because the SQS scaler isn't using the right SDK version.

I'd really appreciate some help with this.

Regards

Nas

ahmelsayed commented 4 years ago

According to go.sum, we're using aws-sdk-go v1.25.6, which should be fine. The error message is about not being able to read the file /var/run/secrets/eks.amazonaws.com/serviceaccount/token, which, according to this, should be injected into the deployment in the form of

  env:
  - name: AWS_ROLE_ARN
    value: arn:aws:iam::123456789012:role/eksctl-irptest-addon-iamsa-default-my-serviceaccount-Role1-UCGG6NDYZ3UE
  - name: AWS_WEB_IDENTITY_TOKEN_FILE
    value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
  volumeMounts:
  - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
    name: aws-iam-token
    readOnly: true
volumes:
- name: aws-iam-token
  projected:
    defaultMode: 420
    sources:
    - serviceAccountToken:
        audience: sts.amazonaws.com
        expirationSeconds: 86400
        path: token

If you do

kubectl get deployment keda-operator -n keda -o yaml

do you see those added on the deployment?
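For context, the EKS pod identity webhook only injects those fields when the pod's service account carries the role annotation. A minimal sketch of such a service account follows; the account name and the role ARN are placeholders I am assuming for illustration, not values from this cluster.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: keda-operator   # assumed service account name
  namespace: keda
  annotations:
    # The webhook looks for this annotation and, when present, adds
    # AWS_ROLE_ARN, AWS_WEB_IDENTITY_TOKEN_FILE and the projected
    # aws-iam-token volume to pods that use this service account.
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/my-keda-sqs-role   # placeholder ARN
```

Pods created before the annotation was added need to be recreated, since the injection happens at pod admission.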

NasAmin commented 4 years ago

@ahmelsayed Thanks for the quick response. Yes EKS automatically injects AWS credentials into the pod (not deployment). When I describe the keda operator pod, I get the following

    env:
    - name: WATCH_NAMESPACE
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: OPERATOR_NAME
      value: keda-operator
    - name: AWS_ROLE_ARN
      value: arn:aws:iam::032356282346:role/nas-dev-pod-sqs-access
    - name: AWS_WEB_IDENTITY_TOKEN_FILE
      value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    image: docker.io/kedacore/keda:1.4.1
    imagePullPolicy: Always
    name: keda-operator
    resources: {}
    securityContext: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: keda-operator-token-crlp9
      readOnly: true
    - mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
      name: aws-iam-token
      readOnly: true

As you can see, the pod does seem to have the right credentials mounted. So it would seem that the pod does not actually have permission to access that path. When I open a shell in the pod and try to read that path, I get a permission denied error:

bash-4.4$ ls /var/run/secrets/eks.amazonaws.com/serviceaccount/token
/var/run/secrets/eks.amazonaws.com/serviceaccount/token
bash-4.4$ cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token
cat: /var/run/secrets/eks.amazonaws.com/serviceaccount/token: Permission denied
bash-4.4$ cat /var/run/secrets/eks.amazonaws.com/serviceaccount/token

So I am not really sure what I can do. I would appreciate any help.

Thanks

NasAmin commented 4 years ago

I would really appreciate it if anyone could help. Currently we are using a custom autoscaler based on a public GitHub repo. We'd like to get away from that and use KEDA where possible.

I originally created this issue as a feature request, but it seems that IRSA should already be supported. Can it be changed to a defect?

Regards,

Nas

ahmelsayed commented 4 years ago

I wonder if it's the same issue as https://github.com/aws/amazon-eks-pod-identity-webhook/issues/8

What do you see if you run the following inside the pod?

$ id

$ ls -alh /var/run/secrets/eks.amazonaws.com/serviceaccount/

The KEDA container doesn't run as root by default.

There is a workaround described here https://github.com/kubernetes-sigs/external-dns/pull/1185#issuecomment-530439786 but I haven't verified it.

RaymondKYLiu commented 4 years ago

Hi,

I ran into an IRSA problem with Grafana; I'm not sure whether the KEDA issue is similar: https://github.com/grafana/grafana/issues/20473#issuecomment-638268104

The solution there was to add a securityContext.

Could you try adding this to the KEDA operator?

securityContext:
  fsGroup: 1001
  runAsGroup: 1001
  runAsUser: 1001

ben11211 commented 4 years ago

Can confirm this works with IRSA with the following

zroubalik commented 4 years ago

@ben11211 thanks! Would you mind contributing this info to the Troubleshooting guide? Thanks!

https://keda.sh/docs/2.0/troubleshooting/

mzupan commented 3 years ago

The one thing that tripped me up when using the Helm chart was assuming that setting the contexts here would work:

https://github.com/kedacore/charts/blob/master/keda/values.yaml#L76

That sets the context for the containers, but the securityContext values need to go in the pod-level section. I had to fork the Helm chart to make that change.
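To illustrate the distinction, here is a minimal sketch of a Deployment fragment. It is not the chart's actual template: the `app` label is assumed, the container name and image come from the pod description earlier in this thread, and the IDs come from the securityContext suggested above. In the Kubernetes API, `fsGroup` exists only on the pod-level `securityContext`, which is why putting these values under the container does not change the group ownership of the projected token volume.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keda-operator
  namespace: keda
spec:
  selector:
    matchLabels:
      app: keda-operator   # assumed label, for illustration only
  template:
    metadata:
      labels:
        app: keda-operator
    spec:
      # Pod-level context: fsGroup is only valid here. It sets the group
      # ownership of mounted volumes, including the projected IRSA token.
      securityContext:
        fsGroup: 1001
        runAsGroup: 1001
        runAsUser: 1001
      containers:
      - name: keda-operator
        image: docker.io/kedacore/keda:1.4.1
        # Container-level context: there is no fsGroup field at this level.
        securityContext: {}
```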

zroubalik commented 3 years ago

@mzupan keen to send a PR for this?

NasAmin commented 3 years ago

Sorry for taking such a long time to get back to this. I can confirm that setting podSecurityContext fixes my problem. Given that:

Closing this issue

newb1e commented 4 months ago

In my case, I was missing "identityOwner: keda" on the TriggerAuthentication object; adding it made it work.

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-aws-credentials
  namespace: default
spec:
  podIdentity:
    provider: aws
    identityOwner: keda
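
For completeness, a TriggerAuthentication only takes effect once a trigger references it. Below is a rough sketch of a ScaledObject doing so, assuming KEDA v2 and the `aws-sqs-queue` scaler; the object and deployment names are borrowed from the logs at the top of this thread, while the queue URL, region, and queue length are placeholders. Check the field names against the scaler documentation for your KEDA version.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-sqs-queue-scaledobject
  namespace: default            # same namespace as the TriggerAuthentication
spec:
  scaleTargetRef:
    name: my-deployment         # the Deployment to scale
  triggers:
  - type: aws-sqs-queue
    authenticationRef:
      name: keda-trigger-auth-aws-credentials   # the object defined above
    metadata:
      queueURL: https://sqs.eu-west-1.amazonaws.com/123456789012/my-queue   # placeholder
      queueLength: "5"          # placeholder target messages per replica
      awsRegion: eu-west-1      # placeholder region
```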