aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

EKS IAM Roles for Service Accounts (Pods) #23

Closed: pauncejones closed this issue 5 years ago

pauncejones commented 5 years ago

Update 1/9/19:

After talking about this internally, we've been working on a proposed solution for this. Below is a writeup on what we're thinking, and we've included some example scripts so you can get a feel for how we expect this to work.

Our plan for IAM and Kubernetes integration

A recent Kubernetes feature, TokenRequestProjection, allows users of Kubernetes to mount custom projected service account tokens in their pods. A “projected service account token” is a bearer token that is intended for use outside of the cluster. Conveniently, these projected service account tokens are also valid OpenID Connect (OIDC) tokens. AWS IAM has supported OIDC as a federated identity provider since 2014, which has allowed customers to use an external identity to assume an IAM role.

By combining these two features, an application running in a pod can pass the projected service account token along with a role ARN to the STS API AssumeRoleWithWebIdentity, and get back temporary role credentials! In order for this to work properly, there is some setup required to create an OIDC provider, and update an IAM role's trust policy so that the Kubernetes service account for a particular cluster is permitted to assume the role.
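
To make this concrete, here is a minimal sketch (assuming Python with boto3; the token path and role ARN below are hypothetical placeholders) of the call an application in a pod could make once a projected token is mounted:

import boto3

# Hypothetical values for illustration; the projected service account token
# is mounted into the pod by the kubelet (see the volume example further down).
TOKEN_PATH = "/var/run/secrets/something/serviceaccount/token"
ROLE_ARN = "arn:aws:iam::111122223333:role/my-pod-role"

def assume_role_with_projected_token():
    # Read the OIDC-compatible projected service account token.
    with open(TOKEN_PATH) as f:
        web_identity_token = f.read()

    # Exchange the token plus a role ARN for temporary role credentials.
    # AssumeRoleWithWebIdentity is an unsigned call, so no credentials are needed.
    sts = boto3.client("sts")
    response = sts.assume_role_with_web_identity(
        RoleArn=ROLE_ARN,
        RoleSessionName="my-pod-session",
        WebIdentityToken=web_identity_token,
        DurationSeconds=3600,
    )
    # Contains AccessKeyId, SecretAccessKey, SessionToken, and Expiration.
    return response["Credentials"]

if __name__ == "__main__":
    credentials = assume_role_with_projected_token()
    print("Credentials expire at", credentials["Expiration"])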

Some of the advantages of this approach are that any pod (including host pods) can assume a role, there is no reliance on Kubernetes annotations for security, no extra processes need to run on the nodes, and you will be able to have nodes without any IAM permissions of their own.

In the coming months we will be building out functionality in EKS to create and manage OIDC providers for EKS clusters, as well as configuring IAM roles that can be used in an EKS cluster. We will also be adding support for this authentication mechanism in the AWS SDKs.

Totally open for comments, questions or suggestions on this -- let us know in the comments!

Micah Hausler (@micahhausler), System Development Engineer on EKS

christopherhein commented 5 years ago

~Exciting to see this get so much attention. Here is an implementation that was brought up in sig-aws back in July of this year; for those of you interested, providing feedback will help guide the implementation. https://github.com/kubernetes/community/pull/2329~

We'll publish more about our approach soon.

:+1:

gtaylor commented 5 years ago

Ahh, I was looking for that.

Will that KEP eventually be moved to https://github.com/kubernetes/enhancements ? It looks like https://github.com/kubernetes/community/pull/2329 was closed due to KEPs being moved out to k/enhancements, which seems to have halted discussion and consideration.

christopherhein commented 5 years ago

@gtaylor that was actually incorrect. Sorry about that. That was another implementation from the community. We'll have more details about our implementation coming out soon. Sorry for the confusion.

cpaika commented 5 years ago

Big fan of this - our organization can't adopt EKS until this is resolved.

sbkg0002 commented 5 years ago

Same here, glad this is shared upfront.

007 commented 5 years ago

https://github.com/jtblin/kube2iam

gtaylor commented 5 years ago

@007 kube2iam can not handle rapid pod churn and lacks some controls for selectively limiting metadata server exposure. It is not a complete, final solution to this problem.

Source: have used kube2iam in production at a large scale.

Vlaaaaaaad commented 5 years ago

@gtaylor : did you try kiam too? Did you find a workaround for the rapid pod churn issues?

I'm in the process of implementing some very spiky workloads and I'm trying to prepare the best I can.

gtaylor commented 5 years ago

I think we are going to stick it out for the "final" solution (the one this issue is tracking).

We had looked at kiam but aren't hurting badly enough to the point of having to make such a large change (for us). That might change, though. Kiam is probably where we'll go if we end up in a spot where kube2iam becomes untenable.

oulydna commented 5 years ago

my EKS friends, any rough ETA on this one?

micahhausler commented 5 years ago

@realAndyLuo "Working On It" https://github.com/aws/containers-roadmap/projects/1 :)

oulydna commented 5 years ago

thanks @micahhausler . Does "Working On It" come with any target date? Or is that too much of a spoiler to ask for?

skyzyx commented 5 years ago

@realAndyLuo: Never. As a former Amazonian, I can tell you that it'll be ready when it's ready. "Working on it" is as close as you'll ever get to a time commitment.

Cheers. 👍

mikkeloscar commented 5 years ago

I have been working on a replacement for kube2iam/kiam in the form of https://github.com/mikkeloscar/kube-aws-iam-controller. So far it has focused only on robustness and doesn't have features to restrict which roles can be requested within a cluster (there are open issues for that). It also only works with some of the AWS SDKs, but it eliminates all the race conditions that are inherent in the design of kube2iam and kiam.

Maybe it's interesting for some of you.

christopherhein commented 5 years ago

Updated description by @micahhausler

cc @gtaylor @cpaika @sbkg0002 @realAndyLuo @007 @mikkeloscar

cullenmcdermott commented 5 years ago

The new proposal looks interesting. Quick question though, how would I get/distribute the tokens? Would each token map to one role in IAM?

mikkeloscar commented 5 years ago

By combining these two features, an application running in a pod can pass the projected service account token along with a role ARN to the STS API AssumeRoleWithWebIdentity, and get back temporary role credentials! In order for this to work properly, there is some setup required to create an OIDC provider, and update an IAM role's trust policy so that the Kubernetes service account for a particular cluster is permitted to assume the role.

Does this mean that applications have to actively implement this, or would the AWS SDK automatically do it? What I wanted to avoid with https://github.com/mikkeloscar/kube-aws-iam-controller is applications needing to implement a custom SDK setup for running on Kubernetes. It should just work out of the box whether you run the application on bare EC2, on Kubernetes, or in any other AWS-like environment, IMO. If this is not the case, then there will be a long tail of open-source applications that need to be updated to support this.

micahhausler commented 5 years ago

@cullenmcdermott

The new proposal looks interesting. Quick question though, how would I get/distribute the tokens? Would each token map to one role in IAM?

Projected service account tokens are issued via the API server, and mounted via the kubelet. You can add a projected token today on newer versions of Kubernetes by using the projected volume type.

kind: Pod
apiVersion: v1
metadata: 
  name: pod-name
  namespace: default
spec:
  serviceAccountName: default
  containers: 
  - name: container-name
    image: container-image:version
    volumeMounts:
    - mountPath: "/var/run/secrets/something/serviceaccount/"
      name: projected-token
  volumes:
  - name: projected-token
    projected:
      sources:
      - serviceAccountToken:
          audience: "client-id"
          expirationSeconds: 86400
          path: token 

The thinking right now is you would add an annotation to either the ServiceAccount or the Pod (not totally decided yet) with the IAM role ARN, and the token volume, volumeMount, and required AWS environment variables (variable names TBD, but the SDKs will need a role ARN and token path) would get added via a mutating webhook.
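
As a rough illustration only (not the actual implementation; the annotation handling, mount path, audience, and environment variable names below are placeholders, since as noted they're still TBD), the mutation could boil down to a JSON patch along these lines:

import json

MOUNT_PATH = "/var/run/secrets/something/serviceaccount/"  # placeholder path

def build_pod_patch(role_arn):
    # JSON 6902 patch a mutating webhook might apply to a pod whose
    # ServiceAccount (or Pod) carries a hypothetical role-ARN annotation.
    # Assumes the container already has env and volumeMounts lists.
    return [
        # Add a projected service account token volume.
        {"op": "add", "path": "/spec/volumes/-", "value": {
            "name": "aws-token",
            "projected": {"sources": [{"serviceAccountToken": {
                "audience": "client-id",        # placeholder audience
                "expirationSeconds": 86400,
                "path": "token"}}]}}},
        # Mount it into the first container.
        {"op": "add", "path": "/spec/containers/0/volumeMounts/-", "value": {
            "name": "aws-token", "mountPath": MOUNT_PATH}},
        # Point the SDK at the role ARN and token (variable names are TBD upstream).
        {"op": "add", "path": "/spec/containers/0/env/-", "value": {
            "name": "AWS_ROLE_ARN", "value": role_arn}},
        {"op": "add", "path": "/spec/containers/0/env/-", "value": {
            "name": "AWS_WEB_IDENTITY_TOKEN_FILE", "value": MOUNT_PATH + "token"}},
    ]

if __name__ == "__main__":
    print(json.dumps(build_pod_patch("arn:aws:iam::111122223333:role/my-pod-role"), indent=2))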

On a high level the user workflow would look like this:

@mikkeloscar

Does this mean that applications have to actively implement this, or would the AWS SDK automatically do it?

It would be automatic with new versions of the SDK.

pingles commented 5 years ago

This sounds cool, we'll definitely be looking to adopt (I say that as one of the creators of https://github.com/uswitch/kiam) 😀 Glad to see this in the roadmap.

Given the SDK update requirement, we'd probably have to run side-by-side for a while as all our teams update their apps and libs etc., but it sounds like that's doable too, so all good to me. Thanks to the team there for thinking on it and not just taking the first suggestion!

mustafaakin commented 5 years ago

Would this be possible without upgrading all of the AWS SDKs? It would be nice if this component of the SDKs, at least for Java, could be a separate component until we can upgrade.

micahhausler commented 5 years ago

@mustafaakin for applications that couldn't transition right away, you could run a sidecar that would perform the sts:AssumeRoleWithWebIdentity call and expose those credentials on a localhost HTTP endpoint within the pod. You'd have to configure the application container to use the sidecar by setting the environment variable AWS_CONTAINER_CREDENTIALS_FULL_URI.
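
For illustration only (this is not an official component; the token path, role ARN, and port here are assumptions), such a sidecar could be as small as this Python sketch, which serves credentials in the same JSON shape the SDKs expect from that endpoint:

import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

import boto3

# Assumed inputs; both would need to be provided to the sidecar somehow.
TOKEN_PATH = os.environ.get("TOKEN_PATH", "/var/run/secrets/something/serviceaccount/token")
ROLE_ARN = os.environ["ROLE_ARN"]

class CredentialHandler(BaseHTTPRequestHandler):
    # Responds with temporary credentials in the container-credentials format
    # (AccessKeyId, SecretAccessKey, Token, Expiration).
    def do_GET(self):
        with open(TOKEN_PATH) as f:
            token = f.read()
        creds = boto3.client("sts").assume_role_with_web_identity(
            RoleArn=ROLE_ARN,
            RoleSessionName="credential-sidecar",
            WebIdentityToken=token,
        )["Credentials"]
        body = json.dumps({
            "AccessKeyId": creds["AccessKeyId"],
            "SecretAccessKey": creds["SecretAccessKey"],
            "Token": creds["SessionToken"],
            "Expiration": creds["Expiration"].strftime("%Y-%m-%dT%H:%M:%SZ"),
        }).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # The application container would then set
    # AWS_CONTAINER_CREDENTIALS_FULL_URI=http://127.0.0.1:8080/
    HTTPServer(("127.0.0.1", 8080), CredentialHandler).serve_forever()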

gtaylor commented 5 years ago

Does this also apply to boto/boto3?

micahhausler commented 5 years ago

Yes, pretty much any SDK within the last 2 years would have AWS_CONTAINER_CREDENTIALS_FULL_URI support.

mikkeloscar commented 5 years ago

@mustafaakin for applications that couldn't transition right away, you could run a sidecar that would perform the sts:AssumeRoleWithWebIdentity call and expose those credentials on a localhost HTTP endpoint within the pod. You'd have to configure the application container to use the sidecar by setting the environment variable AWS_CONTAINER_CREDENTIALS_FULL_URI.

Isn't this just a recipe for race conditions? :) If your application container starts and requests the IAM role before the sidecar container has done assumeRole, then your application fails to get the credentials.

micahhausler commented 5 years ago

@mikkeloscar You are right, but I would also say it depends on the implementation of the application. Most AWS SDKs have a retry for metadata credential fetching, and some applications may not initialize the AWS SDK at startup. For those that do and exit, Kubernetes should restart that container while still bringing the sidecar online. It is not the optimal solution, but for cases where a newer SDK update is not immediately available, it could work.

schlomo commented 5 years ago

@mikkeloscar @micahhausler What is important for me - as a K8S and AWS user - is that it "just works" from a usage perspective. Just as aws sts get-caller-identity just works on EC2, I expect the same inside a K8S pod.

My biggest concern with the approach in this issue is whether all the AWS SDKs support re-reading the credential file from time to time, or whether there are some out there that assume the content of a credential file never changes. IMHO this is the big benefit of the EC2 metadata interface: everybody using it knows that the information and credentials obtained from it are temporary.

In a previous setting we had a good experience with pre-fetching IAM credentials as a solution to the 1-second timeout in the AWS SDKs.

chrisz100 commented 5 years ago

General question: is this going to be EKS-only, or will you open-source the solution so it can be deployed on custom Kubernetes installations on AWS as well?

micahhausler commented 5 years ago

EKS will have automatic setup, but this will have the capability to work with clusters on any provider.

arminc commented 5 years ago

I was wondering, does this mean we will be able to use a role in another account, or will I still need to do a secondary step and assume the role myself (inside the container, for example)? It sounds like cross-account role assumption might work, which would be nice.

micahhausler commented 5 years ago

Cross-account role assumption is definitely possible, but there would be some setup required in the other account. You'd have to create an OpenID Connect provider in the second account referencing the cluster's issuer URL, and update the trust policy on any roles in the second account to allow the ServiceAccount identity to assume them.
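
A rough sketch of that second-account setup, assuming Python with boto3 (the issuer URL, thumbprint, audience, and subject-claim format here are placeholders; the final details haven't been published yet):

import json
import boto3

# Placeholder values for illustration.
ISSUER_URL = "https://oidc.example.eks.amazonaws.com/id/CLUSTER_ID"  # the cluster's issuer URL
THUMBPRINT = "0000000000000000000000000000000000000000"             # CA thumbprint for the issuer
SERVICE_ACCOUNT_SUB = "system:serviceaccount:default:my-app"        # assumed subject claim format

# IAM client using credentials for the *second* (target) account.
iam = boto3.client("iam")

# 1) Register the cluster's issuer as an OIDC identity provider in this account.
provider = iam.create_open_id_connect_provider(
    Url=ISSUER_URL,
    ClientIDList=["sts.amazonaws.com"],  # assumed audience
    ThumbprintList=[THUMBPRINT],
)
provider_arn = provider["OpenIDConnectProviderArn"]

# 2) Create a role whose trust policy lets only that service account assume it.
issuer_host = ISSUER_URL.replace("https://", "")
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Federated": provider_arn},
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {"StringEquals": {issuer_host + ":sub": SERVICE_ACCOUNT_SUB}},
    }],
}

iam.create_role(
    RoleName="cross-account-pod-role",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)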

aavileli commented 5 years ago

Without this feature we cannot move to EKS

rtkgjacobs commented 5 years ago

kiam has been working superbly for us, but we're looking forward to a native AWS solution in EKS. Ideally it won't require application modification or newer AWS SDKs (fingers crossed), or I'd likely have us hold off moving from kiam until most assets out there are using the newer AWS SDKs.

whereisaaron commented 5 years ago

Hi @rtkgjacobs, sounds positive. Last I looked at kiam it expected the server component to run on master nodes, or at least not on a node with the kiam agent running. How do you handle that with EKS?

kiam components also require you to maintain internal client/server certificates, but there didn't seem to be a mechanism to rotate them? How do you handle that?

adaniline-traderev commented 5 years ago

@whereisaaron We had to run the kiam server and agents in privileged mode using the host network. We use cert-manager to generate client/server certificates, which automatically renews them. We have yet to see whether the kiam processes will require a restart after certificates are renewed.

rtkgjacobs commented 5 years ago

I also used cert-manager to manage the certs for kiam. Kiam does not have inotify / auto-reload if the certs change, and I don't think it has a /reload-style HTTP hook that a sidecar could trigger easily. We set the cert lifespan hoping it outlasts the arrival of AWS's native IAM solution for EKS in the immediate term. Supporting reload would be an ideal design pattern for it.

byrneo commented 5 years ago

@micahhausler will the OIDC provider (which gets auto-created by AWS) be dynamically configured as a federated IDP in the user account's IAM service? If so, would it be correct to think that there will be no 'actual' OIDC protocol interactions between k8s and the OIDC IDP? Is sts:AssumeRoleWithWebIdentity with the required parameters all we'll need?

btw: thanks for having this discussion in the open!

micahhausler commented 5 years ago

@byrneo Correct, the OIDC IDP will get created in the user's IAM service, but no actual OAuth2 flow will happen. EKS will host an IDP for the .well-known/openid-configuration and jwks_uri bits, but that will be transparent to the user; it's just so STS can verify the RSA/ECDSA signing key.
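
For context, those "bits" are just two well-known HTTP documents; a relying party (STS or anything else) only needs to fetch something like the following (the issuer URL is a placeholder):

import json
import urllib.request

ISSUER_URL = "https://oidc.example.eks.amazonaws.com/id/CLUSTER_ID"  # placeholder issuer

def fetch_json(url):
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

# The discovery document advertises the issuer and where to find its public keys.
discovery = fetch_json(ISSUER_URL + "/.well-known/openid-configuration")

# The JWKS document holds the RSA/ECDSA public keys used to verify token signatures.
jwks = fetch_json(discovery["jwks_uri"])

print("issuer: ", discovery["issuer"])
print("key IDs:", [key.get("kid") for key in jwks.get("keys", [])])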

lstoll commented 5 years ago

@micahhausler will the discovery bits be publicly exposed as well? We're already using OIDC-esque methods for auth outside of AWS, if we could re-use this it would be a huge win.

micahhausler commented 5 years ago

Yep! That is the plan. That way, you could configure projected tokens with alternate audiences (aka client_id) for use with other systems

gtaylor commented 5 years ago

Whoa, that's great. So to verify understanding: we're using Okta + OIDC on our self-hosted clusters now. In the future, EKS would allow us to continue using this in addition to STS?

micahhausler commented 5 years ago

@gtaylor For pods to authenticate with outside OIDC systems, yes. (This is not for user auth using OIDC to the Kubernetes API server.)

geota commented 5 years ago

Hi @rtkgjacobs, sounds positive. Last I looked at kiam it expected the server component to run on master nodes, or at least not on a node with the kiam agent running. How do you handle that with EKS?

We just manage split node groups by using taints, tolerations, and node selectors.

pawelprazak commented 5 years ago

@geota were you able to prevent running pods on those nodes (by adding tolerations and selectors) with PSP or an admission controller?

Normally, any user can potentially add the proper tolerations and/or selectors and run on any node pool.

rtkgjacobs commented 5 years ago

Hi @rtkgjacobs, sounds positive. Last I looked at kiam it expected the server component to run on master nodes, or at least not on a node with the kiam agent running. How do you handle that with EKS?

I managed to build a configuration that autoscales from our dev instances to prod. You can get both the kiam agent and server to run on a single EKS worker node (since AWS does not let you put anything on the control plane masters) by setting both to use 'host' networking. There are considerations when doing this, YMMV, etc.

Here is an example kiam server pod configuration:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: kube-system
  name: kiam-server
spec:
  replicas: 1
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9620"
      labels:
        app: kiam
        role: server
    spec:
      # We need to use node host network to bypass kiam-agents iptables re-routing
      hostNetwork: true  # <-- key emphasis
      serviceAccountName: kiam-server
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/pki/ca-trust/extracted/pem/
        - name: tls
          secret:
            secretName: kiam-server-tls
      containers:
        - name: kiam
          image: quay.io/uswitch/kiam:v3.0-rc1
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - server
            - --json-log
            - --level=info
            - --bind=0.0.0.0:443
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt            
            - --role-base-arn-autodetect
            - --sync=1m
            - --prometheus-listen-addr=0.0.0.0:9620
            - --prometheus-sync-interval=5s
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
          livenessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=5s
              - --timeout=5s
            initialDelaySeconds: 10
            periodSeconds: 10
            timeoutSeconds: 10
          readinessProbe:
            exec:
              command:
              - /kiam
              - health
              - --cert=/etc/kiam/tls/tls.crt
              - --key=/etc/kiam/tls/tls.key
              - --ca=/etc/kiam/tls/ca.crt
              - --server-address=localhost:443
              - --gateway-timeout-creation=5s
              - --timeout=5s
            initialDelaySeconds: 3
            periodSeconds: 10
            timeoutSeconds: 10

And here is an example client pod config (it can also run on the same EKS worker node):

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kiam-agent        
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9620"
      labels:
        app: kiam
        role: agent
    spec:
      # We need to use node host network so this container can manipulate the host EC2 nodes iptables and intercept the meta-api calls
      hostNetwork: true  # <-- key emphasis
      dnsPolicy: ClusterFirstWithHostNet
      volumes:
        - name: ssl-certs
          hostPath:
            path: /etc/pki/ca-trust/extracted/pem/
        - name: tls
          secret:
            secretName: kiam-server-tls
        - name: xtables
          hostPath:
            path: /run/xtables.lock
            type: FileOrCreate
      containers:
        - name: kiam
          securityContext:
            capabilities:
              add: ["NET_ADMIN"]      < -- important so  it can interact with iptables of host
          image: quay.io/uswitch/kiam:v3.0-rc1
          imagePullPolicy: Always
          command:
            - /kiam
          args:
            - agent
            - --iptables
            - --host-interface=!eth15   # https://github.com/uswitch/kiam/pull/112/files#r238483446
            - --json-log
            - --level=info
            - --port=8181
            - --cert=/etc/kiam/tls/tls.crt
            - --key=/etc/kiam/tls/tls.key
            - --ca=/etc/kiam/tls/ca.crt
            - --server-address=kiam-server:443
            - --prometheus-listen-addr=0.0.0.0:9620
            - --prometheus-sync-interval=5s
            - --gateway-timeout-creation=1s
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          volumeMounts:
            - mountPath: /etc/ssl/certs
              name: ssl-certs
            - mountPath: /etc/kiam/tls
              name: tls
            - mountPath: /var/run/xtables.lock
              name: xtables
          livenessProbe:
            httpGet:
              path: /ping
              port: 8181
            initialDelaySeconds: 3
            periodSeconds: 3

Unless BOTH the server and the agent are set to use 'host' networking, you can't expect them to collapse onto a single node.

Hope this helps. For us, we wanted a design pattern that can deploy an EKS cluster with a single worker node in the ASG, and then as devs load more pods, the K8s autoscaler brings up more nodes via the AWS ASG - costs start low and can fan out automatically. That holds until AWS provides their native solution and we can ideally sunset kiam.

mustafaakin commented 5 years ago

We like to manage worker groups with subnet, security group, and IAM segregation, keep a set of nodes for running privileged stuff (like kiam) or things that must work uninterrupted (like prometheus), and not let anyone submit YAMLs to Kubernetes directly, only via peer-reviewed automation.

aavileli commented 5 years ago

@rtkgjacobs It's not a good idea to run both the agent and server on the same node. Also, wouldn't you be giving all nodes the ability to assume the server IAM role, which defeats the purpose of securing the pods?

aavileli commented 5 years ago

@adaniline-traderev can you explain how you run kiam under EKS?

chrissnell commented 5 years ago

Can we take the kiam discussion somewhere else please? I want to keep watching this issue to track AWS's progress. I don't care about third-party efforts.

senyan commented 5 years ago

What is the estimated time of getting this released? Seems this is a major need as people are running more and more diverse and complicated workflows on Kubernetes.

awsiv commented 5 years ago

waiting for this eagerly :)