aws / containers-roadmap

This is the public roadmap for AWS container services (ECS, ECR, Fargate, and EKS).
https://aws.amazon.com/about-aws/whats-new/containers/

[EKS] [request]: Manage IAM identity cluster access with EKS API #185

Closed ayosec closed 9 months ago

ayosec commented 5 years ago

Tell us about your request

CloudFormation resources to register IAM roles in the aws-auth ConfigMap.

Which service(s) is this request for? EKS

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

A Kubernetes cluster managed by EKS is able to authenticate users with IAM roles. This is very useful to grant access to Lambda functions. However, as described in the documentation, every IAM role has to be registered manually in a ConfigMap with the name aws-auth.

For every IAM role we add to the CloudFormation stack, we have to add an entry like this:

mapRoles: |
  - rolearn: "arn:aws:iam::11223344:role/stack-FooBarFunction-AABBCCDD"
    username: lambdafoo
    groups:
      - system:masters
  - ...

This process is a bit tedious, and it is hard to automate.

It would be much better if those IAM roles could be registered directly in the CloudFormation template. For example, with something like this:

LambdaKubeUser:
  Type: AWS::EKS::MapRoles::Entry
  Properties:
    Cluster: !Ref EKSCluster
    RoleArn: !GetAtt FunctionRole.Arn
    UserName: lambdafoo
    Groups:
      - system:masters

Thus, CloudFormation will add and remove entries in the ConfigMap as necessary, with no extra manual steps.

A companion resource, AWS::EKS::MapUsers::Entry, could be used to register IAM users in mapUsers, as sketched below.
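A sketch of what that companion resource might look like, mirroring the MapRoles proposal above (the resource type and its properties are part of this proposal, not an existing CloudFormation type; AdminUser is a placeholder):

AdminKubeUser:
  Type: AWS::EKS::MapUsers::Entry   # hypothetical, companion to the proposed MapRoles entry above
  Properties:
    Cluster: !Ref EKSCluster
    UserArn: !GetAtt AdminUser.Arn
    UserName: admin
    Groups:
      - system:masters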

With this addition, we could also automate the extra step of registering the worker nodes' IAM role when a new EKS cluster is created:

NodeInstanceKubeUser:
  Type: AWS::EKS::MapRoles::Entry
  Properties:
    Cluster: !Ref EKSCluster
    RoleArn: !GetAtt NodeInstanceRole.Arn
    UserName: system:node:{{EC2PrivateDNSName}}
    Groups:
      - system:bootstrappers
      - system:nodes
abelmokadem commented 5 years ago

@ayosec have you created something to automate this as of now? I'm running into this when setting up a cluster using CloudFormation. Do you mind sharing your current approach?

ayosec commented 5 years ago

have you created something to automate this as of now?

Unfortunately, no. I haven't found a reliable way to make it 100% automatic.

Do you mind sharing your current approach?

My current approach is to generate the ConfigMap from a template (a sketch follows the steps below):

  1. All relevant ARNs are available in the outputs of the stack.
  2. A Ruby script reads those outputs, and fills a template.
  3. Finally, the generated YAML is applied with kubectl apply -f -.
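As an illustration, such a template could look roughly like the following, assuming ERB-style placeholders filled by the Ruby script from the stack outputs (the output names are made up); the rendered YAML is what gets piped to kubectl apply -f -:

# aws-auth.yaml.erb -- placeholders are filled from the stack outputs before applying
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: "<%= outputs['FooBarFunctionRoleArn'] %>"
      username: lambdafoo
      groups:
        - system:masters
    - rolearn: "<%= outputs['NodeInstanceRoleArn'] %>"
      username: "system:node:{{EC2PrivateDNSName}}"
      groups:
        - system:bootstrappers
        - system:nodes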
dardanbekteshi commented 5 years ago

Adding this feature to CloudFormation would allow the same feature to be added to AWS CDK. This would greatly simplify the process of adding/removing nodes, for example.

willejs commented 5 years ago

I also thought about this. An API to manage the config map for aws-iam-authenticator is interesting, but I think it would be a bit clunky. I am using Terraform to create an EKS cluster, and this approach is a lot nicer: https://github.com/terraform-aws-modules/terraform-aws-eks/pull/355

inductor commented 4 years ago

I'd love this

schlomo commented 4 years ago

Anybody from AWS care to comment on this feature request?

mikestef9 commented 4 years ago

With the release of Managed Nodes with CloudFormation support, EKS now automatically handles updating aws-auth config map for joining nodes to a cluster.

Does this satisfy the initial use case here, or is there a separate ask to manage adding users to the aws-auth config map via CloudFormation?

inductor commented 4 years ago

@mikestef9 I think https://github.com/aws/containers-roadmap/issues/554 is a similar issue that shows why this kind of option is wanted.

ayosec commented 4 years ago

@mikestef9

Does this satisfy the initial use case here, or is there a separate ask to manage adding users to the aws-auth config map via CloudFormation?

My main use case is with Lambda functions.

The managed nodes feature is pretty cool, and very useful for new EKS clusters, but most of our modifications to the aws-auth ConfigMap are to add or remove roles for Lambda functions.

tnh commented 4 years ago

@mikestef9 It would be useful to allow other people/roles to run kubectl commands.

Right now we have a CI deploy role, but we want to allow other SAML-based users to use kubectl as well.

After cluster creation, we apply the following with kubectl:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: {{ ClusterAdminRoleArn }}
      username: system:node:{{ '{{EC2PrivateDNSName}}' }}
      groups:
        - system:bootstrappers
        - system:nodes
    - rolearn: arn:aws:iam::{{ AccountId }}:role/{{ AdminRoleName }}
      username: admin
      groups:
        - system:masters
    - rolearn: arn:aws:iam::{{ AccountId }}:role/{{ CIRoleName }}
      username: ci
      groups:
        - system:masters
    - rolearn: arn:aws:iam::{{ AccountId }}:role/{{ ViewRoleName }}
      username: view

But I'd much rather have this ConfigMap created by me during cluster creation.

nckturner commented 4 years ago

@mikestef9 Some relevant issues related to EKS users debugging authentication problems (https://github.com/kubernetes-sigs/aws-iam-authenticator/issues/174 and https://github.com/kubernetes-sigs/aws-iam-authenticator/issues/275) that imo are data points in favor of API and Cloudformation management of auth mappings (and configurable admin role: https://github.com/aws/containers-roadmap/issues/554).

nemo83 commented 4 years ago

This ^^. How can we get this implemented? Can anyone from AWS tell us whether they are going to support this at the CF template level, or is a workaround needed at the eksctl level?

inductor commented 4 years ago

@nemo83 The AWS team has tagged this issue as "researching", so that's the stage it is at.

nicorikken commented 4 years ago

I'm also looking into automating updates to this ConfigMap from CloudFormation. Doing so via Lambda seems doable.

My main concern with automation is race conditions on the contents of the ConfigMap when applying updates, since the content has to be parsed; a strategic merge is not possible. If the configuration were implemented in one or more CRDs (one per entry), it would be easier to apply a patch. In that case, existing efforts on Kubernetes support for CloudFormation, like kubernetes-resources-provider, could be reused.

Update: we gave up on writing a lambda to update the configmap. The code became too complex and fragile. We now template it separately.

Update 2: I had a concern about automatically updating the ConfigMap, in case it became corrupt and thereby prevented API access. With the current workings of AWS (1 Sept 2020), there is a way to recover from aws-auth ConfigMap corruption:

aws-auth configmap recovery (tested 1 sept 2020)

The prerequisite is to have a pod running in the cluster with a serviceaccount that can update the aws-auth ConfigMap, ideally something you can interact with, like the Kubernetes dashboard or, in our case, ArgoCD.

Then, if aws-auth becomes corrupt, you can hopefully still update the ConfigMap that way.

If that is not the case because the nodes have lost their access, you can use an EKS-managed Node Group to restore node access to the Kubernetes API: create an EKS-managed Node Group of just one node with the role that is also used by your cluster nodes. (Note: this is not recommended by AWS, but we abuse AWS's power to update the ConfigMap on the managed master nodes.)
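A minimal CloudFormation sketch of such a single-node recovery node group, assuming the cluster, the existing node role, and a subnet are available in the template (resource and parameter names are illustrative):

RecoveryNodeGroup:
  Type: AWS::EKS::Nodegroup
  Properties:
    ClusterName: !Ref EKSCluster
    NodeRole: !GetAtt NodeInstanceRole.Arn   # same role your existing worker nodes use
    Subnets:
      - !Ref PrivateSubnet1
    ScalingConfig:
      MinSize: 1
      DesiredSize: 1
      MaxSize: 1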

AWS will now add this role to the aws-auth configmap:

apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    ...
    # this entry is added by AWS automatically
    - rolearn: < NodeInstanceRole ARN >
      username: system:node:{{EC2PrivateDNSName}}
      groups:
      - system:bootstrappers
      - system:nodes

Deleting that Node Group will remove that entry again (AWS warns you about this), so the serviceaccount access is required to ensure another method of cluster access, e.g. via the kubectl CLI. Update the aws-auth ConfigMap to restore that method of access, then remove the Node Group, which in turn removes the aws-auth entry that was created automatically earlier. Now the persistent access method (e.g. the kubectl CLI) can be used to permanently fix the ConfigMap and ensure the nodes have access.

Note: if a service is automatically but incorrectly updating the ConfigMap, it would be harder, if not impossible, to recover. ⚠

hellupline commented 3 years ago

I would go an extra mile and ask AWS to create an API to manage aws-auth, with an associated IAM action.

In case I delete the IAM role/user associated with cluster creation (detail: this user/role is not visible afterwards; you have to save this info outside the cluster, or tag the cluster with it)

and I don't add another admin to the cluster, I am now locked out of the cluster.

For me, this is a major issue, because I use federated auth: users (including my day-to-day account) are ephemeral... my user can be recreated without warning with another name/ID.

The idea is: can AWS add an IAM action like ESHttpGet/ESHttpPost? (The example is from Elasticsearch, because it is also third-party software.)

mikestef9 commented 3 years ago

Hey @hellupline

We are actually working on exactly that right now, an EKS API to manage IAM users and their permissions to an EKS cluster. This will allow you to manage IAM users via IaC tools like CloudFormation

yanivpaz commented 3 years ago

@mikestef9 How is it going to be different compared to https://github.com/aws-quickstart/quickstart-amazon-eks/blob/main/templates/amazon-eks-controlplane.template.yaml#L109 ?

markussiebert commented 3 years ago

I wonder why this isn't possible with EKS clusters (but is with self-hosted k8s clusters on AWS):

https://github.com/kubernetes-sigs/aws-iam-authenticator#crd-alpha

Even looking at the CDK implementation of auth mapping, it would be simple to get rid of some limitations that exist right now (stack barrier, imported clusters, ...).

So if something like CF support for auth mapping is implemented (I support this), it would be good if it doesn't conflict with the CRDs that I hope are coming to EKS soon.

gp42 commented 3 years ago

Hey @hellupline

We are actually working on exactly that right now, an EKS API to manage IAM users and their permissions to an EKS cluster. This will allow you to manage IAM users via IaC tools like CloudFormation

Any news on this issue?

vennemp commented 3 years ago

For those wanting to know, this is currently supported via the AWS CDK: it creates a custom resource which invokes a Lambda function and updates the config map. A little hacky, and I'm glad I didn't have to write that code, but it works like a charm!

dany74q commented 3 years ago

@vennemp - Cool! From what I gather, though, that's only possible if it makes sense to assign said Lambda the cluster-creation role; this might be problematic if the clusters were created by a 'plain' user, or by a highly privileged role that would be risky to assign.

I think that the 'lock-in' to the cluster creation role is problematic - especially if clusters are created in a non-centralized fashion; it would be awesome to have the ability to control permissions in AWS-alone - along w/ predefined roles one could grant (a-la Kubernetes Viewer / Editor / Owner)

This would also give one the ability to grant said roles to third parties for cluster-auditing, introspection, or just as an auth mechanism - that would work cluster-wide and not require a per-cluster tweak.

vennemp commented 3 years ago

@dany74q Yes, that is what it does, and I agree wholeheartedly. It's definitely a pain. Adding the ability to update the ConfigMap at launch to include other principals would be the best way. One thing I've done is edit the trust relationship of the role used to create the cluster to trust another principal, like EC2, an AWS account, or another role (you can have more than one!). That way I can use the --role-arn option of the aws eks update-kubeconfig command and just assume that original role.
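A rough sketch of that trust-relationship tweak in CloudFormation terms (the account ID, role names, and cluster name are placeholders; the extra principal could equally be an EC2 service principal or a whole account):

ClusterCreatorRole:
  Type: AWS::IAM::Role
  Properties:
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            AWS: arn:aws:iam::111122223333:role/ci-deploy   # additional principal allowed to assume the creator role
          Action: sts:AssumeRole
# afterwards: aws eks update-kubeconfig --name my-cluster --role-arn <ClusterCreatorRole ARN>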

mikestef9 commented 3 years ago

@dany74q

"it would be awesome to have the ability to control permissions in AWS-alone - along w/ predefined roles one could grant (a-la Kubernetes Viewer / Editor / Owner)"

This is exactly the API we are building that I referenced in a previous comment. We'll have a handful of pre-baked polices (Admin, Editor, Viewer) that you can attach through the API at a cluster wide or specific namespace level, without ever having to add anything in RBAC. You can still of course use RBAC if you need to define more fine-grained permissions per IAM entity.
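Purely as an illustration of the described model, attaching a pre-baked, namespace-scoped "Viewer" policy might eventually look something like the following in CloudFormation; the resource type, property names, and policy ARN here are assumptions for the sake of the sketch, not an announced interface:

DevTeamViewerAccess:
  Type: AWS::EKS::AccessEntry              # hypothetical resource type
  Properties:
    ClusterName: !Ref EKSCluster
    PrincipalArn: !GetAtt DevTeamRole.Arn  # IAM role being granted access
    AccessPolicies:
      - PolicyArn: arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy   # illustrative pre-baked "Viewer" policy
        AccessScope:
          Type: namespace
          Namespaces:
            - dev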

tsndqst commented 3 years ago

@mikestef9 is there a separate issue in the roadmap for the API you mentioned or should we follow this issue for updates on that API?

mikestef9 commented 3 years ago

Let's use this issue; I've updated the title.

manuelcoppotelli commented 3 years ago

The aws-iam-authenticator has implemented the IAMIdentityMappings CRD.

It may be a good idea to follow along with this solution and implement the same!
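For reference, an identity mapping expressed with that CRD looks roughly like this (shape taken from the aws-iam-authenticator docs; the ARN is a placeholder):

apiVersion: iamauthenticator.k8s.aws/v1alpha1
kind: IAMIdentityMapping
metadata:
  name: kubernetes-admin
spec:
  # IAM role (or user) to map into the cluster
  arn: arn:aws:iam::111122223333:role/KubernetesAdmin
  username: kubernetes-admin
  groups:
    - system:masters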

gunzy83 commented 3 years ago

Going to drop my $0.02 here and share the use case. We are rebuilding our IAM across all of our accounts with AWS SSO as the entry point to our accounts (console and CLI) so we don't have credentials to distribute anymore. As a part of that, we are trying to follow recommendations for everything we do (least privilege, etc) and Cloudformation service roles are a part of that.

I was talking with our SA contact from AWS about how the role that creates the cluster is the only principal that can update the config map. If that is a role we pass to CloudFormation when automatically deploying our entire stack, we don't want to allow our CI nodes to assume that role, update kubeconfig, and be able to do everything system:masters can do (mind you, they can now; we want to do better).

Our SA pointed me to https://github.com/aws-samples/eks-aws-auth-configmap as a starting point to create a custom resource and allow the Lambda to assume the CloudFormation service role. This is better but still not ideal, as the service role would have permission to create roles (with a boundary policy), access the KMS key created in CloudFormation, and more; quite a bit more permission than required.

I really hope this will be implemented in the API and eventually added to Cloudformation.

adammw commented 3 years ago

@mikestef9 is there further work ongoing on a different API for managing IAM identities, or is the IAMIdentityMappings CRD support essentially delivering this API change and there will just be other UI changes to ease the management of these resources?

The reason I ask is that eksctl work on IAM identity mapping features is stalled until the new API design is announced: https://github.com/weaveworks/eksctl/issues/874

mikestef9 commented 3 years ago

It's a totally separate EKS API - not related to the IAMIdentityMappings CRD

dany74q commented 2 years ago

Hey @mikestef9! Wondering if there's an update on this: it's one of the most exciting features to be implemented in EKS, for our use cases at least. Appreciating the complexity of implementing it at scale, a rough estimate (months/quarters away) would be amazing.

Thanks a ton !

gp42 commented 2 years ago

Here is my take on the challenge of managing the auth ConfigMap. It's not exactly what is requested here, but it may be helpful for some use cases. The logic is kind of the reverse of what is suggested here: the aws-auth-operator synchronises IAM groups to the aws-auth ConfigMap.

You can find it here: https://github.com/gp42/aws-auth-operator

matthewcummings commented 2 years ago

@mikestef9 any updates on this?

dukeluke16 commented 2 years ago

As part of an upcoming audit, we require the ability to identify the IAM role that is associated with the cluster_creator. Whatever solution EKS provides for managing IAM identity cluster access via the EKS API needs to explicitly support the cluster_creator.

lucasff commented 2 years ago

I got bitten by this. It took me too long to figure out I needed to access the cluster via a completely different user, even though I have * access on AWS. It doesn't make sense for a managed service.

savagete2860 commented 2 years ago

I have recently started to use the AWS Quick Start EKS CloudFormation registry module. Once the third-party registry is enabled, the AWSQS::EKS::Cluster resource type allows management of the aws-auth config map.

In addition, you can use it to add the role ARNs of bootstrap Lambdas or whatever you like. It works well, but it would be nice if this were a native CloudFormation feature.
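A rough sketch of how that resource type is used for this; the KubernetesApiAccess property and its field names below are recalled from the Quick Start schema and should be treated as assumptions, and the role/subnet references are placeholders:

EKSCluster:
  Type: AWSQS::EKS::Cluster
  Properties:
    RoleArn: !GetAtt ClusterServiceRole.Arn
    ResourcesVpcConfig:
      SubnetIds:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
    KubernetesApiAccess:          # assumed property name
      Roles:
        - Arn: !GetAtt BootstrapLambdaRole.Arn
          Username: bootstrap-lambda
          Groups:
            - system:masters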

raizyr commented 2 years ago

This ABSOLUTELY needs to be part of the EKS APIs. The idea that both AWS and the customer are trying to manage a single ConfigMap in a cluster is ridiculous and completely breaks any concept of a source of truth for what that ConfigMap should be.

jagu-sayan commented 2 years ago

Given the number of thumbs up, I don't understand why it's not a priority. And it would simplify the AWS CDK for EKS implementation (which uses a custom resource for now)!

pavlospt commented 2 years ago

I believe it would be quite good timing for an official answer from AWS, after almost 3 years 😛

vennemp commented 2 years ago

@pavlospt AWS: "It's on the 'road-map'"

Roberdvs commented 2 years ago

This is becoming even more relevant now that v18+ versions of the terraform-aws-eks module have removed support for managing the aws-auth ConfigMap.

So it would be nice to do it via IaC in one go, using the EKS API, without having to set up the Kubernetes Terraform provider or any other workaround just to handle that one resource.

archoversight commented 2 years ago

Having it be an AWS resource would mean we could use IAM privileges to control access to the resource, and it would avoid the issue we have now where the IAM user/role that created the cluster has god-mode privileges in the cluster, but if that user/role ever goes away and you mess up the auth map, you lose access to your EKS cluster.

dudicoco commented 2 years ago

Having it be an AWS resource would mean we could use IAM privileges to control access to the resource, and it would avoid the issue we have now where the IAM user/role that created the cluster has god-mode privileges in the cluster, but if that user/role ever goes away and you mess up the auth map, you lose access to your EKS cluster.

You can actually just recreate the user/role in order to regain access to the EKS cluster.

Phylu commented 2 years ago

We are creating our EKS clusters using the terraform-aws-eks module. In order to properly set up credentials to access the Kubernetes cluster, we need to go through a series of manual steps.

This workflow is, unfortunately, not really feasible in an automated way when creating a new EKS cluster. It would be really helpful if we could manage the access from the AWS side directly.

matthewcummings commented 2 years ago

@Phylu as @archoversight mentioned, that module no longer supports managing the aws-auth configmap.

This issue is ridiculous, FWIW. You typically use at least two toolsets with EKS: 1) Terraform, CloudFormation, Pulumi, etc. to manage the cluster-level resources and resources outside of the cluster, and 2) Helm, Kustomize, etc. to manage resources in the cluster.

This problem lands right in the intersection of those two sets of tools. Cluster access should really just be done seamlessly (to the user) via IAM and the AWS API. It's also crazy that the person who creates the cluster will ALWAYS have admin-level access to the cluster.

I'm not going to go "full conspiracy" here but this kind of problem makes it clear that people at AWS do NOT use EKS anywhere internally. Which is fine, but if one or two AWS teams had to deal with these problems then they would likely see how awful this is and do something about it.

Nuru commented 2 years ago

@Phylu wrote:

We are creating our EKS clusters using the terraform-aws-eks module.
... This workflow is – unfortunately – not really feasible in an automated way when creating a new EKS cluster. It would be really helpful, if we can manage the access from the AWS side directly.

@Phylu You can use Cloud Posse's terraform-aws-eks-cluster module instead. If you can run your Terraform in an environment where the official aws CLI is installed, then Cloud Posse's module will take care of the auth-map reasonably well. (It works without the aws CLI as well, but not as reliably.) Note that the Cloud Posse module still has edge cases where it requires manual intervention (for example when the initial destroy only partly succeeds, a subsequent destroy might fail due to failing authentication), and the better solution is to implement the API as requested in this issue. The Cloud Posse solution is nevertheless IMHO the best option available at this time for a fully automated solution.

@matthewcummings wrote:

I'm not going to go "full conspiracy" here but this kind of problem makes it clear that people at AWS do NOT use EKS anywhere internally. Which is fine, but if one or two AWS teams had to deal with these problems then they would likely see how awful this is and do something about it.

The AWS QuickStart team did something about this. They wrote a Lambda and implemented a custom CloudFormation provider: https://github.com/aws-quickstart/quickstart-amazon-eks-cluster-resource-provider#awsqsekscluster

bryantbiggs commented 2 years ago

We are creating our EKS clusters using the terraform-aws-eks module. ... This workflow is – unfortunately – not really feasible in an automated way when creating a new EKS cluster. It would be really helpful, if we can manage the access from the AWS side directly.

@Phylu You can use Cloud Posse's terraform-aws-eks-cluster module instead. ... The Cloud Posse solution is nevertheless IMHO the best option available at this time for a fully automated solution.

@aidanmelen has provided a module for managing the aws-auth configmap that works quite well to fill this gap until a native solution is provided https://github.com/aidanmelen/terraform-aws-eks-auth

mfin commented 2 years ago

Yes. Another thumb up for @aidanmelen's module. Migrated two clusters to using it. A nifty approach with Jobs, kudos!

matthewcummings commented 2 years ago

To be clear, this functionality should be native to the AWS API. Custom Terraform modules (that leverage the K8s provider) and custom CFN resources are all nice and good, but this should be an API-level operation. Anything short of that is a compromise.

I've been using AWS for years, ECS (Fargate and EC2), Lambda, EKS and of course tons of other services, but this is a "leaky abstraction" at best. Bonafide API support makes it possible to do this in CloudFormation and Terraform (with just the AWS provider) and Pulumi and everything else. I'm not heartless but I also expect more from AWS on this front.

aidanmelen commented 2 years ago

To be clear, this functionality should be native to the AWS API. Custom Terraform modules (that leverage the K8s provider) and custom CFN resources are all nice and good, but this should be an API-level operation. Anything short of that is a compromise.

I've been using AWS for years, ECS (Fargate and EC2), Lambda, EKS and of course tons of other services but this is a "leaky abstraction" at best. There should be API level support for assigning cluster IAM permissions. Bonafide API support makes it possible to do this in CloudFormation and Terraform (with just the AWS provider) and Pulumi and everything else. I'm not heartless but I also expect a lot from AWS.

Yes. AWS needs to make an API for this.

andre-lx commented 2 years ago

Hey guys.

I'm searching everywhere and can't find the answer: right now, how can we accomplish this?

So, I'm creating a private cluster through CloudFormation and want to give access to a role used by an ECS task. The security groups are configured, and the ECS task can now reach the cluster over the network, but I'm receiving "error: You must be logged in to the server (Unauthorized)", as expected.

Is there any chance of adding this role to the "aws-auth" ConfigMap in order to make this connection possible?

Thanks