aws-controllers-k8s / community

AWS Controllers for Kubernetes (ACK) is a project enabling you to manage AWS services from Kubernetes
https://aws-controllers-k8s.github.io/community/
Apache License 2.0
2.39k stars 253 forks source link

IAM service controller #222

Closed max-lobur closed 1 year ago

max-lobur commented 4 years ago

Is your feature request related to a problem? With IAM Roles deploys we could implement https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html with CRDs and close the git-ops cycle of all the pods we run in EKS.

This is the most frequently deployed AWS resource. Everything else is just creating resources on-demand (like dynamo db table) using the IAM role or uses shared resources (like bucket subdir) using the IAM role.

Describe the solution you'd like A CRD will generate an IAM role from a template An IAM role ARN is returned back to a serviceaccount annotation RESULT: A Pod is deployed with its own IAM role binding fully thru helm chart

Describe alternatives you've considered Currently creating an IAM role with terraform and passing ARN manually to a helm vars.

jaypipes commented 4 years ago

I know Bob Wise is keen to see this happen, too. :)

The issue we have foreseen with having an ACK service controller for IAM is that the service controller's IAM role would essentially need to be a superuser/root to be able to create IAM roles and permission sets. We have designed ACK with a separate binary (Pod) for each individual AWS service API because we want to encourage the use of small-scoped IAM roles for Pods instead of requiring an IAM role that has either a very broad scope (read/write across lots of services) and/or very privileged (and creating IAM entities is a very privileged operation).

All that said, we hear you :) After I get the first 6 service controllers out the door (S3, ECR, SNS, SQS, DynamoDB, and API Gateway v2), I'll put some effort into a design doc that outlines a potential IAM service controller for ACK.

jbilliau-rcd commented 4 years ago

I concur; IAM roles are the only thing we need besides ECR (which is already supported) before we can deprecate our Terraform module that does these things and just bake this all into our Helm chart. Right now, developers have to bounce between TF and Helm, and we are trying to keep them in Helm. They are such a basic foundational thing I'm honestly surprised they aren't in the initial preview.

joberdick commented 4 years ago

I do get the concerns @jaypipes. Let us know if you need some feedback on controls.

mksh commented 4 years ago

The issue we have foreseen with having an ACK service controller for IAM is that the service controller's IAM role would essentially need to be a superuser/root to be able to create IAM roles and permission sets. We have designed ACK with a separate binary (Pod) for each individual AWS service API because we want to encourage the use of small-scoped IAM roles for Pods instead of requiring an IAM role that has either a very broad scope (read/write across lots of services) and/or very privileged (and creating IAM entities is a very privileged operation).

Actually, ACK does not need to be a system-wide supervisor/root, its access might be restricted to specific IAM resources only. One may use resource ARN prefix or Condition in order to separate permission grant for ACK-controlled IAM roles from the rest of IAM. For example, a permission grant statement for IAM controller might be:

    {
      "Effect": "Allow",
      "Action": [
         "iam:AttachRolePolicy",
         "iam:CreateRole",
         "iam:UpdateRole",
         "iam:DeleteRole",
         "iam:CreatePolicy",
         "iam:CreatePolicyVersion"
       ],
      "Resource": [
              "arn:aws:iam::account-id:role/k8s-controller-role-prefix-*",
              "arn:aws:iam::account-id:policy/k8s-controller-role-prefix-*"
        ],
    },

, so that all IAM Roles and Policies, which names are not starting with k8s-controller-role-prefix-, would not be controlled or allowed to control by ACK.

jaypipes commented 4 years ago

Great suggestions @mksh

danopia commented 4 years ago

Another measure to manage IAM safely would be the iam:PermissionsBoundary Condition Key. The administrator could create a service-focused boundary Role (allowing e.g. "Action": ["s3:*", "sns:*", "sqs:*"]) and enforce that the controller can only create roles when that boundary is included:

        {
            "Effect": "Allow",
            "Action": [
                "iam:DetachRolePolicy",
                "iam:CreateRole",
                "iam:AttachRolePolicy"
            ],
            "Resource": "arn:aws:iam::<acctid>:role/k8s-controller-role-prefix-*",
            "Condition": {
                "StringEquals": {
                    "iam:PermissionsBoundary": "arn:aws:iam::<acctid>:policy/k8s-iam-role-boundary"
                }
            }
        },

(based on this post)

Together with the rest of the example from @mksh you'll be able to lock down the allowed range of motion pretty tightly. Then it's a matter of figuring out how to document the install process (and the risks of winging the install) 😄

max-lobur commented 4 years ago

I wish there was a formal "Activation" feature in IAM. E.g. new/edited role is not active until manually activated by someone with permissions (separate iam:ActivateRole action). Kind of CFNs deletion protection but usage protection.

I know there's gates to get CRD into the cluster, ACK's own iam limits (set by us), but want something on AWS side too.

mjallday commented 4 years ago

@max-lobur if we added a default condition to the policy you could simulate this.

            "Condition": {
                "StringEquals": {"ec2:ResourceTag/Reviewed": "true"}
            }

i'm sure there are better ways than i've outlined above but hopefully the gist of what i'm suggesting is clear.

max-lobur commented 4 years ago

I know, I could write a lambda hook that adds this to everything new

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": [
        "*"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

but since it's not a standalone iam: action you can't prevent a compromised ACK from resetting it. Like there's no way to write a policy that lets create/edit other policies but doesn't let to delete these exact lines.

The only way is to have some standalone action in iam: space, not allowed to ACK but allowed to a human admin.

max-lobur commented 4 years ago

@jaypipes If ACK used a cloudformation underneath, it could leverage ChangeSets feature for an admin approval: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-changesets.html

Then ACK would have permissions to CreateChangeSet and admin would have ExecuteChangeSet: https://docs.aws.amazon.com/IAM/latest/UserGuide/list_awscloudformation.html

An SNS notification can be used to notify about incoming production changes reviews: https://docs.amazonaws.cn/en_us/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-changesets-view.html

This would cover not only IAM but all other resources. Changesets are immutable so there's no space for a race here. I'd feel safer to give broader permissions to ACK in this case.

edify42 commented 4 years ago

Interesting idea. Another way to secure it could be to have a complimentary admissions controller which validated the types of policies that could be written.

Bit out of scope though for this project

jaypipes commented 4 years ago

@jaypipes If ACK used a cloudformation underneath, it could leverage ChangeSets feature for an admin approval: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-changesets.html

Then ACK would have permissions to CreateChangeSet and admin would have ExecuteChangeSet: https://docs.aws.amazon.com/IAM/latest/UserGuide/list_awscloudformation.html

An SNS notification can be used to notify about incoming production changes reviews: https://docs.amazonaws.cn/en_us/AWSCloudFormation/latest/UserGuide/using-cfn-updating-stacks-changesets-view.html

This would cover not only IAM but all other resources. Changesets are immutable so there's no space for a race here. I'd feel safer to give broader permissions to ACK in this case.

Hi again @max-lobur, sorry for the delayed response!

We decided to not use CloudFormation behind the scenes primarily for two reasons:

1) We thought it was a surprising user experience. If a Kubernetes user created an S3 Bucket manifest and relied on ACK to create that S3 Bucket, we thought it was a surprising experience for that user to learn that it wasn't that an S3 Bucket was created, but rather a CloudFormationStack was created that itself created an S3 Bucket. Add to that some oddities about how some CloudFormation resources can or cannot be "detached" from a CloudFormationStack.

2) Kubernetes custom controllers must always consider Kubernetes to be the desired state of a resource. Otherwise, bad things happen with regards to different systems constantly competing to be the "source of truth" for desired state. CloudFormation considers its Stack definition as the source of truth for a resource's desired state (and indeed, CloudFormation considers the Stack itself as the primary resource for the purposes of state tracking, which is why to tell what the state of a resource is in CloudFormation, you are constantly polling for the Stack to reach CREATE_COMPLETE state).

ajcann commented 4 years ago

Any timeline estimates on IAM provisioning via ACK? This would be an incredibly helpful feature.

jaypipes commented 4 years ago

Any timeline estimates on IAM provisioning via ACK? This would be an incredibly helpful feature.

Hi! No, at this time I have not thought about a timeline for IAM provisioning via an ACK service controller. If you haven't already, please feel free to +1 this issue.

Even if we agreed to follow some of the suggestions from @mksh (which are great, btw), we would need to get an IAM service controller for ACK onto the release roadmap and then determine a likely milestone for an IAM controller to go into dev preview (and then target a GA release of the controller 3 months after we release in dev preview). Earliest I see being able to get an IAM controller for ACK into dev preview would probably be end of 2020, given the backlog of services we're currently slated to deliver these next four months.

Of course, squeaky wheel gets the grease, so no worries squeaking about this feature request AFAIC :)

negz commented 4 years ago

We've found that a lot of folks leverage Crossplane's support for IAM roles (and users, attachments, etc) in provider-aws. They're fairly critical when it comes to spinning up something like a functioning EKS cluster with all of its dependencies. We're hopeful that eventually Crossplane's provider-aws will be completely generated by the ACK codegen pipeline, and part of that would mean generating types and controllers for all AWS resources - including IAM.

ajcann commented 4 years ago

Thanks for the tip @negz

@jaypipes Thank you for the update. One thing I'm having trouble understanding is how folks are managing ACK created resources given IAM policies and roles are such an integral part of a functioning infrastructure. For example, If I create an S3 bucket as an object store for some application, the pod running that application presumably needs either a direct ARN annotation or IRSA with a role in order to access that S3 bucket, no? If I understand this correctly, it seems like IAM management would be fairly fundamental to all other managed and consumed AWS resources within a kubernetes cluster.

max-lobur commented 3 years ago

Found one implementation: https://github.com/redradrat/aws-iam-operator

TBBle commented 3 years ago

Being able to manage IRSA IAM Roles and ServiceAccounts through ACK would be my first-used feature.

That would let us move away from either manual (or scripted, e.g. Terraform) overlapping work in both IAM and k8s, or relying on eksctl. eksctl is what I currently use, but as well as requiring wider permissions for users than might be desired, it consumes a CloudFormation stack with each role created, which can soon exhaust your CF Stack limits when you are being very specific with your permissions, e.g. lots of S3 buckets, and hence require many IRSA-enabled ServiceAccounts.

The other use-case this enables would be that things like Cluster Autoscaler or ExternalDNS could have the appropriate CRD resource in their Helm chart, so that deploying CA with Helm could automatically create an IRSA-enabled ServiceAccount with the precisely-necessary permissions, moving that list of AWS permissions from the documentation to the code, and saving a bunch of research and typing in eksctl. I estimate 80% of my eksctl config by lines, and more than that by effort, was in writing the IRSA sections for the various applications that need to interact with AWS systems, like CA, ExternalDNS, AWS-CNI, AWS-CSI-EBS, etc, and after that I needed to copy that ServiceAccount name into the distinct Helm values for each application.

With IRSA support from an IAM ACK Controller, I could just specify in eksctl the IRSA details for IAM ACK Controller's Service Account itself (and once eksctl supports 'addons' for IRSA, this would be about three lines in the eksctl config), and let the ACK installation specify the IRSA details for the other controllers, and hopefully start adding appropriate resources to the Helm charts of other projects that require direct AWS API actions via a ServiceAccount.


It would also be nice if something like the ECR Repository resources could also take an optional ReadWriteServiceAccountName and ReadOnlyServiceAccountName and create appropriate ServiceAccounts in the same namespace, and IRSA Roles which specify a Resource already precisely restricted to the ECR Repository created by the ECR ACK Controller, presumably by creating resources for the IAM ACK Controller this ticket is about.

This would again combine nicely with something like ExternalDNS + a future Route53 ACK Controller to keep the entire k8s-managed zone defined in k8s, needing only a DNS delegation (NS Record) from the parent zone (Route53 or otherwise) to function.


This simple UX might not be flexible enough for systems with more-interesting permission setups, so perhaps a SASpec array which allows providing a template for Service Accounts, which can have the resources in the policy filled-in by the ACK Controller when creating the appropriate CRD, ala PodSpec in a Deployment.

The more-flexible SASpec approach is also more amenable to code-gen from the API than having specific fields like ReadOnlyServiceAccountName and ReadWriteServiceAccountName. I understand that the ACK Controller implementations are generated from API, rather than hand-written, where possible.


As far as CRD UX goes, I imagine being able to deploy a CRD resource to create an IRSA ServiceAccount with a definition as simple as eksctl's iam.serviceAccounts list elements. Being able to specify either ARNs or the actual policy document should be sufficiently flexible to meet all user's needs.

gokulpch commented 3 years ago

This is an other implementation: https://github.com/keikoproj/iam-manager This seems taking care of many other use-cases such as IRSA etc.

TBBle commented 3 years ago

With a brief look at the docs, that does seem like it generally fills the use-cases described here.

I think their security model is similar to that suggested by @danopia at https://github.com/aws-controllers-k8s/community/issues/222#issuecomment-677757650, and indeed hits the mentioned concern that installation isn't a simple Helm chart, but a shell script which slurps in admin permissions for both k8s and an AWS account in order to create the necessary PermissionsBoundary in CloudFormation, with a few "If you get this wrong, things will go weird" notes scattered through the docs.

I hope that implementation of this feature in ACK doesn't lead to an ecosystem split like IRSA/kiam, where projects that want to use this feature, e.g. including an IAM permissions template in a Helm chart, have to support two different mechanisms. If it's two different kinds of custom resource object, that's not too bad, and not really avoidable.

Jacob-Morgan commented 3 years ago

Putting this out there to speed to the ACK controller creation....

I came across the ACK Controller Family while researching alternative solutions managing IRSA and although it's still only proposed but on the roadmap, I've built an experimental version (https://github.com/pharos/ack-iam-controller) to see how it compares. It's limited to Roles, Policies and PolicyAttachments (Managed or Inline), but does offer enough to work through some of the use cases. It's more expressive than other solutions as it's a direct reflection of AWS SDK but in a review we felt this was better as nothing was hidden. In our case we used the ACK Controllers IRSA to control the role naming policy and to ensure a permissions boundary was configured.

Example

apiVersion: iam.services.k8s.aws/v1alpha1
kind: Role
metadata:
  name: k8s-service-name
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::{{ aws_account_id }}:role/k8s-service-name
spec:
  assumeRolePolicyDocument: >
    <json policy statement>
  permissionsBoundary: arn:aws:iam::{{ aws_account_id }}:policy/k8s-permissions-boundary
  roleName: k8s-service-name
---
apiVersion: iam.services.k8s.aws/v1alpha1
kind: RolePolicyAttachment
metadata:
  name: k8s-managed-policy
spec:
  roleName: k8s-service-name
  policyARN: arn:aws:iam::{{ aws_account_id }}:policy/k8s-managed-policy
---
apiVersion: iam.services.k8s.aws/v1alpha1
kind: RolePolicy
metadata:
  name: k8s-service-name-inline-policy
spec:
  policyDocument: >
    <json policy statement>
  policyName: k8s-service-name-inline-policy
  roleName: k8s-service-name
jaypipes commented 3 years ago

@Jacob-Morgan that is awesome, but the link is a 404. Maybe it's a private GH repo? :)

I'd be quite eager to chat with you on Kubernetes Slack provider-aws channel about your adventures if you'd be up for it? I'm on vacation until next Tuesday but would love to chat next week!

Jacob-Morgan commented 3 years ago

@jaypipes I've switched the GH repo to public so you can review and sure ping me on the slack channel.

jaypipes commented 3 years ago

@jaypipes I've switched the GH repo to public so you can review and sure ping me on the slack channel.

Great start @Jacob-Morgan :) Just want you to know I haven't forgotten about you. We're in the process of cleaning up and automating our release processes, and that's been taking up a chunk of time. Thanks for your patience!

ack-bot commented 3 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with /close. Provide feedback via https://github.com/aws-controllers-k8s/community. /lifecycle stale

RedbackThomson commented 3 years ago

/lifecycle frozen

CarlosDomingues commented 2 years ago

That's a much needed component for folks trying to implement GitOps-ish, self-service pipelines for EKS.

Smana commented 2 years ago

Same here this would be helpful to define the IAM policies the same way we managed our apps. @Jacob-Morgan, I'm gonna try your controller, thanks for sharing.

jaypipes commented 2 years ago

@CarlosDomingues @Smana we hear you and we're working on the IAM controller. I'm hoping to have something to test out this year. Feel free to follow the progress on the iam-controller repo here: https://github.com/aws-controllers-k8s/iam-controller

doctorpangloss commented 2 years ago

Policies are consistently the most painful part of using AWS services this way. eksctl has bugs that are blocking how this is used. I think the security concerns are kind of abstract for a component like this, and it's better to just make something simple that works.

jaypipes commented 2 years ago

Policies are consistently the most painful part of using AWS services this way. eksctl has bugs that are blocking how this is used. I think the security concerns are kind of abstract for a component like this, and it's better to just make something simple that works.

@doctorpangloss I've published a preview version of the IAM controller, including support for Roles and Policies along with associating Policies with Roles.

See here: https://gallery.ecr.aws/aws-controllers-k8s/iam-controller

doctorpangloss commented 2 years ago

Policies are consistently the most painful part of using AWS services this way. eksctl has bugs that are blocking how this is used. I think the security concerns are kind of abstract for a component like this, and it's better to just make something simple that works.

@doctorpangloss I've published a preview version of the IAM controller, including support for Roles and Policies along with associating Policies with Roles.

See here: https://gallery.ecr.aws/aws-controllers-k8s/iam-controller

I would try this, I see that it meets my use case. But how do I create the role for this thing to work? We're so close to something great.

jaypipes commented 2 years ago

@doctorpangloss I've published a preview version of the IAM controller, including support for Roles and Policies along with associating Policies with Roles. See here: https://gallery.ecr.aws/aws-controllers-k8s/iam-controller

I would try this, I see that it meets my use case. But how do I create the role for this thing to work? We're so close to something great.

Ah, I see, you're referring to the "bootstrapping problem", right? In other words, a chicken-and-egg problem of needing an IAM Role already created for the Service Account of the IAM controller for ACK to run properly so that it can create other IAM Roles?

TBBle commented 2 years ago

See here: https://gallery.ecr.aws/aws-controllers-k8s/iam-controller

@jaypipes The links on that gallery page on the Usage tab are both wrong, they start with https://aws-controllers-k8s.github.io/community/user-docs/ but should start with https://aws-controllers-k8s.github.io/community/docs/user-docs/, i.e. there's a /docs missing in the path.

TBBle commented 2 years ago

@doctorpangloss https://github.com/aws-controllers-k8s/iam-controller/blob/v0.0.2/config/iam/recommended-inline-policy looks like the policy it needs.

You should be able to specify that for a service account named ack-iam-controller in eksctl in the namespace where you will install the controller, i.e.

iam:
  serviceAccounts:
  - metadata:
      name: ack-iam-controller
      namespace: <whatever namespace you're going to install the controller into>
    attachPolicy:
      Version: "2012-10-17"
      Statement:
      - Effect: Allow
        Action:
        - iam:GetRole
        - iam:CreateRole
        - iam:DeleteRole
        - iam:UpdateRole
        - iam:GetPolicy
        - iam:CreatePolicy
        - iam:DeletePolicy
        - iam:ListAttachedRolePolicies
        - iam:AttachRolePolicy
        - iam:DetachRolePolicy
        Resource: "*"

Then when installing with Helm, pass --set serviceAccount.create=false (or equivalent in your values.yaml) to Helm, so that it uses the existing account.

You can also use a different account name if so-desired.

Note: I haven't actually tested this directly, but I've done the same thing for other inline policies with IRSA in eksctl.

jaypipes commented 2 years ago

See here: https://gallery.ecr.aws/aws-controllers-k8s/iam-controller

@jaypipes The links on that gallery page on the Usage tab are both wrong, they start with https://aws-controllers-k8s.github.io/community/user-docs/ but should start with https://aws-controllers-k8s.github.io/community/docs/user-docs/, i.e. there's a /docs missing in the path.

Ah, thank you @TBBle for your eagle eyes! :) We will fix up the links ASAP.

afterhill commented 2 years ago

@max-lobur Hi, per checking the documentation mentioned (https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html), the required step yet includes to create the oidc provider. As the iam-controller is yet released status, does that mean we can don't have to set up the oidc provider webhook at all and can be totally replaced by the iam controller? It would be appreciated if any reference documentation can guide us there.

max-lobur commented 2 years ago

@max-lobur Hi, per checking the documentation mentioned (https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html), the required step yet includes to create the oidc provider. As the iam-controller is yet released status, does that mean we can don't have to set up the oidc provider webhook at all and can be totally replaced by the iam controller? It would be appreciated if any reference documentation can guide us there.

With EKS, the OIDC provider is created automatically https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html The manual shows how to fetch the existing provider URL

iam-controller proposed in this issue does not interfere with OIDC. The controller only creates the IAM role. Then the role is used via "IAM roles for service account" feature as usual

mikestef9 commented 1 year ago

Closing as IAM service controller has graduated to GA https://github.com/aws-controllers-k8s/iam-controller/releases/tag/v0.1.0. Separate issues can be opened to discuss specific follow on topics on the IAM controller.