kubernetes-sigs / cluster-api-provider-aws

Kubernetes Cluster API Provider AWS provides consistent deployment and day 2 operations of "self-managed" and EKS Kubernetes clusters on AWS.
http://cluster-api-aws.sigs.k8s.io/
Apache License 2.0
637 stars 562 forks source link

Specific role and instance profile for EKS worker nodes. #3557

Open faiq opened 2 years ago

faiq commented 2 years ago

/kind feature

Describe the solution you'd like [A clear and concise description of what you want to happen.]

As a EKS cluster operator, I'd like to have a specific set of roles for my AWS worker nodes use.

Currently the role AWSIAMRoleNodes is a shared role for workers that are for regular clusters and EKS clusters.

  AWSIAMRoleNodes:
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Action:
          - sts:AssumeRole
          Effect: Allow
          Principal:
            Service:
            - ec2.amazonaws.com
        Version: 2012-10-17
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
      RoleName: nodes.cluster-api-provider-aws.sigs.k8s.io
    Type: AWS::IAM::Role

My proposal is to create a new Instance Profile and Role specific for nodes that are part of an EKS cluster and remove the ManagedPolicyArns in the role pasted above

This would make the AWSIAMRoleNodes to be the following

  AWSIAMRoleNodes:
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Action:
          - sts:AssumeRole
          Effect: Allow
          Principal:
            Service:
            - ec2.amazonaws.com
        Version: 2012-10-17
      RoleName: nodes.cluster-api-provider-aws.sigs.k8s.io
    Type: AWS::IAM::Role

and a new role called eks-nodes.cluster-api-provider-aws.sigs.k8s.io which would like this

  AWSIAMRoleEKSNodes:
    Properties:
      AssumeRolePolicyDocument:
        Statement:
        - Action:
          - sts:AssumeRole
          Effect: Allow
          Principal:
            Service:
            - ec2.amazonaws.com
        Version: 2012-10-17
      ManagedPolicyArns:
      - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
      - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy
      RoleName: eks-nodes.cluster-api-provider-aws.sigs.k8s.io

Along with a new InstanceProfile to use that Role as follows

  AWSIAMInstanceProfileEKSNodes:
    Properties:
      InstanceProfileName: eks-nodes.cluster-api-provider-aws.sigs.k8s.io
      Roles:
      - Ref: AWSIAMRoleEKSNodes
    Type: AWS::IAM::InstanceProfile

I think this will warrant code changes in clusterawsadm as well as the tests to use this newly created role+instance profile https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/eb5da5870f9147624430de1b67e55843991ed7d0/test/e2e/data/eks/cluster-template-eks-machine-deployment-only.yaml#L32

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

faiq commented 2 years ago

In looking at this we would need to modify this to use the new role as well https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/93897be636031fac812765d95a018fe61dbd689f/pkg/cloud/services/iamauth/reconcile.go#L51

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

faiq commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

dlipovetsky commented 1 year ago

From office hours 01/01/23: Makes sense to scope down permissions of non-EKS clusters.

/triage accepted /priority important-longterm /help

k8s-ci-robot commented 1 year ago

@dlipovetsky: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/3557): >From office hours 01/01/23: Makes sense to scope down permissions of non-EKS clusters. > >/triage accepted >/priority important-longterm >/help Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
fatz commented 1 year ago

The hard coded instance profile of cluster-template-eks-machine-deployment-only.yaml introduces several issues:

https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/eb5da5870f9147624430de1b67e55843991ed7d0/test/e2e/data/eks/cluster-template-eks-machine-deployment-only.yaml#L32

Like with non-EKS clusters instance profile should not be hard coded but rely on user input so the user is in charge and control of the instances profiles being assigned to different clusters

dlipovetsky commented 1 year ago

In looking at this we would need to modify this to use the new role as well

https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/93897be636031fac812765d95a018fe61dbd689f/pkg/cloud/services/iamauth/reconcile.go#L51

I think there are 2 issues here:

  1. The resource identifier "prefix" (arn::aws::iam) is hard-coded, but the prefix is not global, and does not work in some regions. (There was recent work to address this in https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3926, but this work had to be reverted in https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3982)
  2. The "role name" in the resource identifier is derived from a default. User are allowed to define their own role name, but if they do, this code does not use that name. (For comparison, see this example, showing code that respects the role name defined by the user).
fatz commented 1 year ago

Thanks @dlipovetsky I was accidentally referencing on a test. Thank you for clarifying this.

k8s-triage-robot commented 9 months ago

This issue has not been updated in over 1 year, and should be re-triaged.

You can:

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted