terraform-aws-modules / terraform-aws-eks

Terraform module to create Amazon Elastic Kubernetes (EKS) resources 🇺🇦
https://registry.terraform.io/modules/terraform-aws-modules/eks/aws
Apache License 2.0

When using the irsa getting an issue with OpenIDConnect #965

Closed LouisDinatale closed 4 years ago

LouisDinatale commented 4 years ago

I have issues

I'm submitting a...

What is the current behavior?

Using IRSA with the module gives me an error saying that no OpenIDConnect provider was found in my account. If I add an OpenIDConnect provider by hand, it works. The only things I changed were adding my account number and changing the name of the role.

If this is a bug, how to reproduce? Please include a code sample if relevant.

What's the expected behavior?

Are you able to fix this problem and submit a PR? Link here if you have already.

Environment details

Any other relevant info

dpiddockcmp commented 4 years ago

What is the error?

What is the code you're trying to run?

hpio commented 4 years ago

I think I have been having the very same issue. I deployed EKS with autoscaling enabled and created IRSA, but after deploying the autoscaler on the cluster I get the following:

E0813 16:43:55.729097       1 aws_manager.go:261] Failed to regenerate ASG cache: cannot autodiscover ASGs: WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/<REDACTED>
        status code: 400, request id: d3971115-8cf9-4478-866c-034a2b27afe4
F0813 16:43:55.729117       1 aws_cloud_provider.go:376] Failed to create AWS Manager: cannot autodiscover ASGs: WebIdentityErr: failed to retrieve credentials
caused by: InvalidIdentityToken: No OpenIDConnect provider found in your account for https://oidc.eks.eu-west-1.amazonaws.com/id/<REDACTED>
        status code: 400, request id: d3971115-8cf9-4478-866c-034a2b27afe4

The code I used:

locals {
  cluster_name                  = "${var.environment}-eks-cluster-${random_string.suffix.result}"
  k8s_service_account_namespace = "kube-system"                               # for IRSA cluster autoscaler
  k8s_service_account_name      = "cluster-autoscaler-aws-cluster-autoscaler" # for IRSA cluster autoscaler
}

# IAM roles for service accounts - to enable cluster autoscaling
module "iam_assumable_role_admin" {
  source                        = "../../modules/iam-assumable-role-with-oidc"
  create_role                   = true
  role_name                     = "cluster-autoscaler"
  provider_url                  = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
  role_policy_arns              = [aws_iam_policy.cluster_autoscaler.arn]
  oidc_fully_qualified_subjects = ["system:serviceaccount:${local.k8s_service_account_namespace}:${local.k8s_service_account_name}"]
}

resource "aws_iam_policy" "cluster_autoscaler" {
  name_prefix = "cluster-autoscaler"
  description = "EKS cluster-autoscaler policy for cluster ${module.eks.cluster_id}"
  policy      = data.aws_iam_policy_document.cluster_autoscaler.json
}

data "aws_iam_policy_document" "cluster_autoscaler" {
  statement {
    sid    = "clusterAutoscalerAll"
    effect = "Allow"

    actions = [
      "autoscaling:DescribeAutoScalingGroups",
      "autoscaling:DescribeAutoScalingInstances",
      "autoscaling:DescribeLaunchConfigurations",
      "autoscaling:DescribeTags",
      "ec2:DescribeLaunchTemplateVersions",
    ]

    resources = ["*"]
  }

  statement {
    sid    = "clusterAutoscalerOwn"
    effect = "Allow"

    actions = [
      "autoscaling:SetDesiredCapacity",
      "autoscaling:TerminateInstanceInAutoScalingGroup",
      "autoscaling:UpdateAutoScalingGroup",
    ]

    resources = ["*"]

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/kubernetes.io/cluster/${module.eks.cluster_id}"
      values   = ["owned"]
    }

    condition {
      test     = "StringEquals"
      variable = "autoscaling:ResourceTag/k8s.io/cluster-autoscaler/enabled"
      values   = ["true"]
    }
  }
}

module "eks" {
  source       = "../../modules/terraform-aws-eks"
  cluster_name = local.cluster_name
  subnets      = module.vpc-eks.private_subnets
  vpc_id       = module.vpc-eks.vpc_id

  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access_cidrs = ["REDACTED"]

  worker_groups = [
    {
      name                = "on-demand-1"
      instance_type       = "m5.large"
      asg_max_size        = 10
      kubelet_extra_args  = "--node-labels=spot=false"
      suspended_processes = ["AZRebalance"]
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${local.cluster_name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        }
      ]
    }
  ]
  worker_groups_launch_template = [
    {
      name                    = "spot-1"
      override_instance_types = ["m5.large", "m5a.large", "m5d.large", "m5ad.large"]
      asg_desired_capacity    = 2
      asg_max_size            = 10
      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot"
      tags = [
        {
          "key"                 = "k8s.io/cluster-autoscaler/enabled"
          "propagate_at_launch" = "false"
          "value"               = "true"
        },
        {
          "key"                 = "k8s.io/cluster-autoscaler/${local.cluster_name}"
          "propagate_at_launch" = "false"
          "value"               = "true"
        }
      ]
    },
  ]
}

resource "helm_release" "cluster_autoscaler" {
  name       = "cluster-autoscaler"
  repository = "https://kubernetes-charts.storage.googleapis.com"
  chart      = "cluster-autoscaler"
  version    = "7.3.4"
  namespace  = "kube-system"

  set {
    name  = "repository"
    value = "us.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler"
  }

  set {
    name  = "imageTag"
    value = "v1.16.5"
  }

  set {
    name  = "cloudProvider"
    value = "aws"
  }

  set {
    name  = "replicaCount"
    value = "3"
  }

  set {
    name  = "awsRegion"
    value = "eu-west-1"
  }

  set {
    name  = "rbac.create"
    value = "true"
  }

  set {
    name  = "rbac.serviceAccountAnnotations.eks\\.amazonaws\\.com/role-arn"
    value = "arn:aws:iam::${var.aws_account_id}:role/cluster-autoscaler"
    type  = "string"
  }

  set {
    name  = "autoDiscovery.enabled"
    value = "true"
  }

  set {
    name  = "autoDiscovery.clusterName"
    value = "<REDACTED>"
  }
}

Service account on the cluster exists:

kubectl get serviceaccount --all-namespaces | grep cluster-autoscaler-aws-cluster-autoscaler
kube-system       cluster-autoscaler-aws-cluster-autoscaler   1         40m
kubectl describe sa cluster-autoscaler-aws-cluster-autoscaler -n kube-system
Name:                cluster-autoscaler-aws-cluster-autoscaler
Namespace:           kube-system
Labels:              app.kubernetes.io/instance=cluster-autoscaler
                     app.kubernetes.io/managed-by=Helm
                     app.kubernetes.io/name=aws-cluster-autoscaler
                     helm.sh/chart=cluster-autoscaler-7.3.4
Annotations:         eks.amazonaws.com/role-arn: arn:aws:iam::<REDACTED>:role/cluster-autoscaler
                     meta.helm.sh/release-name: cluster-autoscaler
                     meta.helm.sh/release-namespace: kube-system
Image pull secrets:  <none>
Mountable secrets:   cluster-autoscaler-aws-cluster-autoscaler-token-k2chc
Tokens:              cluster-autoscaler-aws-cluster-autoscaler-token-k2chc
Events:              <none>

Can you advise please?

hpio commented 4 years ago

OK, found the issue. For anyone having the same problem: you need to configure an aws_iam_openid_connect_provider resource.

Lynaj commented 4 years ago

I ran into the same problem, and adding aws_iam_openid_connect_provider solved the issue.

Basically here is the solution: https://registry.terraform.io/providers/hashicorp/tls/latest/docs/data-sources/tls_certificate
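
A minimal sketch of what that can look like when you manage the provider yourself. It assumes an EKS module instance named `module.eks`, as in the code posted above. The labels `eks_oidc` and `eks` are illustrative, and taking `certificates[0]` follows the example in the linked tls_certificate docs, so verify the thumbprint against your own cluster:

```hcl
# Fetch the certificate chain of the cluster's OIDC issuer so that its
# fingerprint can be registered with IAM.
data "tls_certificate" "eks_oidc" {
  url = module.eks.cluster_oidc_issuer_url
}

# Register the cluster's OIDC issuer as an IAM identity provider.
# STS looks this up when a pod calls AssumeRoleWithWebIdentity.
resource "aws_iam_openid_connect_provider" "eks" {
  url             = module.eks.cluster_oidc_issuer_url
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks_oidc.certificates[0].sha1_fingerprint]
}
```

This is only needed if the module is not already creating the provider for you.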

dpiddockcmp commented 4 years ago

The module will create the irsa resource if you set enable_irsa = true

barryib commented 4 years ago

Closing this. Enable IRSA or configure it correctly on your own.

hbceylan commented 3 years ago

If you are using the EKS Terraform module, you just need to enable IRSA by adding the following:

enable_irsa = true
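
For context, a sketch of where that flag goes, reusing the arguments from the `module "eks"` block posted earlier in this thread. The registry source is the one from the module's registry page. The `oidc_provider_arn` output is what I understand the module to expose when IRSA is enabled, so treat it as an assumption and check it against the module version you are using:

```hcl
module "eks" {
  source       = "terraform-aws-modules/eks/aws"
  cluster_name = local.cluster_name
  subnets      = module.vpc-eks.private_subnets
  vpc_id       = module.vpc-eks.vpc_id

  # Have the module create the aws_iam_openid_connect_provider for the cluster.
  enable_irsa = true
}

# Expose the provider ARN so it is easy to confirm that the provider exists.
output "oidc_provider_arn" {
  value = module.eks.oidc_provider_arn
}
```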

happy-machine commented 3 years ago

not working for me

rmvangun commented 3 years ago

I also just encountered this issue. I have a service account that's trying to use KMS. I actually had this working at one point. My error is:

Group 0: FAILED
 arn:aws:kms:us-east-2:REDACTED:alias/sandbox-cluster-secrets: FAILED
   - | Error decrypting key: WebIdentityErr: failed to retrieve
     | credentials
     | caused by: InvalidIdentityToken: No OpenIDConnect provider
     | found in your account for
     | https://oidc.eks.us-east-2.amazonaws.com/id/REDACTED
     |     status code: 400, request id:
     | 39ae4989-caea-4496-9898-32302d94148c

I have,

enable_irsa = true

And also added this:

resource "aws_eks_identity_provider_config" "sts" {
  cluster_name = module.eks.cluster_id

  oidc {
    client_id                     = "sts.amazonaws.com"
    identity_provider_config_name = "oidc-sts"
    issuer_url                    = module.eks.cluster_oidc_issuer_url
  }
}

Which configures an STS OIDC provider that I can see on the EKS cluster dashboard.

So I know I had a correct configuration at one point but am now having difficulty troubleshooting. I have verified that the service account has the correct annotation with the role I want it to assume.

Interestingly, this started after I deleted and recreated a node group.

bryrod commented 3 years ago

I'm seeing the same issue after a node group rebuild. Wondering if I should just blast away the cluster and rebuild from scratch.

Adding enable_irsa = true built out the aws_iam_openid_connect_provider resource, and I confirmed it through the state file. However, there's nothing listed under the cluster > authentication in management console.

daroga0002 commented 3 years ago

Adding enable_irsa = true built out the aws_iam_openid_connect_provider resource, and I confirmed it through the state file. However, there's nothing listed under the cluster > authentication in management console.

IRSA is not visible under Authentication, as it is a totally different mechanism. The Authentication tab in EKS lets you connect an external OIDC provider to authenticate users to Kubernetes, whereas IRSA only authenticates service accounts to AWS IAM.

It will be visible in the IAM console under Identity providers.

IRSA docs: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

Cluster authentication tab in EKS docs: https://docs.aws.amazon.com/eks/latest/userguide/authenticate-oidc-identity-provider.html
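
To illustrate the IAM side, here is a rough sketch of the trust policy an IRSA role carries. It assumes `module.eks` with `enable_irsa = true` and an `oidc_provider_arn` output on the module. The namespace and service account name are the ones from the autoscaler example earlier in this thread, and the role name `irsa-example` is made up:

```hcl
data "aws_iam_policy_document" "irsa_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    # The federated principal is the IAM OIDC provider, not anything
    # shown on the EKS Authentication tab.
    principals {
      type        = "Federated"
      identifiers = [module.eks.oidc_provider_arn]
    }

    # Restrict the role to a single Kubernetes service account.
    condition {
      test     = "StringEquals"
      variable = "${replace(module.eks.cluster_oidc_issuer_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:kube-system:cluster-autoscaler-aws-cluster-autoscaler"]
    }
  }
}

resource "aws_iam_role" "irsa_example" {
  name               = "irsa-example" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.irsa_trust.json
}
```

If the provider referenced in `identifiers` does not exist in IAM, STS rejects the token with the "No OpenIDConnect provider found in your account" error reported above.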

github-actions[bot] commented 2 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.