Closed: kaykhancheckpoint closed this issue 3 years ago
@kaykhancheckpoint, could you verify that the controller pod has the following volume mount and volume configuration injected?
- mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
  name: aws-iam-token
  readOnly: true
volumes:
- name: aws-iam-token
  projected:
    defaultMode: 420
    sources:
    - serviceAccountToken:
        audience: sts.amazonaws.com
        expirationSeconds: 86400
        path: token
Hey, I managed to fix this. As you can see, I had set the role policy to ec2.amazonaws.com when it should have been an sts:AssumeRoleWithWebIdentity role.
I was able to fix it using the Terraform module iam-assumable-role-with-oidc.
locals {
  k8s_aws_lb_service_account_namespace = "kube-system"
  k8s_aws_lb_service_account_name      = "aws-load-balancer-controller"
}

resource "aws_iam_policy" "AWSLoadBalancerControllerIAMPolicy" {
  name        = "AWSLoadBalancerControllerIAMPolicy"
  path        = "/"
  description = "AWS Load Balancer Controller Policy"
  policy      = file("utils/aws-lb-controller/iam-policy.json")

  tags = {
    Terraform   = "true"
    Environment = local.workspace
  }
}

module "iam_assumable_role_aws_lb" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc"
  version = "3.6.0"

  create_role                   = true
  role_name                     = "AWSLoadBalancerControllerIAMRole"
  provider_url                  = replace(module.eks.cluster_oidc_issuer_url, "https://", "")
  role_policy_arns              = [aws_iam_policy.AWSLoadBalancerControllerIAMPolicy.arn]
  oidc_fully_qualified_subjects = ["system:serviceaccount:${local.k8s_aws_lb_service_account_namespace}:${local.k8s_aws_lb_service_account_name}"]

  tags = {
    Terraform   = "true"
    Environment = local.workspace
  }
}
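For readers not using the module, the trust relationship described above can be sketched with plain resources. This is a minimal sketch, not the module's actual source, and it is an alternative to the module rather than an addition to it. The names `alb_controller_trust` and `alb_controller`, and the reference `aws_iam_openid_connect_provider.eks`, are hypothetical placeholders for however the cluster's OIDC provider is defined. The locals and `module.eks.cluster_oidc_issuer_url` come from the snippet above.

```hcl
# Trust policy sketch: the principal is the cluster's OIDC provider (Federated),
# not a service principal such as ec2.amazonaws.com.
data "aws_iam_policy_document" "alb_controller_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type = "Federated"
      # Hypothetical reference: ARN of the IAM OIDC provider for the cluster.
      identifiers = [aws_iam_openid_connect_provider.eks.arn]
    }

    # Restrict the role to the controller's service account.
    condition {
      test     = "StringEquals"
      variable = "${replace(module.eks.cluster_oidc_issuer_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:${local.k8s_aws_lb_service_account_namespace}:${local.k8s_aws_lb_service_account_name}"]
    }
  }
}

# Role that uses the trust policy above; the permissions policy is attached separately.
resource "aws_iam_role" "alb_controller" {
  name               = "AWSLoadBalancerControllerIAMRole"
  assume_role_policy = data.aws_iam_policy_document.alb_controller_trust.json
}
```

The rendered JSON should have the same shape as a trust policy with a `Federated` principal and an `sts:AssumeRoleWithWebIdentity` action, which is what the controller's web identity token needs.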
I have the exact same issue and can't figure out what's causing it.
Pod Logs:
{"level":"error","ts":1627148976.691803,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:34703->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149158.8136048,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:52341->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149331.7705815,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:58778->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149528.279761,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:55073->172.20.0.10:53: read: connection refused"}
{"level":"error","ts":1627149707.0748882,"logger":"controller","msg":"Reconciler error","controller":"ingress","name":"hello","namespace":"default","error":"couldn't auto-discover subnets: WebIdentityErr: failed to retrieve credentials\ncaused by: RequestError: send request failed\ncaused by: Post \"https://sts.us-west-2.amazonaws.com/\": dial tcp: lookup sts.us-west-2.amazonaws.com on 172.20.0.10:53: read udp 10.0.3.184:48301->172.20.0.10:53: read: connection refused"}
Container args:
Args:
  --cluster-name=app-rylqFOXa
  --ingress-class=alb
  --aws-region=us-west-2
  --aws-vpc-id=vpc-0e200d3ae7e12447c
Role policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::203341958641:oidc-provider/oidc.eks.us-west-2.amazonaws.com/id/2917B2CCF25A5DC470EF1CF5DB059AE9"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-west-2.amazonaws.com/id/2917B2CCF25A5DC470EF1CF5DB059AE9:sub": "system:serviceaccount:kube-system:aws-load-balancer-controller"
        }
      }
    }
  ]
}
The public subnets are tagged with:
  kubernetes.io/role/elb               1
  kubernetes.io/cluster/app-rylqFOXa   shared
The private subnets are tagged basically the same, but with internal-elb. I'm trying out Fargate as a POC for work. What might I be missing here?
Turns out my issue was related to CoreDNS, as described here: https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/1360
A little over a year later, I'm running into this issue. Why would DNS cause this? DNS seems to resolve just fine for me, but the assume role policy isn't correct.
And what exactly is the fix here? Is it the fix for #1360, or is it the fix posted by @kaykhancheckpoint?
I'm getting it without using any Fargate profiles. I've tried both the DNS config and the Terraform bit, and neither seems to work.
Disregard my other comment. Figured it out.
For anyone else whose Googling lands them here, this is a ready-made drop-in for Terraform which correctly sets up the permissions using a freely available module.
If you find yourself here after many hours of frustration, as I did, note the following:
In my case the first case was the problem.
If you want easy mode and you're using Terraform, this should drop right in:
locals {
  kube_system_namespace    = "kube-system"
  alb_service_account_name = "alb-controller"
  efs_service_account_name = "efs-controller"
  system_service_accounts = [
    "${local.kube_system_namespace}:${local.alb_service_account_name}"
  ]
}

resource "kubernetes_service_account" "alb" {
  metadata {
    name      = local.alb_service_account_name
    namespace = local.kube_system_namespace
    labels = {
      "app.kubernetes.io/name"      = "aws-load-balancer-controller"
      "app.kubernetes.io/component" = "controller"
    }
    annotations = {
      "eks.amazonaws.com/role-arn" = module.vpc_cni_irsa.iam_role_arn
    }
  }
}

module "vpc_cni_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "~> 4.12"

  role_name_prefix                       = "vpc-cni-irsa-"
  attach_load_balancer_controller_policy = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = local.system_service_accounts
    }
  }
}
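If the controller itself is installed with the Helm chart, the chart needs to reuse the service account created above rather than create its own, otherwise the role annotation is not picked up. The following is a minimal sketch, assuming the Terraform Helm provider with 2.x-style `set` blocks (other provider versions may use a different syntax). The resource name `alb_controller` and the variable `cluster_name` are hypothetical.

```hcl
variable "cluster_name" {
  type        = string
  description = "Hypothetical input: name of the EKS cluster."
}

resource "helm_release" "alb_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = local.kube_system_namespace

  set {
    name  = "clusterName"
    value = var.cluster_name
  }

  # Keep the chart from creating a second, unannotated service account.
  set {
    name  = "serviceAccount.create"
    value = "false"
  }

  # Point the chart at the service account defined in Terraform above.
  set {
    name  = "serviceAccount.name"
    value = kubernetes_service_account.alb.metadata[0].name
  }
}
```

Referencing `kubernetes_service_account.alb` also makes Terraform create the service account before the release.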
I'm constantly getting the following output when checking the logs of aws-load-balancer-controller:
{"level":"error","ts":1657324768.868449,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-default-backends-6d61d3952a","namespace":"default","error":"WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 5c3f9ce5-ba7d-495e-98b1-ffcd5cf85133"}
{"level":"error","ts":1657324768.8749876,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-default-backends-83e7be3ef9","namespace":"default","error":"WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: bddf9d23-4126-4b61-aaf9-c1ba1ecc8ed4"}
{"level":"error","ts":1657324768.8810706,"logger":"controller-runtime.manager.controller.targetGroupBinding","msg":"Reconciler error","reconciler group":"elbv2.k8s.aws","reconciler kind":"TargetGroupBinding","name":"k8s-default-flowerse-82587b6137","namespace":"default","error":"WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: d09f8725-db5b-4d47-9443-b627d4f8a8c8"}
{"level":"error","ts":1657324781.8112953,"logger":"controller-runtime.manager.controller.ingress","msg":"Reconciler error","name":"backend-ingress","namespace":"default","error":"WebIdentityErr: failed to retrieve credentials\ncaused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n\tstatus code: 403, request id: 54e2e5b2-103f-4a25-b544-7cccd739a560"}
kubectl describe ingress ingress_name shows this:
Events:
  Type     Reason            Age    From     Message
  ----     ------            ----   ----     -------
  Warning  FailedBuildModel  2m46s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: 8d30a0d7-1c0c-4890-b78d-eca678982f86
  Warning  FailedBuildModel  2m46s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: aa4873bb-2c96-4491-b506-5a6011bd2a35
  Warning  FailedBuildModel  2m46s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: 811042d9-f1e6-4131-b722-6fb62dc2439c
  Warning  FailedBuildModel  2m45s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: 0540cd6a-910b-4cef-8d31-424d0f2de3e1
  Warning  FailedBuildModel  2m45s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: d6e4e2bb-56dc-4075-b597-c75f6d97547a
  Warning  FailedBuildModel  2m45s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: 8a9897c9-116f-4ccc-ba1f-123fa2c4e76c
  Warning  FailedBuildModel  2m45s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
           caused by: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
           status code: 403, request id: b8d22508-6a88-4c03-bcf3-4200a7b34c50
  Warning  FailedBuildModel  2m45s  ingress  Failed build model due to WebIdentityErr: failed to retrieve credentials
My role policy is:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::*****:oidc-provider/oidc.eks.ap-southeast-1.amazonaws.com/id/*********"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.ap-southeast-1.amazonaws.com/id/*****": "sts.amazonaws.com",
          "oidc.eks.ap-southeast-1.amazonaws.com/id/*****": "system:serviceaccount:kube-system:aws-load-balancer-controller"
        }
      }
    }
  ]
}
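One thing worth checking in a trust policy like the one above: as pasted, both conditions use the same key with no suffix, whereas the trust policies in the AWS documentation for IAM roles for service accounts put `:aud` or `:sub` after the OIDC provider path. The suffixes may simply have been lost when the policy was redacted for posting, so treat this as a possibility rather than a diagnosis. Below is a sketch of the condition block in Terraform, where `oidc_issuer_host` is a hypothetical variable holding the issuer URL without `https://` and `alb_trust_condition` is a hypothetical local name.

```hcl
variable "oidc_issuer_host" {
  type        = string
  description = "Hypothetical input: OIDC issuer URL of the cluster without the https:// prefix."
}

locals {
  # Condition block for the role's trust policy; each key names one token claim.
  alb_trust_condition = {
    StringEquals = {
      # Audience claim of the projected service account token.
      "${var.oidc_issuer_host}:aud" = "sts.amazonaws.com"
      # Subject claim: namespace and name of the controller's service account.
      "${var.oidc_issuer_host}:sub" = "system:serviceaccount:kube-system:aws-load-balancer-controller"
    }
  }
}
```

The `:sub` value has to match the namespace and name of the service account the controller pod actually runs as.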
I am using the Helm chart to install the AWS Load Balancer Controller:
https://github.com/aws/eks-charts/tree/master/stable/aws-load-balancer-controller
However, when I apply the ingress controller I get the following error:
It looks like it is missing a permission, but the role I have created has the correct policy attached: https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.1.2/docs/install/iam_policy.json
Can you check below whether I am creating the correct role? I was unsure about this bit.
values.yml
role creation