kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0

│ Error: Failed to create Ingress 'prod/htd-ingress' because: Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": the server could not find the requested resource #2039

Closed dilip-joveo closed 3 years ago

dilip-joveo commented 3 years ago

I deployed the aws-load-balancer-controller using the Helm chart in an EKS cluster. While creating the Ingress for the service, I get the following error:

│ Error: Failed to create Ingress 'prod/htd-ingress' because: Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": the server could not find the requested resource

mikebevz commented 3 years ago

I've got the same problem, EKS 1.20.

Error: failed to create resource: Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": the server could not find the requested resource

blackdog0403 commented 3 years ago

I've got the same problem as well: EKS 1.19, AWS Load Balancer Controller v2.1.3.

Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": the server could not find the requested resource

bmwant commented 3 years ago

I solved this issue by recreating the service account on the cluster. It might be unrelated, but you can give it a try:

$ eksctl utils associate-iam-oidc-provider \
    --region "us-east-1" \
    --cluster "prod-cluster-name" \
    --approve
$ curl -o iam-policy.json https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.2.0/docs/install/iam_policy.json
$ aws iam create-policy \
    --policy-name AWSLoadBalancerControllerIAMPolicy \
    --policy-document file://iam-policy.json

$ eksctl delete iamserviceaccount --cluster "prod-cluster-name" --name=aws-load-balancer-controller
$ eksctl create iamserviceaccount \
    --cluster="prod-cluster-name" \
    --namespace=default \
    --name=aws-load-balancer-controller \
    --attach-policy-arn="arn:aws:iam::457398059321:policy/AWSLoadBalancerControllerIAMPolicy" \
    --override-existing-serviceaccounts \
    --approve

Also change the namespace to kube-system if needed.

blackdog0403 commented 3 years ago

@bmwant's steps above (recreating the IAM service account) solved the problem! Just overriding the existing service account does not work; it started working only after I deleted the IAM service account and recreated it.

giannisbetas commented 3 years ago

Same issue on EKS 1.18 when updating from Helm chart version 1.0.7 to 1.2.0.

I updated the CRDs manually with kubectl apply -f https://raw.githubusercontent.com/aws/eks-charts/v0.0.51/stable/aws-load-balancer-controller/crds/crds.yaml, but that doesn't solve the issue. Deleting and recreating the service account doesn't help either.

Has anyone figured out the root cause of this issue?

kishorj commented 3 years ago

@giannisbetas, do you see any errors in the controller logs? For v2.2.0, you'd need to update the CRDs and apply additional IAM permissions.

giannisbetas commented 3 years ago

@kishorj I updated the IAM policy and the CRDs. I could not see any errors in the logs of the controller. I will attempt to upgrade to v2.2.0 in EKS 1.19 and see whether I see the same behaviour.

dilip-joveo commented 3 years ago

I was able to solve the issue by redeploying the service account. Thanks for the help.

kaykhancheckpoint commented 3 years ago

@dilip-joveo I'm facing the same problem on EKS 1.19. How did you redeploy the service account?

In my case I am using Helm, and the chart creates the service account:

serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<redacted>:role/AWSLoadBalancerControllerIAMRole
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name: "aws-load-balancer-controller"
  # Automount API credentials for a Service Account.
  automountServiceAccountToken: true

shimont commented 2 years ago

Following this official AWS guide helped me https://aws.amazon.com/premiumsupport/knowledge-center/eks-alb-ingress-controller-fargate/

GrigorievNick commented 2 years ago

I have the same issue with EKS 1.21. Strangely, it appeared when I migrated the cluster from one account to another; in the previous account the same recipe worked perfectly. I use Helm + Terraform to install the ALB controller.

resource "helm_release" "alb_controller" {
  depends_on       = [module.eks, aws_ec2_tag.vpc_tag, aws_ec2_tag.private_subnet_cluster_tag]
  name             = local.eks_cluster_name // check if I need any
  repository       = "https://aws.github.io/eks-charts"
  chart            = "aws-load-balancer-controller"
  version          = "v1.3.3" // alb controller 2.3.1
  namespace        = kubernetes_namespace.alb_controller.id
  create_namespace = false
  atomic           = true
  wait             = true
  wait_for_jobs    = true
  timeout          = 900

  set {
    name  = "clusterName"
    value = local.eks_cluster_name
  }
  set {
    name  = "serviceAccount.create"
    value = false
  }
  set {
    name  = "serviceAccount.name"
    value = kubernetes_service_account.alb_controller.metadata[0].name
  }
  set {
    name  = "logLevel"
    value = "info"
  }
#  set {
#    name  = "keepTLSSecret"
#    value = true
#  }
#  set {
#    name  = "enableCertManager"
#    value = true
#  }

  values = [
    <<EOT
defaultTags:
%{ for key, val in local.alb_controller.tags }
  ${key}: ${val}
%{ endfor }
EOT
  ]
}

The error:

│ Error: Failed to create Ingress 'dremio/dremio-client-in-private-alb' because: Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": Post "https://aws-load-balancer-webhook-service.alb-controller.svc:443/validate-networking-v1beta1-ingress?timeout=10s": context deadline exceeded

Parziphal commented 2 years ago

Following this official AWS guide helped me https://aws.amazon.com/premiumsupport/knowledge-center/eks-alb-ingress-controller-fargate/

I followed about five guides, all from Amazon, but none worked; they either lacked information or were outdated. This one finally worked!

mmtechslv commented 2 years ago

It can also be caused by an outdated iam_policy.json file. If you are using local or downloaded policies, make sure they are up to date with the version of aws-load-balancer-controller you are running.

eocern commented 2 years ago

In case it might help others: I also had the original issue, using a Fargate profile and a worker node for CoreDNS. The solution, which I found elsewhere, was simply adding this to my Terraform EKS module configuration:

node_security_group_additional_rules = {
  ingress_allow_access_from_control_plane = {
    type                          = "ingress"
    protocol                      = "tcp"
    from_port                     = 9443
    to_port                       = 9443
    source_cluster_security_group = true
    description                   = "Allow access from control plane to webhook port of AWS load balancer controller"
  }
}

dai-mk commented 2 years ago

Thanks @eocern, the node_security_group_additional_rules snippet above saved my day. :)

A hint for people from the future struggling to figure out why they get "failed calling webhook" errors from other projects: this should solve those too.

ShonL commented 2 years ago

For me, I had to open TCP port 9443 from the control plane to the nodes.
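For those who manage their security groups directly rather than through an EKS Terraform module, here is a minimal standalone Terraform sketch of the same rule. The resource references `aws_security_group.nodes` and `aws_eks_cluster.this` are assumptions; substitute your own node security group and cluster resources.

```hcl
# Hypothetical standalone equivalent of the rule above: allow the EKS control
# plane (via the cluster security group) to reach the controller's webhook
# port 9443 on the worker nodes.
resource "aws_security_group_rule" "webhook_from_control_plane" {
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = 9443
  to_port                  = 9443
  security_group_id        = aws_security_group.nodes.id # node SG (assumption)
  source_security_group_id = aws_eks_cluster.this.vpc_config[0].cluster_security_group_id
  description              = "Allow access from control plane to webhook port of AWS load balancer controller"
}
```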

simplicbe commented 2 years ago

For those using the AWS Management Console (web), go to the security group attached to the EC2 instances (nodes) and allow inbound TCP on port 9443. That is the same as mentioned in the Terraform code above.

Thanks for the help!

goldnetonline commented 1 year ago

Thanks @eocern, the security group rule above is the solution for me too.

rafaellabegalini-vwi commented 1 year ago

Thanks @eocern, that solution works for me too. =)

zhukovsd commented 1 year ago

The same problem can occur when using recent AWS Load Balancer Controller Docker images (2.4.x) with an outdated Helm chart.

My setup:

In this setup, I get the same problem discussed in this thread: Error: Failed to create Ingress 'prod/htd-ingress' because: Internal error occurred: failed calling webhook

Solution: update CDK to a more recent version, which deploys the AWS Load Balancer Controller using a more recent Helm chart.

AfhaamANJ commented 9 months ago

For me, I had to open TCP 9443 port from the control plane to the nodes

Where do I need to add this?

simplicbe commented 9 months ago

Take a look here: https://github.com/kubernetes-sigs/aws-load-balancer-controller/issues/2039#issuecomment-1189524910

AfhaamANJ commented 9 months ago

Regarding the screenshot in the comment above: it would be better if it were in English.