kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0

403 forbidden when adding TargetGroupBinding #3168

Closed ogreyard closed 1 year ago

ogreyard commented 1 year ago

Describe the bug

When adding a TargetGroupBinding to an existing service on an EKS cluster in AWS, the ALB controller's mutating webhook is called but apparently cannot retrieve the target group information, and a 403 error is returned.

Log output of the ALB controller pod after applying the TargetGroupBinding:

{"level":"debug","ts":1682018177.899255,"logger":"mutating_handler","msg":"mutating webhook response","response":{"Patches":null,"uid":"","allowed":false,"status":{"metadata":{},"reason":"unable to get target group IP address type: RequestCanceled: request context canceled\ncaused by: context canceled","code":403}}}
{"level":"debug","ts":1682018177.8995352,"logger":"controller-runtime.webhook.webhooks","msg":"wrote response","webhook":"/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding","code":403,"reason":"unable to get target group IP address type: RequestCanceled: request context canceled\ncaused by: context canceled","UID":"CLRD-CLRD-CLRD-CLRD-CLRD","allowed":false}

Steps to reproduce

  1. created a cluster (mixed public & private) in a predefined vpc, with predefined subnets and security groups.
  2. added loadbalancer via aws cloudformation
    ALBLoadBalancer:
      Type: AWS::ElasticLoadBalancingV2::LoadBalancer
      Properties:
        Name: alb-load-balancer
        Scheme: internet-facing
        Type: ip
        IpAddressType: ipv4
        SecurityGroups:
          - !Ref ALBLoadBalancerSecurityGroup # replace with your own security group ID
        Subnets:
          - !Ref PublicSubnet01  # replace with your own subnet ID
          - !Ref PublicSubnet02  # replace with your own subnet ID
        Tags:
          - Key: Name
            Value: ALBLoadBalancer
  3. run application Service on public node
  4. add targetgroupbinding
    apiVersion: elbv2.k8s.aws/v1beta1
    kind: TargetGroupBinding
    metadata:
      name: tgb
    spec:
      serviceRef:
        name: frontend
        port: 80
      targetType: ip
      targetGroupARN: ${TG_ARN}

log:

ALB_ARN=arn:aws:elasticloadbalancing:us-east-2:ABC:loadbalancer/app/alb-load-balancer/1234
TargetGroup:  arn:aws:elasticloadbalancing:us-east-2:ABC:targetgroup/ALBTargetGroup/123
Installing Custom Resource Definitions
customresourcedefinition.apiextensions.k8s.io/ingressclassparams.elbv2.k8s.aws configured
customresourcedefinition.apiextensions.k8s.io/targetgroupbindings.elbv2.k8s.aws configured
Applying TargetGroupBinding to cluster...
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "mtargetgroupbinding.elbv2.k8s.aws": failed to call webhook: Post "https://aws-load-balancer-webhook-service.kube-system.svc:443/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding?timeout=10s": context deadline exceeded
{"level":"debug","ts":1682018177.899255,"logger":"mutating_handler","msg":"mutating webhook response","response":{"Patches":null,"uid":"","allowed":false,"status":{"metadata":{},"reason":"unable to get target group IP address type: RequestCanceled: request context canceled\ncaused by: context canceled","code":403}}}
{"level":"debug","ts":1682018177.8995352,"logger":"controller-runtime.webhook.webhooks","msg":"wrote response","webhook":"/mutate-elbv2-k8s-aws-v1beta1-targetgroupbinding","code":403,"reason":"unable to get target group IP address type: RequestCanceled: request context canceled\ncaused by: context canceled","UID":"CLRD-CLRD-CLRD-CLRD-CLRD","allowed":false}

Expected outcome

The ALB controller adds the IP target to the target group without error. FYI: manually adding the target via the AWS console works without any problem.

Environment

Additional Context:

% eksctl get iamidentitymapping --cluster xx
ARN                                             USERNAME                GROUPS                  ACCOUNT
arn:aws:iam::1234:role/eksctl-xx-nodegroup-private-NodeInstanceRole-ABCDE   system:node:{{EC2PrivateDNSName}}   system:bootstrappers,system:nodes   
arn:aws:iam::12345:role/eksctl-xx-nodegroup-public-NodeInstanceRole-ABCD    system:node:{{EC2PrivateDNSName}}   system:bootstrappers,system:nodes

M00nF1sh commented 1 year ago

@ogreyard Did you attach all the required IAM permissions to the controller? If so, would you mind checking CloudTrail for the DescribeTargetGroups event for the target group and seeing what the API response was?

Our default IAM policy allows describing all target groups.

ogreyard commented 1 year ago

Thanks for the reply.

The IAM roles were all correctly configured, I followed the tutorial(s) multiple times.

After spending hours on this, I figured out that the controller needs an STS endpoint that is reachable from within the private cluster in order to assume the right role, and that it additionally needs the ELB endpoint to reach the load balancing API. That's understandable and makes sense; however, the error message was a bit misleading and not helpful.
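
A minimal sketch of what the two interface VPC endpoints could look like in CloudFormation, in the same style as the load balancer resource in the reproduction steps. The resource names and the `VPC`, `PrivateSubnet01`, `PrivateSubnet02`, and `EndpointSecurityGroup` references are hypothetical placeholders, not taken from the original stack, and the security group is assumed to allow inbound HTTPS (443) from the nodes.

```yaml
# Sketch only: resource names and !Ref targets are hypothetical placeholders.
STSVpcEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    VpcEndpointType: Interface
    ServiceName: !Sub com.amazonaws.${AWS::Region}.sts
    VpcId: !Ref VPC
    PrivateDnsEnabled: true          # regional STS hostname resolves to the endpoint
    SubnetIds:
      - !Ref PrivateSubnet01
      - !Ref PrivateSubnet02
    SecurityGroupIds:
      - !Ref EndpointSecurityGroup   # assumed to allow inbound 443 from the nodes

ELBVpcEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    VpcEndpointType: Interface
    ServiceName: !Sub com.amazonaws.${AWS::Region}.elasticloadbalancing
    VpcId: !Ref VPC
    PrivateDnsEnabled: true          # regional ELB API hostname resolves to the endpoint
    SubnetIds:
      - !Ref PrivateSubnet01
      - !Ref PrivateSubnet02
    SecurityGroupIds:
      - !Ref EndpointSecurityGroup
```

With private DNS enabled, the regional hostnames (for example `sts.us-east-2.amazonaws.com` and `elasticloadbalancing.us-east-2.amazonaws.com`) should resolve to private addresses inside the VPC. This assumes the controller is using the regional STS endpoint rather than the global one, which an interface endpoint does not cover.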

Closing, thanks.