kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0
3.88k stars 1.44k forks

setting `hostNetwork` now required with aws vpc cni when using karpenter v1.0 #3818

Open applike-ss opened 3 weeks ago

applike-ss commented 3 weeks ago

Describe the bug

With Karpenter v1.0, the default PUT response hop limit on the EC2 node classes changed from 2 to 1, which makes pods unable to use the IMDS.
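To check whether a pod is affected, one option is to request an IMDSv2 session token from inside the pod. The following is a minimal sketch, not part of the controller: the function name `fetchIMDSToken` and the 2-second timeout are assumptions chosen for illustration. With a hop limit of 1 and no host networking, the token response is expected not to reach the pod, so the call should time out.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// imdsBaseURL is the link-local address of the EC2 instance metadata service.
const imdsBaseURL = "http://169.254.169.254"

// defaultTimeout is an assumed value. It is kept short because a blocked
// token response usually shows up as a timeout rather than a fast failure.
const defaultTimeout = 2 * time.Second

// fetchIMDSToken (hypothetical helper) requests an IMDSv2 session token
// with PUT /latest/api/token and returns it, or an error if IMDS is unreachable.
func fetchIMDSToken(baseURL string, timeout time.Duration) (string, error) {
	req, err := http.NewRequest(http.MethodPut, baseURL+"/latest/api/token", nil)
	if err != nil {
		return "", err
	}
	// IMDSv2 requires a TTL header on the token request.
	req.Header.Set("X-aws-ec2-metadata-token-ttl-seconds", "60")

	client := &http.Client{Timeout: timeout}
	resp, err := client.Do(req)
	if err != nil {
		return "", fmt.Errorf("IMDS not reachable: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected IMDS status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(body)), nil
}

func main() {
	if _, err := fetchIMDSToken(imdsBaseURL, defaultTimeout); err != nil {
		fmt.Println("IMDSv2 token request failed:", err)
		return
	}
	fmt.Println("IMDSv2 token request succeeded")
}
```

Running this once in a regular pod and once in a pod with host networking should show the difference, assuming nothing else (for example a network policy) blocks access to the metadata address.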

My assumption is that the comment about hostNetwork which states that it is not needed for aws vpc cni is not correct anymore.

I believe it might need adjustment to:

# Specifies if aws-load-balancer-controller should be started in hostNetwork mode.
# This is required if using a custom CNI where the managed control plane nodes are unable to initiate
# network connections to the pods, for example using Calico CNI plugin on EKS. This is not required or
# recommended if using the Amazon VPC CNI plugin, unless karpenter >= v1.0 is in use.
# With karpenter >= v1.0 in use you have to set your region and vpcId, because it sets
# `spec.metadataOptions.httpPutResponseHopLimit=1` on your node classes by default.

Let me know if I'm doing something wrong here; however, enabling host networking on a pod does not sound ideal to me. I believe both options make sense.

So I'd suggest that users set region and vpcId, so that the load balancer controller does not have to ask the metadata service for them.

applike-ss commented 3 weeks ago

refs:

asaf400 commented 2 weeks ago

Just for completeness and cross-reference links, this is the PR that's affecting IMDS inheritance

asaf400 commented 2 weeks ago

@applike-ss Also note that you can restore Karpenter's previous behavior by adding spec.metadataOptions.httpPutResponseHopLimit to the EC2NodeClass, like so:

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: blue
spec:
  metadataOptions:
    httpPutResponseHopLimit: 2

https://karpenter.sh/docs/concepts/nodeclasses/#specmetadataoptions

applike-ss commented 2 weeks ago

Thank you, I am aware of that. I now wonder what the intended usage is. I would like to follow best practice here, which I assume to be a hop limit of 1 on the Karpenter EC2NodeClass AND not using host networking. If I understand correctly, I then have to specify region and vpcId in my case of using AWS with the Karpenter v1 default values. Am I wrong?

asaf400 commented 2 weeks ago

I think it depends on deployment method,

In my case I am using eks with IRSA for karpenter and vpc-cni, this means that the cluster 'automagically' 🧙🪄 injects some AWS environment variables into pods.

Although I couldn't find official docs for this behavior, here is an article (found on Google) about it: https://medium.com/@samuelbagattin/aws-iam-authentication-for-pods-in-eks-irsa-with-examples-5d8fa16aafba

Search for the keyword MutatingWebhookConfiguration. You can check whether your cluster has this via: kubectl get -o yaml MutatingWebhookConfiguration pod-identity-webhook

From the article:

When a pod is created in any namespace, for each container located in the namespace, the webhook creates the following environment variables in the manifest :

    AWS_STS_REGIONAL_ENDPOINTS set to regional by default, tells the SDK to use the current region endpoint to issue STS API calls
    AWS_DEFAULT_REGION and AWS_REGION set to the region in which the cluster is running
    AWS_ROLE_ARN set to the ARN of the IAM role you specified in the eks.amazonaws.com/role-arn service-account annotation
    AWS_WEB_IDENTITY_TOKEN_FILE contains the path where is stored the Kubernetes service account token. This token will be used to get temporary STS credentials (usually set to /var/run/secrets/eks.amazonaws.com/serviceaccount/token)
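To see which of these variables actually arrive in a given pod, a small check like the following can be run inside it. This is a sketch under assumptions: the variable list is taken from the article quoted above, and `missingEnvVars` is a hypothetical helper name, not something the webhook or the controller provides.

```go
package main

import (
	"fmt"
	"os"
)

// irsaEnvVars lists the variables that, according to the quoted article,
// the pod identity webhook injects into containers.
var irsaEnvVars = []string{
	"AWS_STS_REGIONAL_ENDPOINTS",
	"AWS_DEFAULT_REGION",
	"AWS_REGION",
	"AWS_ROLE_ARN",
	"AWS_WEB_IDENTITY_TOKEN_FILE",
}

// missingEnvVars (hypothetical helper) returns every name for which lookup
// reports no value or an empty value.
func missingEnvVars(names []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range names {
		if value, ok := lookup(name); !ok || value == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	missing := missingEnvVars(irsaEnvVars, os.LookupEnv)
	if len(missing) == 0 {
		fmt.Println("all expected variables are set")
		return
	}
	fmt.Println("not set:", missing)
}
```

If AWS_REGION shows up as not set, the region presumably has to be passed to the controller explicitly.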

I have one new cluster that required me to specify AWS_REGION manually, and another, older cluster that doesn't require it because its pods get AWS_REGION automatically.

Edit: found it

applike-ss commented 2 weeks ago

@asaf400 yes, AFAIK there are two methods of injecting AWS authentication variables into pods: the traditional IRSA method via the pod identity webhook, and the newer one via the pod identity agent (EKS add-on).

I am using the traditional method and thus would also need to specify the region; however, I also specified the vpcId.

Do you know if the vpcId is even needed? To be honest, I did not test it, but I assumed I would need to specify it so that the ALB controller does not have to query the IMDS to determine its vpcId when searching for subnets.

asaf400 commented 2 weeks ago

I'm not sure about vpcId in my clusters, but I know we use subnet_discovery tagging

applike-ss commented 1 week ago

I just had a quick look, and the metadata service is in fact queried for the vpcId when it is not provided explicitly and cannot be inferred from a tag on the VPC.

main.go calls aws.NewCloud, which then calls getVpcID.

This function first tries to infer it from the name tag (or a custom one, if given) of the VPC.

If that fails, it is inferred via the instance metadata.
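The lookup order described above can be sketched roughly as follows. This is an illustration only, not the controller's actual code: `resolveVPCID`, `vpcIDSource` and `unavailable` are hypothetical names, and the real getVpcID may differ in signature and details.

```go
package main

import (
	"errors"
	"fmt"
)

// vpcIDSource is a hypothetical lookup step. It returns an error when it
// cannot determine the VPC ID.
type vpcIDSource func() (string, error)

// unavailable simulates a source that cannot answer, for example the
// metadata service when it is unreachable from the pod.
func unavailable() (string, error) {
	return "", errors.New("source unavailable")
}

// resolveVPCID follows the order described in the thread: an explicitly
// configured vpcId wins, then the tag-based lookup, then the metadata service.
func resolveVPCID(explicit string, fromTags, fromMetadata vpcIDSource) (string, error) {
	if explicit != "" {
		return explicit, nil
	}
	if id, err := fromTags(); err == nil && id != "" {
		return id, nil
	}
	id, err := fromMetadata()
	if err != nil {
		return "", fmt.Errorf("could not determine VPC ID: %w", err)
	}
	return id, nil
}

func main() {
	// With an explicit vpcId, neither fallback is consulted.
	// The ID below is a placeholder, not a real VPC.
	id, err := resolveVPCID("vpc-0123456789abcdef0", unavailable, unavailable)
	fmt.Println(id, err)
}
```

Under this reading, setting vpcId explicitly means the metadata fallback is never reached, which matches the suggestion earlier in the thread.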