aws.reservedENIs doesn't actually allocate any IPs. It gives you a way to tell Karpenter that you have configured the CNI you are using to assign IPs to the node from a different place than normal, or to reserve an ENI for some other use. If you're using the VPC CNI, then you'd use this config: https://github.com/aws/amazon-vpc-cni-k8s#aws_vpc_k8s_cni_custom_network_cfg.
Do you have a CNI that is configured to assign IPs from appropriately sized subnets?
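For reference, a minimal sketch of how that setting is typically applied, assuming the Karpenter Helm chart and the v0.28 global settings layout (the exact values key, settings.aws.reservedENIs, is an assumption; check the chart values for your version):

# Sketch: set aws.reservedENIs=1 via the Helm chart's settings
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter \
  --reuse-values \
  --set settings.aws.reservedENIs=1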
Hi @bwagner5,
Can someone please have a look at the information below and point me in the right direction?
All pods scheduled on the new node launched by Karpenter are running as expected.
The pods scheduled on the initial nodes launched when the cluster was created have reached the limit of 29 for the instance type according to eni-max-pods.txt
The instance type used for the initial nodes is m7g.large
Running the max-pods-calculator.sh script shows without prefix delegation enabled the maximum limit for pods on that instance type is 29.
./max-pods-calculator.sh --instance-type m7g.large --cni-version 1.12.5-eksbuild.2
29
This matches up with the number of pods entering a 'Running' state.
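For reference, the 29 comes from the standard ENI-based calculation, max pods = ENIs * (IPv4 addresses per ENI - 1) + 2; for m7g.large that is 3 ENIs with 10 IPv4 addresses each (per the EC2 ENI limits, which I am assuming here):

# max pods = ENIs * (IPv4 addresses per ENI - 1) + 2
# m7g.large: 3 ENIs, 10 IPv4 addresses per ENI
echo $(( 3 * (10 - 1) + 2 ))    # prints 29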
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-1-6-216.eu-west-1.compute.internal Ready <none> 3h33m v1.26.4-eks-597964d
ip-10-1-8-251.eu-west-1.compute.internal Ready <none> 3h33m v1.26.4-eks-597964d
ip-10-1-8-37.eu-west-1.compute.internal Ready <none> 8m7s v1.26.4-eks-597964d
kubectl get pods -A -owide --field-selector spec.nodeName=ip-10-1-6-216.eu-west-1.compute.internal | wc -l
111
kubectl get pods -A -owide --field-selector spec.nodeName=ip-10-1-6-216.eu-west-1.compute.internal | grep Running | wc -l
29
kubectl get pods -A -owide --field-selector spec.nodeName=ip-10-1-8-251.eu-west-1.compute.internal | grep Running | wc -l
29
Running the same script shows with prefix delegation enabled the maximum limit for pods on that instance type is 110.
./max-pods-calculator.sh --instance-type m7g.large --cni-version 1.12.5-eksbuild.2 --cni-prefix-delegation-enabled
110
This matches the 'Allocatable' section when describing each of the initial nodes: 'pods: 110'.
Scaling the nodes to 0 and launching new nodes exhibits the same issue: when a node of that instance type reaches 29 allocated IPv4 addresses, all new pods get stuck in the 'ContainerCreating' status.
As shown below, 'ENABLE_PREFIX_DELEGATION' is enabled and 'WARM_PREFIX_TARGET' is set for the VPC CNI.
Shouldn't this configuration allow all nodes to use prefix delegation and successfully schedule pods beyond the usual limit?
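To double-check whether prefix delegation is actually taking effect on a given node, I believe the attached ENIs can be inspected for assigned /28 prefixes (the instance ID below is a placeholder):

aws ec2 describe-network-interfaces \
  --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
  --query 'NetworkInterfaces[].Ipv4Prefixes[].Ipv4Prefix'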
I deployed a new VPC and EKS cluster, there's nothing else running in either.
VPC configuration:

| VPC CIDR | 10.1.0.0/19 |
|---|---|
| AWS Region | eu-west-1 |
| Addressable Hosts | 8178 |
| Spare Capacity | 2046 |

| Subnet Type | Subnet address | Range of addresses | Useable IPs | Hosts |
|---|---|---|---|---|
| Public | 10.1.0.0/22 | 10.1.0.0 - 10.1.3.255 | 10.1.0.1 - 10.1.3.254 | 1022 |
| Public | 10.1.4.0/22 | 10.1.4.0 - 10.1.7.255 | 10.1.4.1 - 10.1.7.254 | 1022 |
| Public | 10.1.8.0/22 | 10.1.8.0 - 10.1.11.255 | 10.1.8.1 - 10.1.11.254 | 1022 |
| Private | 10.1.12.0/22 | 10.1.12.0 - 10.1.15.255 | 10.1.12.1 - 10.1.15.254 | 1022 |
| Private | 10.1.16.0/22 | 10.1.16.0 - 10.1.19.255 | 10.1.16.1 - 10.1.19.254 | 1022 |
| Private | 10.1.20.0/22 | 10.1.20.0 - 10.1.23.255 | 10.1.20.1 - 10.1.23.254 | 1022 |
| Spare | 10.1.24.0/21 | 10.1.24.0 - 10.1.31.255 | 10.1.24.1 - 10.1.31.254 | 2046 |
Kubernetes nodes:

| Name | Status | Version | IP |
|---|---|---|---|
| ip-10-1-6-216.eu-west-1.compute.internal | Ready | v1.26.4-eks-597964d | 10.1.6.216 |
| ip-10-1-8-251.eu-west-1.compute.internal | Ready | v1.26.4-eks-597964d | 10.1.8.251 |
Subnet 10.1.0.0/22 with node 'ip-10-1-6-216.eu-west-1.compute.internal' has 998 available IPv4 addresses. Subnet 10.1.8.0/22 with node 'ip-10-1-8-251.eu-west-1.compute.internal' has 999 available IPv4 addresses.
I'm using the Amazon VPC CNI plugin for Kubernetes Amazon EKS add-on.
I have added these configuration values:
{
  "env": {
    "ENABLE_PREFIX_DELEGATION": "true",
    "WARM_PREFIX_TARGET": "1"
  }
}
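The values were applied to the managed add-on roughly like this (the cluster name is a placeholder):

aws eks update-addon --cluster-name my-cluster --addon-name vpc-cni \
  --configuration-values '{"env":{"ENABLE_PREFIX_DELEGATION":"true","WARM_PREFIX_TARGET":"1"}}'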
I can see the configuration values are applied:
kubectl describe node | grep Allocatable -A 10 | grep pods
pods: 110
pods: 110
kubectl -n kube-system describe pod aws-node-77sh5 | grep -i prefix
ENABLE_PREFIX_DELEGATION: true
WARM_PREFIX_TARGET: 1
kubectl -n kube-system describe pod aws-node-sv8qb | grep -i prefix
ENABLE_PREFIX_DELEGATION: true
WARM_PREFIX_TARGET: 1
I have one pod running in a test namespace:
kubectl get pods
NAME READY STATUS RESTARTS AGE
ahoy-hello-world-79f545d68-zp6gm 1/1 Running 0 18m
AWSNodeTemplate:
apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: default
spec:
  subnetSelector:
    karpenter.sh/discovery: ${cluster_name}
  securityGroupSelector:
    karpenter.sh/discovery: ${cluster_name}
  tags:
    karpenter.sh/discovery: ${cluster_name}
  amiFamily: Bottlerocket
Provisioner:
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot"]
  limits:
    resources:
      cpu: 1000
  providerRef:
    name: default
  ttlSecondsAfterEmpty: 30
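As a sanity check, the reservedENIs value can be confirmed in Karpenter's global settings (assuming the default karpenter-global-settings ConfigMap in the karpenter namespace):

kubectl -n karpenter get configmap karpenter-global-settings -o yaml | grep -i reservedenis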
Karpenter logs are uneventful.
I scaled up a deployment to 250 pods. Karpenter adds a new node successfully and pods get scheduled on it.
New Kubernetes node:

| Name | Status | Version | IP |
|---|---|---|---|
| ip-10-1-8-85.eu-west-1.compute.internal | Ready | v1.26.4-eks-597964d | 10.1.11.1 |
The new node was deployed in subnet 10.1.8.0/22 which now has 924 IPv4 addresses available, a change of -75.
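For reference, the available-address figures come from the subnet's AvailableIpAddressCount (the subnet ID below is a placeholder):

aws ec2 describe-subnets --subnet-ids subnet-0123456789abcdef0 \
  --query 'Subnets[].AvailableIpAddressCount'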
162 pods are stuck in the 'ContainerCreating' state.
kubectl get pods | grep ContainerCreating | wc -l
162
88 pods are in the 'Running' state.
kubectl get pods | grep Running | wc -l
88
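To see where the stuck pods landed, one quick check (assuming the default kubectl get pods -owide column layout, where the node name is column 7) is to group them by node:

kubectl get pods -owide | grep ContainerCreating | awk '{print $7}' | sort | uniq -c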
Version
Karpenter Version: 0.28.0-rc.2
Kubernetes Version: v1.26
Expected Behavior
Raised under this now closed issue: Karpenter is not aware of the Custom Networking VPC CNI pod limit per node
@bwagner5 https://github.com/aws/karpenter/issues/2273#issuecomment-1551853677
Running 0.28.0-rc.2 with the option aws.reservedENIs set to 1 as described, pods should be assigned an IP address.
Actual Behavior
The pod fails to get assigned an IP address and does not start.
Steps to Reproduce the Problem
Scale a deployment beyond what current nodes can handle to ensure a new node is provisioned by Karpenter.