kubernetes-sigs / karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
Apache License 2.0

karpenter does not add `karpenter.sh/capacity-type` label to the nodes #660

Closed tmoreadobe closed 9 months ago

tmoreadobe commented 10 months ago

Description

Observed Behavior: Karpenter does not add the karpenter.sh/capacity-type label to the nodes it spins up, even though both the Machine and the Provisioner have it.

Expected Behavior: Karpenter should add the label to the node so that consolidation can work.

Reproduction Steps (Please include YAML):

Provisioner spec

Name:         core-karpenter-ethos-core-karpenter-worker2
Namespace:    
Labels:       adobe.com/appname=karpenter-ethos101-dev-va6
              app.kubernetes.io/instance=core-karpenter
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ethos-core-karpenter
              app.kubernetes.io/version=v0.31.0
              ethos.adobe.net/managed-by=k8s-infrastructure.helm
              helm.sh/chart=ethos-core-karpenter-0.0.5
Annotations:  karpenter.sh/provisioner-hash: 1417663675986826836
API Version:  karpenter.sh/v1alpha5
Kind:         Provisioner
Metadata:
  Creation Timestamp:  2023-10-28T00:44:25Z
  Generation:          2
  Resource Version:    2306126405
  UID:                 6a302235-6c20-4a36-aa03-1c31caa7c7f3
Spec:
  Consolidation:
    Enabled:  true
  Labels:
    ethos.adobe.net/node-templateVersion:     a8a12c5ee31a4668f93b01ccf92ebf7c54679f693bd3a14b4bd86086432388
    kubernetes.io/os:                         linux
    node.kubernetes.io/ethos-workload.amd64:  true
    node.kubernetes.io/node-group:            worker2-group
    node.kubernetes.io/role:                  worker
  Limits:
    Resources:
      Cpu:     1k
      Memory:  1000Gi
  Provider Ref:
    Name:  core-karpenter-ethos-core-karpenter-worker2
  Requirements:
    Key:       karpenter.k8s.aws/instance-family
    Operator:  In
    Values:
      c5
      c6g
      m5
      m6g
      r5
      r6g
    Key:       karpenter.k8s.aws/instance-generation
    Operator:  Gt
    Values:
      4
    Key:       kubernetes.io/arch
    Operator:  In
    Values:
      amd64
    Key:       karpenter.sh/capacity-type
    Operator:  In
    Values:
      on-demand
    Key:       topology.kubernetes.io/zone
    Operator:  In
    Values:
      us-east-1a
      us-east-1b
      us-east-1c
    Key:       kubernetes.io/os
    Operator:  In
    Values:
      linux
Status:
  Resources:
    Attachable - Volumes - Aws - Ebs:  25
    Cpu:                               16
    Ephemeral - Storage:               194314260Ki
    Memory:                            32054796Ki
    Pods:                              112
Events:                                <none>

Machine spec

Name:         core-karpenter-ethos-core-karpenter-worker2-zjxnb
Namespace:    
Labels:       ethos.adobe.net/node-templateVersion=a8a12c5ee31a4668f93b01ccf92ebf7c54679f693bd3a14b4bd86086432388
              karpenter.k8s.aws/instance-category=c
              karpenter.k8s.aws/instance-cpu=72
              karpenter.k8s.aws/instance-encryption-in-transit-supported=false
              karpenter.k8s.aws/instance-family=c5
              karpenter.k8s.aws/instance-generation=5
              karpenter.k8s.aws/instance-hypervisor=nitro
              karpenter.k8s.aws/instance-memory=147456
              karpenter.k8s.aws/instance-network-bandwidth=25000
              karpenter.k8s.aws/instance-pods=737
              karpenter.k8s.aws/instance-size=18xlarge
              karpenter.sh/capacity-type=on-demand
              karpenter.sh/provisioner-name=core-karpenter-ethos-core-karpenter-worker2
              kubernetes.io/arch=amd64
              kubernetes.io/os=linux
              node.kubernetes.io/ethos-workload.amd64=true
              node.kubernetes.io/instance-type=c5.18xlarge
              node.kubernetes.io/node-group=worker2-group
              node.kubernetes.io/role=worker
              topology.kubernetes.io/region=us-east-1
              topology.kubernetes.io/zone=us-east-1a
Annotations:  karpenter.k8s.aws/nodetemplate-hash: 7594172995892393789
              karpenter.sh/managed-by: ethos101-dev-va6-k8s-eks-master
              karpenter.sh/provisioner-hash: 1417663675986826836
API Version:  karpenter.sh/v1alpha5
Kind:         Machine
Metadata:
  Creation Timestamp:  2023-10-30T19:02:52Z
  Finalizers:
    karpenter.sh/termination
  Generate Name:  core-karpenter-ethos-core-karpenter-worker2-
  Generation:     1
  Owner References:
    API Version:           karpenter.sh/v1alpha5
    Block Owner Deletion:  true
    Kind:                  Provisioner
    Name:                  core-karpenter-ethos-core-karpenter-worker2
    UID:                   6a302235-6c20-4a36-aa03-1c31caa7c7f3
  Resource Version:        2306109951
  UID:                     6b20ee72-270f-428e-98e4-dad254147d58
Spec:
  Machine Template Ref:
    Name:  core-karpenter-ethos-core-karpenter-worker2
  Requirements:
    Key:       karpenter.sh/provisioner-name
    Operator:  In
    Values:
      core-karpenter-ethos-core-karpenter-worker2
    Key:       ethos.adobe.net/node-templateVersion
    Operator:  In
    Values:
      a8a12c5ee31a4668f93b01ccf92ebf7c54679f693bd3a14b4bd86086432388
    Key:       karpenter.sh/capacity-type
    Operator:  In
    Values:
      on-demand
    Key:       topology.kubernetes.io/zone
    Operator:  In
    Values:
      us-east-1a
      us-east-1b
    Key:       node.kubernetes.io/ethos-workload.amd64
    Operator:  In
    Values:
      true
    Key:       node.kubernetes.io/role
    Operator:  In
    Values:
      worker
    Key:       karpenter.k8s.aws/instance-generation
    Operator:  Gt
    Values:
      4
    Key:       kubernetes.io/arch
    Operator:  In
    Values:
      amd64
    Key:       kubernetes.io/os
    Operator:  In
    Values:
      linux
    Key:       node.kubernetes.io/instance-type
    Operator:  In
    Values:
      c5.12xlarge
      c5.18xlarge
      c5.24xlarge
      c5.4xlarge
      c5.9xlarge
      c5.metal
      m5.12xlarge
      m5.16xlarge
      m5.24xlarge
      m5.4xlarge
      m5.8xlarge
      m5.metal
      r5.12xlarge
      r5.16xlarge
      r5.24xlarge
      r5.4xlarge
      r5.8xlarge
      r5.metal
    Key:       node.kubernetes.io/node-group
    Operator:  In
    Values:
      worker2-group
    Key:       karpenter.k8s.aws/instance-family
    Operator:  In
    Values:
      c5
      c6g
      m5
      m6g
      r5
      r6g
  Resources:
    Requests:
      Cpu:     10237m
      Memory:  23845317289
      Pods:    19
Status:
  Allocatable:
    Cpu:                  71750m
    Ephemeral - Storage:  179Gi
    Memory:               127934Mi
    Pods:                 737
  Capacity:
    Cpu:                  72
    Ephemeral - Storage:  200Gi
    Memory:               136396Mi
    Pods:                 737
  Conditions:
    Last Transition Time:  2023-10-30T19:04:45Z
    Status:                True
    Type:                  MachineInitialized
    Last Transition Time:  2023-10-30T19:02:55Z
    Status:                True
    Type:                  MachineLaunched
    Last Transition Time:  2023-10-30T19:04:45Z
    Status:                True
    Type:                  Ready
    Last Transition Time:  2023-10-30T19:04:13Z
    Status:                True
    Type:                  MachineRegistered
  Node Name:               ip-10-95-64-98.ec2.internal
  Provider ID:             aws:///us-east-1a/i-0fc6c888b14b25ac9
Events:
  Type    Reason                 Age                  From       Message
  ----    ------                 ----                 ----       -------
  Normal  DeprovisioningBlocked  30s (x73 over 152m)  karpenter  Cannot deprovision Machine: Required label "karpenter.sh/capacity-type" doesn't exist

The node shows the same error as the machine.

Versions:

jonathan-innis commented 10 months ago

Can you share the spec and status of a Node that isn't getting the labels applied? It would be good to get a Provisioner, Machine, and Node that all match up with each other, so we can see exactly what's going on here.

tmoreadobe commented 10 months ago

This is again with v0.30.0.

Node spec

root@44ceee382e95:/infrastructure# kubectl describe node ip-10-10-12-85.ec2.internal
Name:               ip-10-10-12-85.ec2.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=arm64
                    beta.kubernetes.io/instance-type=c6g.16xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1b
                    k8s.io/cloud-provider-aws=6d2511cd5c3086d8e96592c5e706198a
                    karpenter.sh/initialized=true
                    karpenter.sh/registered=true
                    kubernetes.io/arch=arm64
                    kubernetes.io/hostname=ip-10-10-12-85.ec2.internal
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/container-runtime=cri-o
                    node.kubernetes.io/ethos-workload.arm64=true
                    node.kubernetes.io/instance-lifecycle=normal
                    node.kubernetes.io/instance-type=c6g.16xlarge
                    node.kubernetes.io/role=worker
                    topology.ebs.csi.aws.com/zone=us-east-1b
                    topology.kubernetes.io/region=us-east-1
                    topology.kubernetes.io/zone=us-east-1b
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.10.12.85
                    csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0173872575c667969"}
                    io.cilium.network.ipv4-cilium-host: xxxxx
                    io.cilium.network.ipv4-pod-cidr: xxxxxx/24
                    karpenter.k8s.aws/nodetemplate-hash: 13719525917006148574
                    karpenter.sh/managed-by: karpenter-tmore
                    karpenter.sh/provisioner-hash: 6732253779156292384
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Tue, 31 Oct 2023 17:29:01 +0000
Taints:             ethos.corp.adobe.com/ethos-workload=arm64:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-10-12-85.ec2.internal
  AcquireTime:     <unset>
  RenewTime:       Tue, 31 Oct 2023 18:06:43 +0000
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  KernelDeadlock       False   Tue, 31 Oct 2023 18:04:49 +0000   Tue, 31 Oct 2023 17:29:44 +0000   KernelHasNoDeadlock          kernel has no deadlock
  ReadonlyFilesystem   False   Tue, 31 Oct 2023 18:04:49 +0000   Tue, 31 Oct 2023 17:29:44 +0000   FilesystemIsNotReadOnly      Filesystem is not read-only
  TaintedNode          False   Tue, 31 Oct 2023 18:04:49 +0000   Tue, 31 Oct 2023 17:29:44 +0000   NoActiveCustomConditions     No active custom node conditions
  CiliumUnavailable    False   Tue, 31 Oct 2023 18:04:49 +0000   Tue, 31 Oct 2023 17:29:44 +0000   CiliumIsUp                   Cilium pod is up
  NetworkUnavailable   False   Tue, 31 Oct 2023 17:29:35 +0000   Tue, 31 Oct 2023 17:29:35 +0000   CiliumIsUp                   Cilium is running on this node
  MemoryPressure       False   Tue, 31 Oct 2023 18:06:34 +0000   Tue, 31 Oct 2023 17:28:57 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Tue, 31 Oct 2023 18:06:34 +0000   Tue, 31 Oct 2023 17:28:57 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Tue, 31 Oct 2023 18:06:34 +0000   Tue, 31 Oct 2023 17:28:57 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Tue, 31 Oct 2023 18:06:34 +0000   Tue, 31 Oct 2023 17:29:26 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.10.12.85
  Hostname:     ip-10-10-12-85.ec2.internal
  InternalDNS:  ip-10-10-12-85.ec2.internal
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         64
  ephemeral-storage:           194314260Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  hugepages-32Mi:              0
  hugepages-64Ki:              0
  memory:                      129565140Ki
  pods:                        112
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         62770m
  ephemeral-storage:           176798320344
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  hugepages-32Mi:              0
  hugepages-64Ki:              0
  memory:                      126481876Ki
  pods:                        112
System Info:
  Machine ID:                 ec23882950b6a371d98a17b5ec3d08d0
  System UUID:                ec238829-50b6-a371-d98a-17b5ec3d08d0
  Boot ID:                    7aca4e1b-810a-4086-8336-f9c9381a8747
  Kernel Version:             5.15.119-flatcar
  OS Image:                   Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)
  Operating System:           linux
  Architecture:               arm64
  Container Runtime Version:  cri-o://1.26.4
  Kubelet Version:            v1.26.9
  Kube-Proxy Version:         v1.26.9
ProviderID:                   aws:///us-east-1b/i-0173872575c667969
Non-terminated Pods:          (18 in total)
  Namespace                   Name                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits    Age
  ---------                   ----                                  ------------  ----------  ---------------  -------------    ---
  kube-system                 aws-node-termination-handler-xkrnf    24m (0%)      72m (0%)    262144k (0%)     750Mi (0%)       37m
  kube-system                 cilium-dnsproxy-qj99s                 100m (0%)     450m (0%)   192Mi (0%)       1280Mi (1%)      37m
  kube-system                 cilium-frg5b                          550m (0%)     1200m (1%)  576Mi (0%)       4352Mi (3%)      37m
  kube-system                 cilium-node-init-vt4xf                24m (0%)      48m (0%)    262144k (0%)     500Mi (0%)       37m
  kube-system                 dnsmasq-8rrjl                         150m (0%)     2200m (3%)  74Mi (0%)        756Mi (0%)       37m
  kube-system                 dnsmasq-ha-w8tnf                      150m (0%)     2200m (3%)  74Mi (0%)        756Mi (0%)       37m
  kube-system                 ebs-csi-node-lqh9h                    44m (0%)      374m (0%)   262144k (0%)     1450Mi (1%)      37m
  kube-system                 fluent-bit-journald-48csf             25m (0%)      50m (0%)    262144k (0%)     500Mi (0%)       37m
  kube-system                 kube-proxy-psb9f                      200m (0%)     700m (1%)   131108864 (0%)   2268435456 (1%)  37m
  kube-system                 kube2iam-n6n2l                        350m (0%)     500m (0%)   576Mi (0%)       768Mi (0%)       37m
  kube-system                 node-problem-detector-2pknq           47m (0%)      118m (0%)   262144k (0%)     750Mi (0%)       37m
  monitoring                  cert-exporter-8spjh                   42m (0%)      108m (0%)   262144k (0%)     750Mi (0%)       37m
  monitoring                  node-exporter-xx8wp                   45m (0%)      136m (0%)   279620266 (0%)   1208518791 (0%)  37m
  monitoring                  prometheus-adapter-b68fc4cb4-hzdcv    24m (0%)      48m (0%)    262144k (0%)     500Mi (0%)       40m
  monitoring                  tee-caddy-5b9df6fdfc-dk5l5            512m (0%)     1048m (1%)  262144k (0%)     750Mi (0%)       11m
  monitoring                  thanos-query-7887d56477-lq7v5         283m (0%)     48m (0%)    314118954 (0%)   500Mi (0%)       3m14s
  monitoring                  thanos-ruler-1                        422m (0%)     721m (1%)   585534061 (0%)   1258449455 (0%)  21m
  tracing                     otel-agent-vdrhb                      24m (0%)      48m (0%)    262144k (0%)     756Mi (0%)       37m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests         Limits
  --------                    --------         ------
  cpu                         3016m (4%)       10069m (16%)
  memory                      5234153537 (4%)  20587775670 (15%)
  ephemeral-storage           0 (0%)           0 (0%)
  hugepages-1Gi               0 (0%)           0 (0%)
  hugepages-2Mi               0 (0%)           0 (0%)
  hugepages-32Mi              0 (0%)           0 (0%)
  hugepages-64Ki              0 (0%)           0 (0%)
  attachable-volumes-aws-ebs  0                0
Events:
  Type    Reason                   Age                  From             Message
  ----    ------                   ----                 ----             -------
  Normal  Starting                 37m                  kube-proxy       
  Normal  Starting                 37m                  kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  37m (x2 over 37m)    kubelet          Node ip-10-10-12-85.ec2.internal status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    37m (x2 over 37m)    kubelet          Node ip-10-10-12-85.ec2.internal status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     37m (x2 over 37m)    kubelet          Node ip-10-10-12-85.ec2.internal status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  37m                  kubelet          Updated Node Allocatable limit across pods
  Normal  RegisteredNode           37m                  node-controller  Node ip-10-10-12-85.ec2.internal event: Registered Node ip-10-10-12-85.ec2.internal in Controller
  Normal  NodeReady                37m                  kubelet          Node ip-10-10-12-85.ec2.internal status is now: NodeReady
  Normal  DeprovisioningBlocked    115s (x15 over 37m)  karpenter        Cannot deprovision Node: Required label "karpenter.sh/capacity-type" doesn't exist
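
For anyone checking several nodes, the label check can be scripted against `kubectl describe` text. This is only a rough sketch: `parse_describe_labels` is a hypothetical helper, not part of kubectl or Karpenter, and the sample is an abbreviated copy of the output above.

```python
def parse_describe_labels(text: str) -> dict[str, str]:
    """Collect key=value pairs from the Labels: section of `kubectl describe` output."""
    labels: dict[str, str] = {}
    in_labels = False
    for line in text.splitlines():
        if line.startswith("Labels:"):
            # First label shares the line with the "Labels:" heading.
            in_labels = True
            entry = line[len("Labels:"):].strip()
        elif in_labels and line[:1].isspace():
            # Continuation lines of the section are indented.
            entry = line.strip()
        else:
            if in_labels:
                break  # next top-level field (e.g. Annotations:) ends the section
            continue
        if "=" in entry:
            key, _, value = entry.partition("=")
            labels[key] = value
    return labels


# Abbreviated copy of the node output above.
SAMPLE = """\
Name:               ip-10-10-12-85.ec2.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=arm64
                    karpenter.sh/initialized=true
                    karpenter.sh/registered=true
                    node-role.kubernetes.io/worker=
                    topology.kubernetes.io/zone=us-east-1b
Annotations:        karpenter.sh/managed-by: karpenter-tmore
"""

node_labels = parse_describe_labels(SAMPLE)
has_capacity_type = "karpenter.sh/capacity-type" in node_labels
print(has_capacity_type)
```

The node carries karpenter.sh/initialized and karpenter.sh/registered but not karpenter.sh/capacity-type, which matches the DeprovisioningBlocked event.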

Logs from the Karpenter pod showing the machine and provisioner names for this node

root@44ceee382e95:/infrastructure# kubectl logs -n karpenter core-karpenter-7c69dd99cc-v95c8 | grep ip-10-10-12-85.ec2.internal
2023-10-31T17:29:01.605Z    DEBUG   controller.machine.lifecycle    registered machine  {"commit": "637a642", "machine": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-vxphm", "provisioner": "core-karpenter-ethos-core-karpenter-armc6g4xlarge", "provider-id": "aws:///us-east-1b/i-0173872575c667969", "node": "ip-10-10-12-85.ec2.internal"}
2023-10-31T17:29:27.879Z    DEBUG   controller.machine.lifecycle    initialized machine {"commit": "637a642", "machine": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-vxphm", "provisioner": "core-karpenter-ethos-core-karpenter-armc6g4xlarge", "provider-id": "aws:///us-east-1b/i-0173872575c667969", "node": "ip-10-10-12-85.ec2.internal"}

Machine spec

root@44ceee382e95:/infrastructure# kubectl describe machine core-karpenter-ethos-core-karpenter-armc6g4xlarge-vxphm
Name:         core-karpenter-ethos-core-karpenter-armc6g4xlarge-vxphm
Namespace:    
Labels:       ethos.adobe.net/node-templateVersion=f52cb0834d690e1f0d77747203cf126c53b8fd82a82afb02ea61915f937853
              karpenter.k8s.aws/instance-category=c
              karpenter.k8s.aws/instance-cpu=64
              karpenter.k8s.aws/instance-encryption-in-transit-supported=false
              karpenter.k8s.aws/instance-family=c6g
              karpenter.k8s.aws/instance-generation=6
              karpenter.k8s.aws/instance-hypervisor=nitro
              karpenter.k8s.aws/instance-memory=131072
              karpenter.k8s.aws/instance-network-bandwidth=25000
              karpenter.k8s.aws/instance-pods=737
              karpenter.k8s.aws/instance-size=16xlarge
              karpenter.sh/capacity-type=on-demand
              karpenter.sh/provisioner-name=core-karpenter-ethos-core-karpenter-armc6g4xlarge
              kubernetes.io/arch=arm64
              kubernetes.io/os=linux
              node.kubernetes.io/ethos-workload.arm64=true
              node.kubernetes.io/instance-type=c6g.16xlarge
              node.kubernetes.io/role=worker
              topology.kubernetes.io/region=us-east-1
              topology.kubernetes.io/zone=us-east-1b
Annotations:  karpenter.k8s.aws/nodetemplate-hash: 13719525917006148574
              karpenter.sh/managed-by: karpenter-tmore
              karpenter.sh/provisioner-hash: 6732253779156292384
API Version:  karpenter.sh/v1alpha5
Kind:         Machine
Metadata:
  Creation Timestamp:  2023-10-31T17:27:43Z
  Finalizers:
    karpenter.sh/termination
  Generate Name:  core-karpenter-ethos-core-karpenter-armc6g4xlarge-
  Generation:     1
  Managed Fields:
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:karpenter.k8s.aws/nodetemplate-hash:
          f:karpenter.sh/managed-by:
          f:karpenter.sh/provisioner-hash:
        f:finalizers:
          .:
          v:"karpenter.sh/termination":
        f:generateName:
        f:labels:
          .:
          f:ethos.adobe.net/node-templateVersion:
          f:karpenter.k8s.aws/instance-category:
          f:karpenter.k8s.aws/instance-cpu:
          f:karpenter.k8s.aws/instance-encryption-in-transit-supported:
          f:karpenter.k8s.aws/instance-family:
          f:karpenter.k8s.aws/instance-generation:
          f:karpenter.k8s.aws/instance-hypervisor:
          f:karpenter.k8s.aws/instance-memory:
          f:karpenter.k8s.aws/instance-network-bandwidth:
          f:karpenter.k8s.aws/instance-pods:
          f:karpenter.k8s.aws/instance-size:
          f:karpenter.sh/capacity-type:
          f:karpenter.sh/provisioner-name:
          f:kubernetes.io/arch:
          f:kubernetes.io/os:
          f:node.kubernetes.io/ethos-workload.arm64:
          f:node.kubernetes.io/instance-type:
          f:node.kubernetes.io/role:
          f:topology.kubernetes.io/region:
          f:topology.kubernetes.io/zone:
        f:ownerReferences:
          .:
          k:{"uid":"5fbda74f-9d35-449b-aa9d-3bd137b4dfd6"}:
      f:spec:
        .:
        f:machineTemplateRef:
          .:
          f:name:
        f:requirements:
        f:resources:
          .:
          f:requests:
            .:
            f:cpu:
            f:memory:
            f:pods:
        f:taints:
    Manager:      karpenter
    Operation:    Update
    Time:         2023-10-31T17:27:47Z
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:allocatable:
          .:
          f:cpu:
          f:ephemeral-storage:
          f:memory:
          f:pods:
        f:capacity:
          .:
          f:cpu:
          f:ephemeral-storage:
          f:memory:
          f:pods:
        f:conditions:
        f:nodeName:
        f:providerID:
    Manager:      karpenter
    Operation:    Update
    Subresource:  status
    Time:         2023-10-31T17:29:27Z
  Owner References:
    API Version:           karpenter.sh/v1alpha5
    Block Owner Deletion:  true
    Kind:                  Provisioner
    Name:                  core-karpenter-ethos-core-karpenter-armc6g4xlarge
    UID:                   5fbda74f-9d35-449b-aa9d-3bd137b4dfd6
  Resource Version:        551073
  UID:                     ddaffdaf-8896-4ee6-b4e3-2aabab177308
Spec:
  Machine Template Ref:
    Name:  core-karpenter-ethos-core-karpenter-armc6g4xlarge
  Requirements:
    Key:       node.kubernetes.io/ethos-workload.arm64
    Operator:  In
    Values:
      true
    Key:       karpenter.sh/provisioner-name
    Operator:  In
    Values:
      core-karpenter-ethos-core-karpenter-armc6g4xlarge
    Key:       karpenter.sh/capacity-type
    Operator:  In
    Values:
      on-demand
    Key:       kubernetes.io/os
    Operator:  In
    Values:
      linux
    Key:       node.kubernetes.io/role
    Operator:  In
    Values:
      worker
    Key:       ethos.adobe.net/node-templateVersion
    Operator:  In
    Values:
      f52cb0834d690e1f0d77747203cf126c53b8fd82a82afb02ea61915f937853
    Key:       karpenter.k8s.aws/instance-generation
    Operator:  Gt
    Values:
      4
    Key:       kubernetes.io/arch
    Operator:  In
    Values:
      arm64
    Key:       node.kubernetes.io/instance-type
    Operator:  In
    Values:
      c6g.12xlarge
      c6g.16xlarge
      c6g.2xlarge
      c6g.4xlarge
      c6g.8xlarge
      c6g.metal
      c6g.xlarge
      m6g.12xlarge
      m6g.16xlarge
      m6g.2xlarge
      m6g.4xlarge
      m6g.8xlarge
      m6g.large
      m6g.metal
      m6g.xlarge
      r6g.12xlarge
      r6g.16xlarge
      r6g.2xlarge
      r6g.4xlarge
      r6g.8xlarge
      r6g.large
      r6g.metal
      r6g.xlarge
    Key:       karpenter.k8s.aws/instance-family
    Operator:  In
    Values:
      c5
      c6g
      m5
      m6g
      r5
      r6g
    Key:       topology.kubernetes.io/zone
    Operator:  In
    Values:
      us-east-1b
      us-east-1c
  Resources:
    Requests:
      Cpu:     1799m
      Memory:  4072356522
      Pods:    15
  Taints:
    Effect:  NoSchedule
    Key:     ethos.corp.adobe.com/ethos-workload
    Value:   arm64
Status:
  Allocatable:
    Cpu:                  63770m
    Ephemeral - Storage:  179Gi
    Memory:               112720Mi
    Pods:                 737
  Capacity:
    Cpu:                  64
    Ephemeral - Storage:  200Gi
    Memory:               121182Mi
    Pods:                 737
  Conditions:
    Last Transition Time:  2023-10-31T17:29:27Z
    Status:                True
    Type:                  MachineInitialized
    Last Transition Time:  2023-10-31T17:27:47Z
    Status:                True
    Type:                  MachineLaunched
    Last Transition Time:  2023-10-31T17:29:01Z
    Status:                True
    Type:                  MachineRegistered
    Last Transition Time:  2023-10-31T17:29:27Z
    Status:                True
    Type:                  Ready
  Node Name:               ip-10-10-12-85.ec2.internal
  Provider ID:             aws:///us-east-1b/i-0173872575c667969
Events:
  Type     Reason                  Age                 From       Message
  ----     ------                  ----                ----       -------
  Warning  FailedConsistencyCheck  2m8s (x4 over 32m)  karpenter  expected 737 of resource pods, but found 112 (15.2% of expected)
  Normal   DeprovisioningBlocked   57s (x17 over 40m)  karpenter  Cannot deprovision Machine: Required label "karpenter.sh/capacity-type" doesn't exist
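
Putting the Machine labels next to the Node labels for ip-10-10-12-85.ec2.internal makes the gap easier to see. The sketch below is illustrative only: `missing_on_node` is a hypothetical helper, and the dictionaries are an abbreviated subset of the two listings above. It shows which labels did not reach the Node, not why.

```python
def missing_on_node(machine_labels: dict[str, str], node_labels: dict[str, str]) -> list[str]:
    """Return the label keys present on the Machine but absent from the Node."""
    return sorted(key for key in machine_labels if key not in node_labels)


# Subset of the Machine labels shown above.
machine_labels = {
    "karpenter.sh/capacity-type": "on-demand",
    "karpenter.sh/provisioner-name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge",
    "karpenter.k8s.aws/instance-family": "c6g",
    "kubernetes.io/arch": "arm64",
    "node.kubernetes.io/instance-type": "c6g.16xlarge",
    "topology.kubernetes.io/zone": "us-east-1b",
}

# Subset of the Node labels shown earlier for the same instance.
node_labels = {
    "karpenter.sh/initialized": "true",
    "karpenter.sh/registered": "true",
    "kubernetes.io/arch": "arm64",
    "node.kubernetes.io/instance-type": "c6g.16xlarge",
    "topology.kubernetes.io/zone": "us-east-1b",
}

missing = missing_on_node(machine_labels, node_labels)
for key in missing:
    print(key)
```

In the full listings, karpenter.sh/capacity-type, karpenter.sh/provisioner-name, and all of the karpenter.k8s.aws/* labels are on the Machine but not on the Node, while the labels reported by the kubelet and cloud provider (arch, os, instance type, zone) are present on both.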

Provisioner spec

root@44ceee382e95:/infrastructure# kubectl describe provisioner core-karpenter-ethos-core-karpenter-armc6g4xlarge
Name:         core-karpenter-ethos-core-karpenter-armc6g4xlarge
Namespace:    
Labels:       adobe.com/appname=karpenter-tmoretest-sbx-va6
              app.kubernetes.io/instance=core-karpenter
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ethos-core-karpenter
              app.kubernetes.io/version=v0.30.0
              ethos.adobe.net/managed-by=k8s-infrastructure.helm
              helm.sh/chart=ethos-core-karpenter-0.0.2
Annotations:  karpenter.sh/provisioner-hash: 6732253779156292384
API Version:  karpenter.sh/v1alpha5
Kind:         Provisioner
Metadata:
  Creation Timestamp:  2023-10-31T07:21:01Z
  Generation:          2
  Managed Fields:
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:labels:
          .:
          f:adobe.com/appname:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
          f:ethos.adobe.net/managed-by:
          f:helm.sh/chart:
      f:spec:
        .:
        f:consolidation:
          .:
          f:enabled:
        f:labels:
          .:
          f:ethos.adobe.net/node-templateVersion:
          f:kubernetes.io/os:
          f:node.kubernetes.io/ethos-workload.arm64:
          f:node.kubernetes.io/role:
        f:limits:
          .:
          f:resources:
            .:
            f:cpu:
            f:memory:
        f:providerRef:
          .:
          f:name:
        f:requirements:
        f:taints:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2023-10-31T16:27:52Z
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:karpenter.sh/provisioner-hash:
    Manager:      karpenter
    Operation:    Update
    Time:         2023-10-31T16:59:07Z
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:resources:
          .:
          f:attachable-volumes-aws-ebs:
          f:cpu:
          f:ephemeral-storage:
          f:memory:
          f:pods:
    Manager:         karpenter
    Operation:       Update
    Subresource:     status
    Time:            2023-10-31T17:29:03Z
  Resource Version:  550592
  UID:               5fbda74f-9d35-449b-aa9d-3bd137b4dfd6
Spec:
  Consolidation:
    Enabled:  true
  Labels:
    ethos.adobe.net/node-templateVersion:     f52cb0834d690e1f0d77747203cf126c53b8fd82a82afb02ea61915f937853
    kubernetes.io/os:                         linux
    node.kubernetes.io/ethos-workload.arm64:  true
    node.kubernetes.io/role:                  worker
  Limits:
    Resources:
      Cpu:     500
      Memory:  500Gi
  Provider Ref:
    Name:  core-karpenter-ethos-core-karpenter-armc6g4xlarge
  Requirements:
    Key:       karpenter.k8s.aws/instance-family
    Operator:  In
    Values:
      c5
      c6g
      m5
      m6g
      r5
      r6g
    Key:       karpenter.k8s.aws/instance-generation
    Operator:  Gt
    Values:
      4
    Key:       kubernetes.io/arch
    Operator:  In
    Values:
      arm64
    Key:       karpenter.sh/capacity-type
    Operator:  In
    Values:
      on-demand
    Key:       topology.kubernetes.io/zone
    Operator:  In
    Values:
      us-east-1a
      us-east-1b
      us-east-1c
    Key:       kubernetes.io/os
    Operator:  In
    Values:
      linux
  Taints:
    Effect:  NoSchedule
    Key:     ethos.corp.adobe.com/ethos-workload
    Value:   arm64
Status:
  Resources:
    Attachable - Volumes - Aws - Ebs:  78
    Cpu:                               8
    Ephemeral - Storage:               388628520Ki
    Memory:                            15950388Ki
    Pods:                              224
Events:                                <none>

tmoreadobe commented 10 months ago

One more thing I want to mention: this works fine with version 0.27.3, but I haven't had a chance to test the versions between that and 0.30.0.

With 0.27.3, the capacity-type label is added before the initialized label. That seems to have changed in one of the later releases, but I couldn't figure out which one.

jmdeal commented 10 months ago

To clarify, was this node created by Karpenter v0.27.3 or v0.30.0? It sounds like it's the latter, but I just want to make sure. Also, would you be able to provide the output from kubectl get node <node-name> -o yaml? It doesn't look like kubectl describe includes its metadata.
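
If it helps, the raw metadata can also be summarized with a small script. This is a sketch under assumptions: it reads the JSON form (`kubectl get node <node-name> -o json --show-managed-fields`) rather than YAML so that only the standard library is needed, `summarize_node_metadata` is a hypothetical helper, and the sample document is made up for illustration rather than taken from a real node.

```python
import json


def summarize_node_metadata(node_json: str) -> dict:
    """Return the node's labels and the managers recorded as owning label fields."""
    node = json.loads(node_json)
    meta = node.get("metadata", {})
    writers = {
        entry.get("manager", "")
        for entry in meta.get("managedFields", [])
        # An entry owns label fields if its fieldsV1 tree has f:metadata -> f:labels.
        if "f:labels" in entry.get("fieldsV1", {}).get("f:metadata", {})
    }
    return {"labels": meta.get("labels", {}), "label_writers": sorted(writers)}


# Hypothetical, heavily trimmed node document; not output from the cluster in this issue.
SAMPLE_NODE_JSON = json.dumps({
    "metadata": {
        "name": "example-node",
        "labels": {"karpenter.sh/initialized": "true"},
        "managedFields": [
            {"manager": "kubelet",
             "fieldsV1": {"f:metadata": {"f:labels": {"f:kubernetes.io/arch": {}}}}},
            {"manager": "karpenter",
             "fieldsV1": {"f:metadata": {"f:labels": {"f:karpenter.sh/initialized": {}}}}},
            {"manager": "kube-controller-manager",
             "fieldsV1": {"f:metadata": {"f:annotations": {}}}},
        ],
    }
})

summary = summarize_node_metadata(SAMPLE_NODE_JSON)
print(summary["label_writers"])
```

Feeding the real `kubectl` output into the function (for example via `sys.stdin.read()`) would show which managers have written labels on the node, which may help narrow down whether Karpenter ever set karpenter.sh/capacity-type.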

tmoreadobe commented 10 months ago

The above was created with 0.30.0. I created a new one with 0.31.0, and this is the output for one of the nodes.

node describe

root@d86e4fa28c12:/infrastructure# kubectl describe node  ip-10-10-11-43.ec2.internal
Name:               ip-10-10-11-43.ec2.internal
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c5.18xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1a
                    k8s.io/cloud-provider-aws=6d2511cd5c3086d8e96592c5e706198a
                    karpenter.sh/initialized=true
                    karpenter.sh/registered=true
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-10-11-43.ec2.internal
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/container-runtime=cri-o
                    node.kubernetes.io/instance-lifecycle=normal
                    node.kubernetes.io/instance-type=c5.18xlarge
                    node.kubernetes.io/node-group=worker0-group
                    node.kubernetes.io/role=worker
                    topology.ebs.csi.aws.com/zone=us-east-1a
                    topology.kubernetes.io/region=us-east-1
                    topology.kubernetes.io/zone=us-east-1a
Annotations:        alpha.kubernetes.io/provided-node-ip: 10.10.11.43
                    csi.volume.kubernetes.io/nodeid: {"ebs.csi.aws.com":"i-0bb71babdf2055662"}
                    io.cilium.network.ipv4-cilium-host: xxxxxx
                    io.cilium.network.ipv4-pod-cidr: xxxxxxx/24
                    karpenter.k8s.aws/nodetemplate-hash: 8114642816259925329
                    karpenter.sh/managed-by: karpenter-tmore
                    karpenter.sh/provisioner-hash: 8429816451612170557
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 02 Nov 2023 01:46:27 +0000
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-10-11-43.ec2.internal
  AcquireTime:     <unset>
  RenewTime:       Thu, 02 Nov 2023 01:55:19 +0000
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  ReadonlyFilesystem   False   Thu, 02 Nov 2023 01:52:18 +0000   Thu, 02 Nov 2023 01:47:16 +0000   FilesystemIsNotReadOnly      Filesystem is not read-only
  TaintedNode          False   Thu, 02 Nov 2023 01:52:18 +0000   Thu, 02 Nov 2023 01:47:16 +0000   NoActiveCustomConditions     No active custom node conditions
  CiliumUnavailable    False   Thu, 02 Nov 2023 01:52:18 +0000   Thu, 02 Nov 2023 01:47:16 +0000   CiliumIsUp                   Cilium pod is up
  KernelDeadlock       False   Thu, 02 Nov 2023 01:52:18 +0000   Thu, 02 Nov 2023 01:47:16 +0000   KernelHasNoDeadlock          kernel has no deadlock
  NetworkUnavailable   False   Thu, 02 Nov 2023 01:47:05 +0000   Thu, 02 Nov 2023 01:47:05 +0000   CiliumIsUp                   Cilium is running on this node
  MemoryPressure       False   Thu, 02 Nov 2023 01:55:19 +0000   Thu, 02 Nov 2023 01:46:23 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Thu, 02 Nov 2023 01:55:19 +0000   Thu, 02 Nov 2023 01:46:23 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Thu, 02 Nov 2023 01:55:19 +0000   Thu, 02 Nov 2023 01:46:23 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Thu, 02 Nov 2023 01:55:19 +0000   Thu, 02 Nov 2023 01:46:55 +0000   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   10.10.11.43
  Hostname:     ip-10-10-11-43.ec2.internal
  InternalDNS:  ip-10-10-11-43.ec2.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         72
  ephemeral-storage:           194314260Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      144124784Ki
  pods:                        112
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         70750m
  ephemeral-storage:           176798320344
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      141041520Ki
  pods:                        112
System Info:
  Machine ID:                 ec21002a77c893a89c37e76527bcf77a
  System UUID:                ec21002a-77c8-93a8-9c37-e76527bcf77a
  Boot ID:                    e38243a7-ff9f-4424-8727-1c6c5d88d9ff
  Kernel Version:             5.15.119-flatcar
  OS Image:                   Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.26.4
  Kubelet Version:            v1.26.9
  Kube-Proxy Version:         v1.26.9
ProviderID:                   aws:///us-east-1a/i-0bb71babdf2055662
Non-terminated Pods:          (21 in total)
  Namespace                   Name                                             CPU Requests  CPU Limits  Memory Requests  Memory Limits    Age
  ---------                   ----                                             ------------  ----------  ---------------  -------------    ---
  canary                      canary-bd46f6cc5-9zkfl                           24m (0%)      72m (0%)    262144k (0%)     750Mi (0%)       5m54s
  canary                      canary-tcp-796bfc459-ds8zd                       24m (0%)      72m (0%)    262144k (0%)     750Mi (0%)       5m54s
  gatekeeper                  gatekeeper-controller-manager-d47d8b95d-vqmnq    1012m (1%)    48m (0%)    294450051 (0%)   1177800204 (0%)  2m36s
  heptio-contour              envoy-84b696cc7d-snhsc                           7300m (10%)   700m (0%)   1216Mi (0%)      31488Mi (22%)    5m54s
  heptio-contour              kapcom-6974bd978d-ls7tc                          2050m (2%)    200m (0%)   864Mi (0%)       256Mi (0%)       5m54s
  kube-system                 aws-node-termination-handler-vbc5s               24m (0%)      72m (0%)    262144k (0%)     750Mi (0%)       8m26s
  kube-system                 cilium-dnsproxy-2k5g7                            100m (0%)     450m (0%)   192Mi (0%)       1280Mi (0%)      8m54s
  kube-system                 cilium-kvjlq                                     550m (0%)     1200m (1%)  576Mi (0%)       4352Mi (3%)      8m54s
  kube-system                 cilium-node-init-kbmv5                           24m (0%)      48m (0%)    262144k (0%)     500Mi (0%)       8m54s
  kube-system                 dnsmasq-ha-kmfwt                                 150m (0%)     2200m (3%)  74Mi (0%)        756Mi (0%)       8m26s
  kube-system                 dnsmasq-x8s9b                                    150m (0%)     2200m (3%)  74Mi (0%)        756Mi (0%)       8m26s
  kube-system                 ebs-csi-node-42r9w                               44m (0%)      374m (0%)   262144k (0%)     1450Mi (1%)      8m26s
  kube-system                 fluent-bit-journald-7nqgv                        25m (0%)      50m (0%)    262144k (0%)     500Mi (0%)       8m25s
  kube-system                 kube-proxy-2tsxp                                 200m (0%)     700m (0%)   131108864 (0%)   2268435456 (1%)  8m54s
  kube-system                 kube2iam-vdgvf                                   350m (0%)     500m (0%)   576Mi (0%)       768Mi (0%)       8m26s
  kube-system                 node-problem-detector-rflsn                      47m (0%)      118m (0%)   262144k (0%)     750Mi (0%)       8m25s
  monitoring                  cert-exporter-nmwkn                              42m (0%)      108m (0%)   262144k (0%)     750Mi (0%)       8m26s
  monitoring                  node-exporter-m8cvs                              45m (0%)      136m (0%)   279620266 (0%)   1208518791 (0%)  8m26s
  monitoring                  thanos-compact-k8s-c-0                           24m (0%)      48m (0%)    262144k (0%)     500Mi (0%)       5m8s
  opa                         opa-79dfc84d8c-9wrd7                             1261m (1%)    4544m (6%)  262143999 (0%)   873813330 (0%)   5m54s
  tracing                     otel-agent-v689g                                 24m (0%)      48m (0%)    262144k (0%)     756Mi (0%)       8m26s
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests         Limits
  --------                    --------         ------
  cpu                         13470m (19%)     13888m (19%)
  memory                      7334276652 (5%)  54929080293 (38%)
  ephemeral-storage           0 (0%)           0 (0%)
  hugepages-1Gi               0 (0%)           0 (0%)
  hugepages-2Mi               0 (0%)           0 (0%)
  attachable-volumes-aws-ebs  0                0
Events:
  Type    Reason                   Age                    From             Message
  ----    ------                   ----                   ----             -------
  Normal  Starting                 8m47s                  kube-proxy       
  Normal  Starting                 8m58s                  kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  8m58s (x2 over 8m58s)  kubelet          Node ip-10-10-11-43.ec2.internal status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    8m58s (x2 over 8m58s)  kubelet          Node ip-10-10-11-43.ec2.internal status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     8m58s (x2 over 8m58s)  kubelet          Node ip-10-10-11-43.ec2.internal status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  8m58s                  kubelet          Updated Node Allocatable limit across pods
  Normal  RegisteredNode           8m52s                  node-controller  Node ip-10-10-11-43.ec2.internal event: Registered Node ip-10-10-11-43.ec2.internal in Controller
  Normal  NodeReady                8m26s                  kubelet          Node ip-10-10-11-43.ec2.internal status is now: NodeReady
  Normal  DeprovisioningBlocked    14s (x5 over 8m20s)    karpenter        Cannot deprovision Node: Required label "karpenter.sh/capacity-type" doesn't exist

node yaml

apiVersion: v1
kind: Node
metadata:
  annotations:
    alpha.kubernetes.io/provided-node-ip: 10.10.11.43
    csi.volume.kubernetes.io/nodeid: '{"ebs.csi.aws.com":"i-0bb71babdf2055662"}'
    io.cilium.network.ipv4-cilium-host: xxxxxx
    io.cilium.network.ipv4-pod-cidr: xxxxxxxx/24
    karpenter.k8s.aws/nodetemplate-hash: "8114642816259925329"
    karpenter.sh/managed-by: karpenter-tmore
    karpenter.sh/provisioner-hash: "8429816451612170557"
    node.alpha.kubernetes.io/ttl: "0"
    volumes.kubernetes.io/controller-managed-attach-detach: "true"
  creationTimestamp: "2023-11-02T01:46:27Z"
  finalizers:
  - karpenter.sh/termination
  labels:
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: c5.18xlarge
    beta.kubernetes.io/os: linux
    failure-domain.beta.kubernetes.io/region: us-east-1
    failure-domain.beta.kubernetes.io/zone: us-east-1a
    k8s.io/cloud-provider-aws: 6d2511cd5c3086d8e96592c5e706198a
    karpenter.sh/initialized: "true"
    karpenter.sh/registered: "true"
    kubernetes.io/arch: amd64
    kubernetes.io/hostname: ip-10-10-11-43.ec2.internal
    kubernetes.io/os: linux
    node-role.kubernetes.io/worker: ""
    node.kubernetes.io/container-runtime: cri-o
    node.kubernetes.io/instance-lifecycle: normal
    node.kubernetes.io/instance-type: c5.18xlarge
    node.kubernetes.io/node-group: worker0-group
    node.kubernetes.io/role: worker
    topology.ebs.csi.aws.com/zone: us-east-1a
    topology.kubernetes.io/region: us-east-1
    topology.kubernetes.io/zone: us-east-1a
  name: ip-10-10-11-43.ec2.internal
  ownerReferences:
  - apiVersion: karpenter.sh/v1alpha5
    blockOwnerDeletion: true
    kind: Machine
    name: core-karpenter-ethos-core-karpenter-worker0-2t4ct
    uid: 2ff95528-47c9-4df7-8bf6-fd3bfd866346
  resourceVersion: "71693"
  uid: 4cf32004-ba39-4286-9e5a-4ad3530bb606
spec:
  providerID: aws:///us-east-1a/i-0bb71babdf2055662
status:
  addresses:
  - address: 10.10.11.43
    type: InternalIP
  - address: ip-10-10-11-43.ec2.internal
    type: Hostname
  - address: ip-10-10-11-43.ec2.internal
    type: InternalDNS
  allocatable:
    attachable-volumes-aws-ebs: "25"
    cpu: 70750m
    ephemeral-storage: "176798320344"
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 141041520Ki
    pods: "112"
  capacity:
    attachable-volumes-aws-ebs: "25"
    cpu: "72"
    ephemeral-storage: 194314260Ki
    hugepages-1Gi: "0"
    hugepages-2Mi: "0"
    memory: 144124784Ki
    pods: "112"
  conditions:
  - lastHeartbeatTime: "2023-11-02T01:47:17Z"
    lastTransitionTime: "2023-11-02T01:47:16Z"
    message: Cilium pod is up
    reason: CiliumIsUp
    status: "False"
    type: CiliumUnavailable
  - lastHeartbeatTime: "2023-11-02T01:47:17Z"
    lastTransitionTime: "2023-11-02T01:47:16Z"
    message: kernel has no deadlock
    reason: KernelHasNoDeadlock
    status: "False"
    type: KernelDeadlock
  - lastHeartbeatTime: "2023-11-02T01:47:17Z"
    lastTransitionTime: "2023-11-02T01:47:16Z"
    message: Filesystem is not read-only
    reason: FilesystemIsNotReadOnly
    status: "False"
    type: ReadonlyFilesystem
  - lastHeartbeatTime: "2023-11-02T01:47:17Z"
    lastTransitionTime: "2023-11-02T01:47:16Z"
    message: No active custom node conditions
    reason: NoActiveCustomConditions
    status: "False"
    type: TaintedNode
  - lastHeartbeatTime: "2023-11-02T01:47:05Z"
    lastTransitionTime: "2023-11-02T01:47:05Z"
    message: Cilium is running on this node
    reason: CiliumIsUp
    status: "False"
    type: NetworkUnavailable
  - lastHeartbeatTime: "2023-11-02T01:47:18Z"
    lastTransitionTime: "2023-11-02T01:46:23Z"
    message: kubelet has sufficient memory available
    reason: KubeletHasSufficientMemory
    status: "False"
    type: MemoryPressure
  - lastHeartbeatTime: "2023-11-02T01:47:18Z"
    lastTransitionTime: "2023-11-02T01:46:23Z"
    message: kubelet has no disk pressure
    reason: KubeletHasNoDiskPressure
    status: "False"
    type: DiskPressure
  - lastHeartbeatTime: "2023-11-02T01:47:18Z"
    lastTransitionTime: "2023-11-02T01:46:23Z"
    message: kubelet has sufficient PID available
    reason: KubeletHasSufficientPID
    status: "False"
    type: PIDPressure
  - lastHeartbeatTime: "2023-11-02T01:47:18Z"
    lastTransitionTime: "2023-11-02T01:46:55Z"
    message: kubelet is posting ready status
    reason: KubeletReady
    status: "True"
    type: Ready
  daemonEndpoints:
    kubeletEndpoint:
      Port: 10250
  images:
  - names:
    - xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev@sha256:24d76a5d635f48f290fb1228bacbaf50d6f57911bbb1b9e4317e31483c10d518
    - xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev@sha256:fce20727dc7d87f92970c885427e6df9c09e3b24e5350b8dbe26d22fe40ad8aa
    - xxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev:v1.12.14-cee.1-877
    sizeBytes: 491758129
  - names:
    - xxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy@sha256:d8c8e3e8fe630c3f2d84a22722d4891343196483ac4cc02c1ba9345b1bfc8a3d
    - xxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy@sha256:eb87627d3e9f4cf265b9edab411b0bd086a1b68ee13492f89f7030083313bf13
    - xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy:v1.26.9
    sizeBytes: 67969765
  - names:
    - xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy@sha256:5842741976f6aaa92e985cc55f6dde98d70e07cbcb531803eadeb02839d30496
    - xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy@sha256:c022c72c9de525ad5683f58831b5e12eaf885906e6e225183193c1c42db896a2
    - xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy:v1.12.8
    sizeBytes: 51579045
  - names:
    - xxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script@sha256:a1454ca1f93b69ecd2c43482c8e13dc418ae15e28a46009f5934300a20afbdba
    - xxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script@sha256:e1d442546e868db1a3289166c14011e0dbd32115b338b963e56f830972bc22a2
    - xxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script:62093c5c233ea914bfa26a10ba41f8780d9b737f
    sizeBytes: 16471209
  - names:
    - registry.k8s.io/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097
    - registry.k8s.io/pause@sha256:8d4106c88ec0bd28001e34c975d65175d994072d65341f62a8ab0754b0fafe10
    - registry.k8s.io/pause:3.9
    sizeBytes: 750414
  nodeInfo:
    architecture: amd64
    bootID: e38243a7-ff9f-4424-8727-1c6c5d88d9ff
    containerRuntimeVersion: cri-o://1.26.4
    kernelVersion: 5.15.119-flatcar
    kubeProxyVersion: v1.26.9
    kubeletVersion: v1.26.9
    machineID: ec21002a77c893a89c37e76527bcf77a
    operatingSystem: linux
    osImage: Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)
    systemUUID: ec21002a-77c8-93a8-9c37-e76527bcf77a
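
The DeprovisioningBlocked event on this node says the label karpenter.sh/capacity-type doesn't exist, and the node metadata above can be checked for it directly. The following is a minimal Python sketch, not Karpenter's own logic: `missing_labels` is a hypothetical helper, and the embedded JSON is a trimmed copy of the labels pasted above, in the shape that `kubectl get node <node-name> -o json` returns.

```python
import json

# Trimmed copy of the node labels pasted above (only a few keys kept),
# shaped like the output of `kubectl get node <node-name> -o json`.
NODE_JSON = """
{
  "metadata": {
    "name": "ip-10-10-11-43.ec2.internal",
    "labels": {
      "karpenter.sh/initialized": "true",
      "karpenter.sh/registered": "true",
      "kubernetes.io/arch": "amd64",
      "node.kubernetes.io/instance-type": "c5.18xlarge",
      "topology.kubernetes.io/zone": "us-east-1a"
    }
  }
}
"""


def missing_labels(node, required):
    """Return the required label keys that are absent from the node object."""
    labels = node.get("metadata", {}).get("labels", {})
    return [key for key in required if key not in labels]


node = json.loads(NODE_JSON)
# The event only names karpenter.sh/capacity-type as required;
# karpenter.sh/initialized is included here as a key that is present.
result = missing_labels(node, ["karpenter.sh/capacity-type", "karpenter.sh/initialized"])
print(result)
```

On a live cluster the same helper could be fed the real JSON output instead of the embedded string.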

karpenter pod logs

root@d86e4fa28c12:/infrastructure# kubectl logs -n karpenter core-karpenter-95c7b5f6b-hv9gt | grep ip-10-10-11-43.ec2.internal
2023-11-02T01:46:27.699Z  DEBUG controller.machine.lifecycle  registered machine  {"commit": "322822a", "machine": "core-karpenter-ethos-core-karpenter-worker0-2t4ct", "provisioner": "core-karpenter-ethos-core-karpenter-worker0", "provider-id": "aws:///us-east-1a/i-0bb71babdf2055662", "node": "ip-10-10-11-43.ec2.internal"}
2023-11-02T01:46:56.256Z  DEBUG controller.machine.lifecycle  initialized machine {"commit": "322822a", "machine": "core-karpenter-ethos-core-karpenter-worker0-2t4ct", "provisioner": "core-karpenter-ethos-core-karpenter-worker0", "provider-id": "aws:///us-east-1a/i-0bb71babdf2055662", "node": "ip-10-10-11-43.ec2.internal"}

machine describe

root@d86e4fa28c12:/infrastructure# kubectl describe machine core-karpenter-ethos-core-karpenter-worker0-2t4ct
Name:         core-karpenter-ethos-core-karpenter-worker0-2t4ct
Namespace:    
Labels:       ethos.adobe.net/node-templateVersion=3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b
              karpenter.k8s.aws/instance-category=c
              karpenter.k8s.aws/instance-cpu=72
              karpenter.k8s.aws/instance-encryption-in-transit-supported=false
              karpenter.k8s.aws/instance-family=c5
              karpenter.k8s.aws/instance-generation=5
              karpenter.k8s.aws/instance-hypervisor=nitro
              karpenter.k8s.aws/instance-memory=147456
              karpenter.k8s.aws/instance-network-bandwidth=25000
              karpenter.k8s.aws/instance-pods=737
              karpenter.k8s.aws/instance-size=18xlarge
              karpenter.sh/capacity-type=on-demand
              karpenter.sh/provisioner-name=core-karpenter-ethos-core-karpenter-worker0
              kubernetes.io/arch=amd64
              kubernetes.io/os=linux
              node.kubernetes.io/ethos-workload.amd64=true
              node.kubernetes.io/instance-type=c5.18xlarge
              node.kubernetes.io/node-group=worker0-group
              node.kubernetes.io/role=worker
              topology.kubernetes.io/region=us-east-1
              topology.kubernetes.io/zone=us-east-1a
Annotations:  karpenter.k8s.aws/nodetemplate-hash: 8114642816259925329
              karpenter.sh/managed-by: karpenter-tmore
              karpenter.sh/provisioner-hash: 8429816451612170557
API Version:  karpenter.sh/v1alpha5
Kind:         Machine
Metadata:
  Creation Timestamp:  2023-11-02T01:43:42Z
  Finalizers:
    karpenter.sh/termination
  Generate Name:  core-karpenter-ethos-core-karpenter-worker0-
  Generation:     1
  Managed Fields:
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:karpenter.k8s.aws/nodetemplate-hash:
          f:karpenter.sh/managed-by:
          f:karpenter.sh/provisioner-hash:
        f:finalizers:
          .:
          v:"karpenter.sh/termination":
        f:generateName:
        f:labels:
          .:
          f:ethos.adobe.net/node-templateVersion:
          f:karpenter.k8s.aws/instance-category:
          f:karpenter.k8s.aws/instance-cpu:
          f:karpenter.k8s.aws/instance-encryption-in-transit-supported:
          f:karpenter.k8s.aws/instance-family:
          f:karpenter.k8s.aws/instance-generation:
          f:karpenter.k8s.aws/instance-hypervisor:
          f:karpenter.k8s.aws/instance-memory:
          f:karpenter.k8s.aws/instance-network-bandwidth:
          f:karpenter.k8s.aws/instance-pods:
          f:karpenter.k8s.aws/instance-size:
          f:karpenter.sh/capacity-type:
          f:karpenter.sh/provisioner-name:
          f:kubernetes.io/arch:
          f:kubernetes.io/os:
          f:node.kubernetes.io/ethos-workload.amd64:
          f:node.kubernetes.io/instance-type:
          f:node.kubernetes.io/node-group:
          f:node.kubernetes.io/role:
          f:topology.kubernetes.io/region:
          f:topology.kubernetes.io/zone:
        f:ownerReferences:
          .:
          k:{"uid":"f91d0beb-637d-4dd0-85f0-17469f239371"}:
      f:spec:
        .:
        f:machineTemplateRef:
          .:
          f:name:
        f:requirements:
        f:resources:
          .:
          f:requests:
            .:
            f:cpu:
            f:memory:
            f:pods:
    Manager:      karpenter
    Operation:    Update
    Time:         2023-11-02T01:43:45Z
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:allocatable:
          .:
          f:cpu:
          f:ephemeral-storage:
          f:memory:
          f:pods:
        f:capacity:
          .:
          f:cpu:
          f:ephemeral-storage:
          f:memory:
          f:pods:
        f:conditions:
        f:nodeName:
        f:providerID:
    Manager:      karpenter
    Operation:    Update
    Subresource:  status
    Time:         2023-11-02T01:46:56Z
  Owner References:
    API Version:           karpenter.sh/v1alpha5
    Block Owner Deletion:  true
    Kind:                  Provisioner
    Name:                  core-karpenter-ethos-core-karpenter-worker0
    UID:                   f91d0beb-637d-4dd0-85f0-17469f239371
  Resource Version:        71250
  UID:                     2ff95528-47c9-4df7-8bf6-fd3bfd866346
Spec:
  Machine Template Ref:
    Name:  core-karpenter-ethos-core-karpenter-worker0
  Requirements:
    Key:       karpenter.k8s.aws/instance-generation
    Operator:  Gt
    Values:
      4
    Key:       karpenter.sh/capacity-type
    Operator:  In
    Values:
      on-demand
    Key:       karpenter.sh/provisioner-name
    Operator:  In
    Values:
      core-karpenter-ethos-core-karpenter-worker0
    Key:       node.kubernetes.io/role
    Operator:  In
    Values:
      worker
    Key:       topology.kubernetes.io/zone
    Operator:  In
    Values:
      us-east-1a
    Key:       ethos.adobe.net/node-templateVersion
    Operator:  In
    Values:
      3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b
    Key:       kubernetes.io/arch
    Operator:  In
    Values:
      amd64
    Key:       node.kubernetes.io/ethos-workload.amd64
    Operator:  In
    Values:
      true
    Key:       node.kubernetes.io/node-group
    Operator:  In
    Values:
      worker0-group
    Key:       node.kubernetes.io/instance-type
    Operator:  In
    Values:
      c5.12xlarge
      c5.18xlarge
      c5.24xlarge
      c5.4xlarge
      c5.9xlarge
      c5.metal
      m5.12xlarge
      m5.16xlarge
      m5.24xlarge
      m5.4xlarge
      m5.8xlarge
      m5.metal
      r5.12xlarge
      r5.16xlarge
      r5.4xlarge
      r5.8xlarge
    Key:       karpenter.k8s.aws/instance-family
    Operator:  In
    Values:
      c5
      c6g
      m5
      m6g
      r5
      r6g
    Key:       kubernetes.io/os
    Operator:  In
    Values:
      linux
  Resources:
    Requests:
      Cpu:     9075m
      Memory:  5085280938
      Pods:    15
Status:
  Allocatable:
    Cpu:                  71750m
    Ephemeral - Storage:  179Gi
    Memory:               127934Mi
    Pods:                 737
  Capacity:
    Cpu:                  72
    Ephemeral - Storage:  200Gi
    Memory:               136396Mi
    Pods:                 737
  Conditions:
    Last Transition Time:  2023-11-02T01:46:56Z
    Status:                True
    Type:                  MachineInitialized
    Last Transition Time:  2023-11-02T01:43:45Z
    Status:                True
    Type:                  MachineLaunched
    Last Transition Time:  2023-11-02T01:46:56Z
    Status:                True
    Type:                  Ready
    Last Transition Time:  2023-11-02T01:46:27Z
    Status:                True
    Type:                  MachineRegistered
  Node Name:               ip-10-10-11-43.ec2.internal
  Provider ID:             aws:///us-east-1a/i-0bb71babdf2055662
Events:
  Type    Reason                 Age                  From       Message
  ----    ------                 ----                 ----       -------
  Normal  DeprovisioningBlocked  81s (x2 over 3m22s)  karpenter  Cannot deprovision Machine: Required label "karpenter.sh/capacity-type" doesn't exist

machine yaml

root@d86e4fa28c12:/infrastructure# kubectl get machine -o yaml core-karpenter-ethos-core-karpenter-worker0-2t4ct
apiVersion: karpenter.sh/v1alpha5
kind: Machine
metadata:
  annotations:
    karpenter.k8s.aws/nodetemplate-hash: "8114642816259925329"
    karpenter.sh/managed-by: karpenter-tmore
    karpenter.sh/provisioner-hash: "8429816451612170557"
  creationTimestamp: "2023-11-02T01:43:42Z"
  finalizers:
  - karpenter.sh/termination
  generateName: core-karpenter-ethos-core-karpenter-worker0-
  generation: 1
  labels:
    ethos.adobe.net/node-templateVersion: 3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b
    karpenter.k8s.aws/instance-category: c
    karpenter.k8s.aws/instance-cpu: "72"
    karpenter.k8s.aws/instance-encryption-in-transit-supported: "false"
    karpenter.k8s.aws/instance-family: c5
    karpenter.k8s.aws/instance-generation: "5"
    karpenter.k8s.aws/instance-hypervisor: nitro
    karpenter.k8s.aws/instance-memory: "147456"
    karpenter.k8s.aws/instance-network-bandwidth: "25000"
    karpenter.k8s.aws/instance-pods: "737"
    karpenter.k8s.aws/instance-size: 18xlarge
    karpenter.sh/capacity-type: on-demand
    karpenter.sh/provisioner-name: core-karpenter-ethos-core-karpenter-worker0
    kubernetes.io/arch: amd64
    kubernetes.io/os: linux
    node.kubernetes.io/ethos-workload.amd64: "true"
    node.kubernetes.io/instance-type: c5.18xlarge
    node.kubernetes.io/node-group: worker0-group
    node.kubernetes.io/role: worker
    topology.kubernetes.io/region: us-east-1
    topology.kubernetes.io/zone: us-east-1a
  name: core-karpenter-ethos-core-karpenter-worker0-2t4ct
  ownerReferences:
  - apiVersion: karpenter.sh/v1alpha5
    blockOwnerDeletion: true
    kind: Provisioner
    name: core-karpenter-ethos-core-karpenter-worker0
    uid: f91d0beb-637d-4dd0-85f0-17469f239371
  resourceVersion: "71250"
  uid: 2ff95528-47c9-4df7-8bf6-fd3bfd866346
spec:
  machineTemplateRef:
    name: core-karpenter-ethos-core-karpenter-worker0
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values:
    - "4"
  - key: karpenter.sh/capacity-type
    operator: In
    values:
    - on-demand
  - key: karpenter.sh/provisioner-name
    operator: In
    values:
    - core-karpenter-ethos-core-karpenter-worker0
  - key: node.kubernetes.io/role
    operator: In
    values:
    - worker
  - key: topology.kubernetes.io/zone
    operator: In
    values:
    - us-east-1a
  - key: ethos.adobe.net/node-templateVersion
    operator: In
    values:
    - 3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64
  - key: node.kubernetes.io/ethos-workload.amd64
    operator: In
    values:
    - "true"
  - key: node.kubernetes.io/node-group
    operator: In
    values:
    - worker0-group
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - c5.12xlarge
    - c5.18xlarge
    - c5.24xlarge
    - c5.4xlarge
    - c5.9xlarge
    - c5.metal
    - m5.12xlarge
    - m5.16xlarge
    - m5.24xlarge
    - m5.4xlarge
    - m5.8xlarge
    - m5.metal
    - r5.12xlarge
    - r5.16xlarge
    - r5.4xlarge
    - r5.8xlarge
  - key: karpenter.k8s.aws/instance-family
    operator: In
    values:
    - c5
    - c6g
    - m5
    - m6g
    - r5
    - r6g
  - key: kubernetes.io/os
    operator: In
    values:
    - linux
  resources:
    requests:
      cpu: 9075m
      memory: "5085280938"
      pods: "15"
status:
  allocatable:
    cpu: 71750m
    ephemeral-storage: 179Gi
    memory: 127934Mi
    pods: "737"
  capacity:
    cpu: "72"
    ephemeral-storage: 200Gi
    memory: 136396Mi
    pods: "737"
  conditions:
  - lastTransitionTime: "2023-11-02T01:46:56Z"
    status: "True"
    type: MachineInitialized
  - lastTransitionTime: "2023-11-02T01:43:45Z"
    status: "True"
    type: MachineLaunched
  - lastTransitionTime: "2023-11-02T01:46:56Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2023-11-02T01:46:27Z"
    status: "True"
    type: MachineRegistered
  nodeName: ip-10-10-11-43.ec2.internal
  providerID: aws:///us-east-1a/i-0bb71babdf2055662
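
To make the gap between the two objects easier to see, the label keys on the Machine can be diffed against the label keys on the Node. This is a minimal Python sketch using only the keys copied from the YAML pasted in this comment; the variable names are illustrative, and it says nothing about why the labels were not propagated.

```python
# Label keys copied from the Machine YAML above.
machine_label_keys = {
    "ethos.adobe.net/node-templateVersion",
    "karpenter.k8s.aws/instance-category",
    "karpenter.k8s.aws/instance-cpu",
    "karpenter.k8s.aws/instance-encryption-in-transit-supported",
    "karpenter.k8s.aws/instance-family",
    "karpenter.k8s.aws/instance-generation",
    "karpenter.k8s.aws/instance-hypervisor",
    "karpenter.k8s.aws/instance-memory",
    "karpenter.k8s.aws/instance-network-bandwidth",
    "karpenter.k8s.aws/instance-pods",
    "karpenter.k8s.aws/instance-size",
    "karpenter.sh/capacity-type",
    "karpenter.sh/provisioner-name",
    "kubernetes.io/arch",
    "kubernetes.io/os",
    "node.kubernetes.io/ethos-workload.amd64",
    "node.kubernetes.io/instance-type",
    "node.kubernetes.io/node-group",
    "node.kubernetes.io/role",
    "topology.kubernetes.io/region",
    "topology.kubernetes.io/zone",
}

# Label keys copied from the Node YAML above.
node_label_keys = {
    "beta.kubernetes.io/arch",
    "beta.kubernetes.io/instance-type",
    "beta.kubernetes.io/os",
    "failure-domain.beta.kubernetes.io/region",
    "failure-domain.beta.kubernetes.io/zone",
    "k8s.io/cloud-provider-aws",
    "karpenter.sh/initialized",
    "karpenter.sh/registered",
    "kubernetes.io/arch",
    "kubernetes.io/hostname",
    "kubernetes.io/os",
    "node-role.kubernetes.io/worker",
    "node.kubernetes.io/container-runtime",
    "node.kubernetes.io/instance-lifecycle",
    "node.kubernetes.io/instance-type",
    "node.kubernetes.io/node-group",
    "node.kubernetes.io/role",
    "topology.ebs.csi.aws.com/zone",
    "topology.kubernetes.io/region",
    "topology.kubernetes.io/zone",
}

# Keys set on the Machine that never made it onto the Node.
missing_on_node = sorted(machine_label_keys - node_label_keys)
for key in missing_on_node:
    print(key)
```

With the values pasted here, the missing set includes karpenter.sh/capacity-type and karpenter.sh/provisioner-name along with every karpenter.k8s.aws/* key, while the Machine labels that do appear on the Node are the arch, os, instance-type, node-group, role, region, and zone keys.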

provisioner describe

root@d86e4fa28c12:/infrastructure# kubectl describe provisioner core-karpenter-ethos-core-karpenter-worker0
Name:         core-karpenter-ethos-core-karpenter-worker0
Namespace:    
Labels:       adobe.com/appname=karpenter-tmoretest-sbx-va6
              app.kubernetes.io/instance=core-karpenter
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=ethos-core-karpenter
              app.kubernetes.io/version=v0.31.0
              ethos.adobe.net/managed-by=k8s-infrastructure.helm
              helm.sh/chart=ethos-core-karpenter-0.0.5
Annotations:  karpenter.sh/provisioner-hash: 8429816451612170557
API Version:  karpenter.sh/v1alpha5
Kind:         Provisioner
Metadata:
  Creation Timestamp:  2023-11-02T01:32:32Z
  Generation:          2
  Managed Fields:
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:labels:
          .:
          f:adobe.com/appname:
          f:app.kubernetes.io/instance:
          f:app.kubernetes.io/managed-by:
          f:app.kubernetes.io/name:
          f:app.kubernetes.io/version:
          f:ethos.adobe.net/managed-by:
          f:helm.sh/chart:
      f:spec:
        .:
        f:consolidation:
          .:
          f:enabled:
        f:labels:
          .:
          f:ethos.adobe.net/node-templateVersion:
          f:kubernetes.io/os:
          f:node.kubernetes.io/ethos-workload.amd64:
          f:node.kubernetes.io/node-group:
          f:node.kubernetes.io/role:
        f:limits:
          .:
          f:resources:
            .:
            f:cpu:
            f:memory:
        f:providerRef:
          .:
          f:name:
        f:requirements:
    Manager:      kubectl-client-side-apply
    Operation:    Update
    Time:         2023-11-02T01:32:32Z
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:karpenter.sh/provisioner-hash:
    Manager:      karpenter
    Operation:    Update
    Time:         2023-11-02T01:32:45Z
    API Version:  karpenter.sh/v1alpha5
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:resources:
          .:
          f:attachable-volumes-aws-ebs:
          f:cpu:
          f:ephemeral-storage:
          f:memory:
          f:pods:
    Manager:         karpenter
    Operation:       Update
    Subresource:     status
    Time:            2023-11-02T01:46:27Z
  Resource Version:  70492
  UID:               f91d0beb-637d-4dd0-85f0-17469f239371
Spec:
  Consolidation:
    Enabled:  true
  Labels:
    ethos.adobe.net/node-templateVersion:     3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b
    kubernetes.io/os:                         linux
    node.kubernetes.io/ethos-workload.amd64:  true
    node.kubernetes.io/node-group:            worker0-group
    node.kubernetes.io/role:                  worker
  Limits:
    Resources:
      Cpu:     500
      Memory:  500Gi
  Provider Ref:
    Name:  core-karpenter-ethos-core-karpenter-worker0
  Requirements:
    Key:       karpenter.k8s.aws/instance-family
    Operator:  In
    Values:
      c5
      c6g
      m5
      m6g
      r5
      r6g
    Key:       karpenter.k8s.aws/instance-generation
    Operator:  Gt
    Values:
      4
    Key:       kubernetes.io/arch
    Operator:  In
    Values:
      amd64
    Key:       karpenter.sh/capacity-type
    Operator:  In
    Values:
      on-demand
    Key:       topology.kubernetes.io/zone
    Operator:  In
    Values:
      us-east-1a
      us-east-1b
      us-east-1c
    Key:       kubernetes.io/os
    Operator:  In
    Values:
      linux
Status:
  Resources:
    Attachable - Volumes - Aws - Ebs:  25
    Cpu:                               72
    Ephemeral - Storage:               194314260Ki
    Memory:                            144124784Ki
    Pods:                              112
Events:                                <none>

provisioner yaml

root@d86e4fa28c12:/infrastructure# kubectl get provisioner -o yaml core-karpenter-ethos-core-karpenter-worker0
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  annotations:
    karpenter.sh/provisioner-hash: "8429816451612170557"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"karpenter.sh/v1alpha5","kind":"Provisioner","metadata":{"annotations":{},"labels":{"adobe.com/appname":"karpenter-tmoretest-sbx-va6","app.kubernetes.io/instance":"core-karpenter","app.kubernetes.io/managed-by":"Helm","app.kubernetes.io/name":"ethos-core-karpenter","app.kubernetes.io/version":"v0.31.0","ethos.adobe.net/managed-by":"k8s-infrastructure.helm","helm.sh/chart":"ethos-core-karpenter-0.0.5"},"name":"core-karpenter-ethos-core-karpenter-worker0"},"spec":{"consolidation":{"enabled":true},"labels":{"ethos.adobe.net/node-templateVersion":"3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b","kubernetes.io/os":"linux","node.kubernetes.io/ethos-workload.amd64":"true","node.kubernetes.io/node-group":"worker0-group","node.kubernetes.io/role":"worker"},"limits":{"resources":{"cpu":"500","memory":"500Gi"}},"providerRef":{"name":"core-karpenter-ethos-core-karpenter-worker0"},"requirements":[{"key":"karpenter.k8s.aws/instance-family","operator":"In","values":["c5","c6g","m5","m6g","r5","r6g"]},{"key":"karpenter.k8s.aws/instance-generation","operator":"Gt","values":["4"]},{"key":"kubernetes.io/arch","operator":"In","values":["amd64"]},{"key":"karpenter.sh/capacity-type","operator":"In","values":["on-demand"]},{"key":"topology.kubernetes.io/zone","operator":"In","values":["us-east-1a","us-east-1b","us-east-1c"]}]}}
  creationTimestamp: "2023-11-02T01:32:32Z"
  generation: 2
  labels:
    adobe.com/appname: karpenter-tmoretest-sbx-va6
    app.kubernetes.io/instance: core-karpenter
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ethos-core-karpenter
    app.kubernetes.io/version: v0.31.0
    ethos.adobe.net/managed-by: k8s-infrastructure.helm
    helm.sh/chart: ethos-core-karpenter-0.0.5
  name: core-karpenter-ethos-core-karpenter-worker0
  resourceVersion: "70492"
  uid: f91d0beb-637d-4dd0-85f0-17469f239371
spec:
  consolidation:
    enabled: true
  labels:
    ethos.adobe.net/node-templateVersion: 3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b
    kubernetes.io/os: linux
    node.kubernetes.io/ethos-workload.amd64: "true"
    node.kubernetes.io/node-group: worker0-group
    node.kubernetes.io/role: worker
  limits:
    resources:
      cpu: "500"
      memory: 500Gi
  providerRef:
    name: core-karpenter-ethos-core-karpenter-worker0
  requirements:
  - key: karpenter.k8s.aws/instance-family
    operator: In
    values:
    - c5
    - c6g
    - m5
    - m6g
    - r5
    - r6g
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values:
    - "4"
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64
  - key: karpenter.sh/capacity-type
    operator: In
    values:
    - on-demand
  - key: topology.kubernetes.io/zone
    operator: In
    values:
    - us-east-1a
    - us-east-1b
    - us-east-1c
  - key: kubernetes.io/os
    operator: In
    values:
    - linux
status:
  resources:
    attachable-volumes-aws-ebs: "25"
    cpu: "72"
    ephemeral-storage: 194314260Ki
    memory: 144124784Ki
    pods: "112"

tmoreadobe commented 10 months ago

The issue also persists in version 0.29.2.

  Normal  Starting                 33s                kube-proxy       
  Normal  Starting                 43s                kubelet          Starting kubelet.
  Normal  NodeHasSufficientMemory  43s (x2 over 43s)  kubelet          Node ip-10-10-11-44.ec2.internal status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    43s (x2 over 43s)  kubelet          Node ip-10-10-11-44.ec2.internal status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     43s (x2 over 43s)  kubelet          Node ip-10-10-11-44.ec2.internal status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  43s                kubelet          Updated Node Allocatable limit across pods
  Normal  RegisteredNode           36s                node-controller  Node ip-10-10-11-44.ec2.internal event: Registered Node ip-10-10-11-44.ec2.internal in Controller
  Normal  DeprovisioningBlocked    36s                karpenter        Cannot deprovision node due to required label "karpenter.sh/capacity-type" doesn't exist
  Normal  NodeReady                14s                kubelet          Node ip-10-10-11-44.ec2.internal status is now: NodeReady

tmoreadobe commented 10 months ago

@jmdeal when did the label reordering change for the nodes?

In 0.27.3, the capacity-type label is added before the initialized label, while the node is not even ready yet. With later versions, that does not happen.

jmdeal commented 10 months ago

It hasn't, which is the puzzling part. The label is added via the kubelet, so it should exist immediately. Which AMI family are you using?

tmoreadobe commented 10 months ago

I see. It is Flatcar-stable-3510.2.5-arm64-hvm for arm64 nodes and Flatcar-stable-3510.2.5-hvm for amd64 nodes.

Not sure if this is helpful, but with 0.27.3 it takes about 90-110 seconds for a node in the cluster to become ready, while with 0.29.2 and later it takes only 30-35 seconds, which is noticeably fast. I didn't test the releases in between.

jmdeal commented 10 months ago

Ah, I'll clarify what I was asking for: I'm looking for the value of the spec.amiFamily field on the AWSNodeTemplate.

tmoreadobe commented 10 months ago

It is this: Spec: Ami Family: Custom

tmoreadobe commented 10 months ago

For more context, some nodes do get the label and some don't:

tusharmore@Tushars-MacBook-Pro-2 ethos-core-karpenter % kubectl get nodes -l 'karpenter.sh/initialized' | wc -l
      11
tusharmore@Tushars-MacBook-Pro-2 ethos-core-karpenter % kubectl get nodes -l 'karpenter.sh/capacity-type' | wc -l
       3
tusharmore@Tushars-MacBook-Pro-2 ethos-core-karpenter % 
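
Comparing two counts shows that labels are missing but not which nodes are affected. Note also that `kubectl get nodes ... | wc -l` counts the header row unless `--no-headers` is passed, so the real node counts are probably one lower than shown. A minimal Python sketch of a per-node check follows, assuming the input is the JSON produced by `kubectl get nodes -o json`; the function name `nodes_missing_label` and the sample data are illustrative and not part of Karpenter:

```python
import json

INITIALIZED = "karpenter.sh/initialized"
CAPACITY_TYPE = "karpenter.sh/capacity-type"


def nodes_missing_label(node_list, present=INITIALIZED, missing=CAPACITY_TYPE):
    """Return sorted names of nodes that carry `present` but lack `missing`."""
    names = []
    for node in node_list.get("items", []):
        meta = node.get("metadata", {})
        labels = meta.get("labels") or {}
        if present in labels and missing not in labels:
            names.append(meta.get("name"))
    return sorted(names)


# Hypothetical sample shaped like `kubectl get nodes -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "node-a",
                "labels": {"karpenter.sh/initialized": "true"}}},
  {"metadata": {"name": "node-b",
                "labels": {"karpenter.sh/initialized": "true",
                           "karpenter.sh/capacity-type": "on-demand"}}},
  {"metadata": {"name": "node-c", "labels": {}}}
]}
""")

affected = nodes_missing_label(sample)
print(affected)
```

Against a real cluster, the same function could be fed the parsed contents of a file written with `kubectl get nodes -o json > nodes.json`.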

jmdeal commented 10 months ago

I'll have to verify that the kubelet is the only place we add that label, though I believe it is. We don't add labels for Custom AMI Families since the UserData is entirely configured by the AWSNodeTemplate. Are all of these nodes using the same AWSNodeTemplate?

edit: labels should still be synced from the nodeclaim via the registration controller

tmoreadobe commented 10 months ago

They are not using the same AWSNodeTemplate.

tmoreadobe commented 10 months ago

So some nodes launched with the same AWSNodeTemplate have this label and some don't. That is confusing.

tmoreadobe commented 10 months ago

I just checked: we are not installing the NodeClaim CRDs. When were these introduced?

jmdeal commented 10 months ago

You shouldn't need the NodeClaim CRD; it's part of the v1beta1 API introduced with v0.32.0 and replaces machines. We do convert machines to nodeclaims internally to reduce code duplication, but you shouldn't need the CRD. I'm trying to repro but still haven't had any luck. It's pretty confusing: the same line that adds the registered label is also responsible for syncing over the other labels for the nodeclaim link. We've successfully marked the node as registered, so the patch operation should have succeeded. This is definitely weird.
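
To illustrate why a registered node without the synced labels is surprising: if one patch carries both the registered label and the machine's labels, they should land together. The thread does not say which patch type Karpenter uses; the sketch below assumes JSON merge patch semantics (RFC 7386) purely for illustration, and `merge_patch` and the sample label values are hypothetical, not Karpenter code:

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7386) to `target` and return the result."""
    if not isinstance(patch, dict):
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # A null value in the patch removes the key from the target.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Node as the kubelet might have created it (illustrative subset).
node = {"metadata": {"labels": {"kubernetes.io/arch": "arm64"}}}

# One patch that sets the registered label and syncs another label.
patch = {"metadata": {"labels": {
    "karpenter.sh/registered": "true",
    "karpenter.sh/capacity-type": "on-demand",
}}}

patched = merge_patch(node, patch)
print(sorted(patched["metadata"]["labels"]))
```

Under this assumption, existing labels are preserved and both new labels appear in the same write, so seeing only `karpenter.sh/registered` afterwards would point to either a different request body or a later write that dropped the others.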

jmdeal commented 10 months ago

Are you able to share your AWSNodeTemplate here to help with repro?

tmoreadobe commented 10 months ago

Here you go

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  annotations:
    karpenter.k8s.aws/nodetemplate-hash: "12888311500295424342"
  creationTimestamp: "2023-11-07T02:32:45Z"
  generation: 1
  labels:
    adobe.com/appname: karpenter-tusharmoretest-sbx-va6
    app.kubernetes.io/instance: core-karpenter
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: ethos-core-karpenter
    app.kubernetes.io/version: v0.31.0
    ethos.adobe.net/managed-by: k8s-infrastructure.helm
    helm.sh/chart: ethos-core-karpenter-0.0.8
  name: core-karpenter-ethos-core-karpenter-workerc512xlarge
  resourceVersion: "173080"
  uid: db652467-c653-45fd-ab82-8c02a85ac4f8
spec:
  amiFamily: Custom
  amiSelector:
    aws::name: Flatcar-*-3510.2.5-hvm
    aws::owners: <owners>
  blockDeviceMappings:
  - deviceName: /dev/xvda
    ebs:
      deleteOnTermination: true
      encrypted: true
      volumeSize: 200Gi
      volumeType: gp3
  instanceProfile: tusharmoretest-sbx-va6-k8s-eks-worker-profile
  metadataOptions:
    httpEndpoint: enabled
    httpProtocolIPv6: disabled
    httpPutResponseHopLimit: 1
    httpTokens: optional
  securityGroupSelector:
    aws-ids: sg-06f063b72d22dcfa7,sg-0f4cbb70128e7704f,sg-0c4e3975ed728170e,sg-08a5650ba698b2ccf,sg-09b4140aba76d7e62,sg-0690aeeee15fef24f,sg-03553b91e2cdf7d39,sg-0072424161cff41d6,sg-06be4785295592015,sg-00acc95087637f19b,sg-0c3894876988d18a2,sg-0e203c88894a31403,sg-033a992a10f30a354
  subnetSelector:
    aws-ids: subnet-006f1591595d243e5,subnet-089e649a829fcbdf3,subnet-0b00a2c51cbd45a80
  tags:
    CMDB_environment: None
    FlatcarVersion: 3510.2.5
    KubernetesCluster: karpenter-tusharmore
    Name: tusharmoretest-sbx-va6-k8s-workerc512xlarge-0
    ethos.adobe.net/cluster/karpenter-tusharmore: owned
    k8s.io/cluster-autoscaler/enabled: "false"
    k8s.io/cluster-autoscaler/node-template/label/kubernetes.io/os: linux
    k8s.io/cluster-autoscaler/node-template/label/node.kubernetes.io/ethos-workload.amd64: "true"
    k8s.io/cluster-autoscaler/node-template/label/topology.ebs.csi.aws.com/zone: ""
    k8s.io/cluster-autoscaler/node-template/resources/ephemeral-storage: 200Gi
    k8s.io/cluster-autoscaler/node-template/taint/ethos.corp.adobe.com/ethos-workload: amd64:NoSchedule
    k8s.io/karpenter/enabled: "true"
    product: tusharmoretest
    role: worker
    worker_type: worker
  userData: |
    {"ignition": {"version": "2.2.0", "config": {"replace": {"source": "https://s3.amazonaws.com/test", "verification": {"hash": "sha512-somesha"}}}}}
status:
  amis:
  - id: ami-03a7f10c456f31936
    name: Flatcar-stable-3510.2.5-arm64-hvm
    requirements:
    - key: kubernetes.io/arch
      operator: In
      values:
      - arm64
  securityGroups:
  - id: sg-0072424161cff41d6
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker7-1E77M4T5SZD9R
  - id: sg-00acc95087637f19b
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker9-21Q56IA09VN5
  - id: sg-033a992a10f30a354
    name: karpenter-tusharmore-SGEKSWorkerToWorker-3T1EK95R8DBW
  - id: sg-03553b91e2cdf7d39
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker6-1J14BLF1RW30F
  - id: sg-0690aeeee15fef24f
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker5-EA3WSB5EBOPV
  - id: sg-06be4785295592015
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker8-1PL4JHOX482HP
  - id: sg-06f063b72d22dcfa7
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker-DZTEFJP7HRQF
  - id: sg-08a5650ba698b2ccf
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker3-1792NFHG8UCN4
  - id: sg-09b4140aba76d7e62
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker4-17QMLS989UD1W
  - id: sg-0c3894876988d18a2
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker10-KN99B233P5AB
  - id: sg-0c4e3975ed728170e
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker2-83BSRXH2J84V
  - id: sg-0e203c88894a31403
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SAll-P90GL2R2K3HT
  - id: sg-0f4cbb70128e7704f
    name: tusharmoretest-sbx-va6-k8s-compute-0-SGK8SWorker1-1WLQAQYWCD0IB
  subnets:
  - id: subnet-006f1591595d243e5
    zone: us-east-1a
  - id: subnet-089e649a829fcbdf3
    zone: us-east-1b
  - id: subnet-0b00a2c51cbd45a80
    zone: us-east-1c

jmdeal commented 10 months ago

Everything seems fine here; I'm still not able to recreate the issue. Do you have access to your API server audit logs? I'd like to verify that these labels were not present in the patch operation as part of node registration.

tmoreadobe commented 10 months ago

Yes, I have access to the audit logs. For one of the nodes, this is what the patch operation returned:

    "responseObject": {
        "kind": "Node",
        "apiVersion": "v1",
        "metadata": {
            "name": "ip-10-10-11-197.ec2.internal",
            "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
            "resourceVersion": "109598",
            "creationTimestamp": "2023-11-07T02:54:21Z",
            "labels": {
                "beta.kubernetes.io/arch": "arm64",
                "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                "beta.kubernetes.io/os": "linux",
                "failure-domain.beta.kubernetes.io/region": "us-east-1",
                "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                "kubernetes.io/arch": "arm64",
                "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                "kubernetes.io/os": "linux",
                "node-role.kubernetes.io/worker": "",
                "node.kubernetes.io/container-runtime": "cri-o",
                "node.kubernetes.io/ethos-workload.arm64": "true",
                "node.kubernetes.io/instance-lifecycle": "normal",
                "node.kubernetes.io/instance-type": "c6g.16xlarge",
                "node.kubernetes.io/role": "worker",
                "topology.kubernetes.io/region": "us-east-1",
                "topology.kubernetes.io/zone": "us-east-1a"
            },
            "annotations": {
                "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                "karpenter.sh/managed-by": "karpenter-tusharmore",
                "karpenter.sh/provisioner-hash": "5643188492113361487",
                "node.alpha.kubernetes.io/ttl": "0",
                "volumes.kubernetes.io/controller-managed-attach-detach": "true"
            },
            "ownerReferences": [
                {
                    "apiVersion": "karpenter.sh/v1alpha5",
                    "kind": "Machine",
                    "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                    "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                    "blockOwnerDeletion": true
                }
            ],
            "finalizers": [
                "karpenter.sh/termination"
            ],
            "managedFields": [
                {
                    "manager": "aws-cloud-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:k8s.io/cloud-provider-aws": {}
                            }
                        }
                    }
                },
                {
                    "manager": "karpenter",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                "f:karpenter.sh/managed-by": {},
                                "f:karpenter.sh/provisioner-hash": {}
                            },
                            "f:finalizers": {
                                ".": {},
                                "v:\"karpenter.sh/termination\"": {}
                            },
                            "f:ownerReferences": {
                                ".": {},
                                "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kube-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:node.alpha.kubernetes.io/ttl": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                ".": {},
                                "f:alpha.kubernetes.io/provided-node-ip": {},
                                "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                            },
                            "f:labels": {
                                ".": {},
                                "f:beta.kubernetes.io/arch": {},
                                "f:beta.kubernetes.io/instance-type": {},
                                "f:beta.kubernetes.io/os": {},
                                "f:failure-domain.beta.kubernetes.io/region": {},
                                "f:failure-domain.beta.kubernetes.io/zone": {},
                                "f:kubernetes.io/arch": {},
                                "f:kubernetes.io/hostname": {},
                                "f:kubernetes.io/os": {},
                                "f:node.kubernetes.io/container-runtime": {},
                                "f:node.kubernetes.io/ethos-workload.arm64": {},
                                "f:node.kubernetes.io/instance-lifecycle": {},
                                "f:node.kubernetes.io/instance-type": {},
                                "f:node.kubernetes.io/role": {},
                                "f:topology.kubernetes.io/region": {},
                                "f:topology.kubernetes.io/zone": {}
                            }
                        },
                        "f:spec": {
                            "f:providerID": {},
                            "f:taints": {}
                        }
                    }
                },

But the machine has the labels:

apiVersion: karpenter.sh/v1alpha5
kind: Machine
metadata:
  annotations:
    karpenter.k8s.aws/nodetemplate-hash: "9451254827736767782"
    karpenter.sh/managed-by: karpenter-tusharmore
    karpenter.sh/provisioner-hash: "5643188492113361487"
  creationTimestamp: "2023-11-07T02:52:55Z"
  finalizers:
  - karpenter.sh/termination
  generateName: core-karpenter-ethos-core-karpenter-armc6g4xlarge-
  generation: 1
  labels:
    ethos.adobe.net/node-templateVersion: 8dff9db0cf7279d779d8c7e3d1737b07bd54803e65357871d9fe09d441bc12
    karpenter.k8s.aws/instance-category: c
    karpenter.k8s.aws/instance-cpu: "64"
    karpenter.k8s.aws/instance-encryption-in-transit-supported: "false"
    karpenter.k8s.aws/instance-family: c6g
    karpenter.k8s.aws/instance-generation: "6"
    karpenter.k8s.aws/instance-hypervisor: nitro
    karpenter.k8s.aws/instance-memory: "131072"
    karpenter.k8s.aws/instance-network-bandwidth: "25000"
    karpenter.k8s.aws/instance-pods: "737"
    karpenter.k8s.aws/instance-size: 16xlarge
    karpenter.sh/capacity-type: on-demand
    karpenter.sh/provisioner-name: core-karpenter-ethos-core-karpenter-armc6g4xlarge
    kubernetes.io/arch: arm64
    kubernetes.io/os: linux
    node.kubernetes.io/ethos-workload.arm64: "true"
    node.kubernetes.io/instance-type: c6g.16xlarge
    node.kubernetes.io/role: worker
    topology.kubernetes.io/region: us-east-1
    topology.kubernetes.io/zone: us-east-1a
  name: core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r
  ownerReferences:
  - apiVersion: karpenter.sh/v1alpha5
    blockOwnerDeletion: true
    kind: Provisioner
    name: core-karpenter-ethos-core-karpenter-armc6g4xlarge
    uid: c61e4dc6-856a-4636-8f1a-a28123f0778e
  resourceVersion: "110788"
  uid: e287ccbe-a808-4a5f-ae38-d03d5d7efc46
spec:
  machineTemplateRef:
    name: core-karpenter-ethos-core-karpenter-armc6g4xlarge
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: Gt
    values:
    - "4"
  - key: topology.kubernetes.io/zone
    operator: In
    values:
    - us-east-1a
  - key: node.kubernetes.io/role
    operator: In
    values:
    - worker
  - key: ethos.adobe.net/node-templateVersion
    operator: In
    values:
    - 8dff9db0cf7279d779d8c7e3d1737b07bd54803e65357871d9fe09d441bc12
  - key: kubernetes.io/os
    operator: In
    values:
    - linux
  - key: node.kubernetes.io/ethos-workload.arm64
    operator: In
    values:
    - "true"
  - key: karpenter.sh/provisioner-name
    operator: In
    values:
    - core-karpenter-ethos-core-karpenter-armc6g4xlarge
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - c6g.12xlarge
    - c6g.16xlarge
    - c6g.2xlarge
    - c6g.4xlarge
    - c6g.8xlarge
    - c6g.metal
    - c6g.xlarge
    - m6g.12xlarge
    - m6g.16xlarge
    - m6g.2xlarge
    - m6g.4xlarge
    - m6g.8xlarge
    - m6g.large
    - m6g.metal
    - m6g.xlarge
    - r6g.12xlarge
    - r6g.16xlarge
    - r6g.2xlarge
    - r6g.4xlarge
    - r6g.8xlarge
    - r6g.large
    - r6g.metal
    - r6g.xlarge
  - key: karpenter.k8s.aws/instance-family
    operator: In
    values:
    - c5
    - c6g
    - m5
    - m6g
    - r5
    - r6g
  - key: kubernetes.io/arch
    operator: In
    values:
    - arm64
  - key: karpenter.sh/capacity-type
    operator: In
    values:
    - on-demand
  resources:
    requests:
      cpu: 1799m
      memory: "4072356522"
      pods: "15"
  taints:
  - effect: NoSchedule
    key: ethos.corp.adobe.com/ethos-workload
    value: arm64
status:
  allocatable:
    cpu: 63770m
    ephemeral-storage: 179Gi
    memory: 112720Mi
    pods: "737"
  capacity:
    cpu: "64"
    ephemeral-storage: 200Gi
    memory: 121182Mi
    pods: "737"
  conditions:
  - lastTransitionTime: "2023-11-07T02:54:50Z"
    status: "True"
    type: MachineInitialized
  - lastTransitionTime: "2023-11-07T02:53:00Z"
    status: "True"
    type: MachineLaunched
  - lastTransitionTime: "2023-11-07T02:54:50Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2023-11-07T02:54:21Z"
    status: "True"
    type: MachineRegistered
  nodeName: ip-10-10-11-197.ec2.internal
  providerID: aws:///us-east-1a/i-00b70341f59a5bf3b
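
The gap between the two objects can be made explicit by diffing their label maps. This is a minimal sketch, assuming both label maps have already been extracted (for example from `kubectl get machine ... -o json` and `kubectl get node ... -o json`); the helper name `missing_labels` is hypothetical and the sample values are a small subset copied from the objects above:

```python
def missing_labels(machine_labels, node_labels):
    """Return Machine labels that are absent from the Node or differ in value."""
    return {
        key: value
        for key, value in machine_labels.items()
        if node_labels.get(key) != value
    }


# Subset of the Machine labels shown above.
machine_labels = {
    "karpenter.sh/capacity-type": "on-demand",
    "karpenter.sh/provisioner-name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge",
    "kubernetes.io/arch": "arm64",
    "node.kubernetes.io/role": "worker",
}

# Subset of the Node labels from the audit log response.
node_labels = {
    "kubernetes.io/arch": "arm64",
    "node.kubernetes.io/instance-lifecycle": "normal",
    "node.kubernetes.io/role": "worker",
}

diff = missing_labels(machine_labels, node_labels)
print(sorted(diff))
```

Labels that exist only on the Node (such as `node.kubernetes.io/instance-lifecycle` here) are deliberately ignored, since the question is only which Machine labels failed to reach the Node.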

jmdeal commented 10 months ago

I know you've tried this with a few different Karpenter versions. Which version are you currently using?

tmoreadobe commented 10 months ago

The output above is from 0.31.0, although I also tried 0.32.1 and the issue still persists. I would also like to mention that it works fine with 0.27.6 and 0.27.3; the latter is what we are using now for testing.

jmdeal commented 10 months ago

Karpenter migrated to using machines in v0.28.x rather than creating / managing node objects directly, so it's likely your issue has something to do with that migration. Going back to the audit log, is there any patch operation that adds the karpenter.sh/registered label? There should be an event with a requestObject that looks something like this (from my attempt to repro). The labels should match those on your machine:

"requestObject": {
    "metadata": {
        "annotations": {
            "karpenter.sh/managed-by": "jmdeal-dev"
        },
        "finalizers": [
            "karpenter.sh/termination"
        ],
        "labels": {
            "ethos.adobe.net/node-templateVersion": "3dabd57c8e2719ab162090e709b9db499e66d3b5fb49f89bc2bc822929d06b",
            "karpenter.k8s.aws/instance-category": "c",
            "karpenter.k8s.aws/instance-cpu": "4",
            "karpenter.k8s.aws/instance-encryption-in-transit-supported": "false",
            "karpenter.k8s.aws/instance-family": "c5",
            "karpenter.k8s.aws/instance-generation": "5",
            "karpenter.k8s.aws/instance-hypervisor": "nitro",
            "karpenter.k8s.aws/instance-memory": "8192",
            "karpenter.k8s.aws/instance-network-bandwidth": "1250",
            "karpenter.k8s.aws/instance-pods": "58",
            "karpenter.k8s.aws/instance-size": "xlarge",
            "karpenter.sh/capacity-type": "on-demand",
            "karpenter.sh/provisioner-name": "core-karpenter-ethos-core-karpenter-worker0",
            "karpenter.sh/registered": "true",
            "node.kubernetes.io/ethos-workload.amd64": "true",
            "node.kubernetes.io/instance-type": "c5.xlarge",
            "node.kubernetes.io/node-group": "worker0-group",
            "node.kubernetes.io/role": "worker",
            "topology.kubernetes.io/region": "us-west-2",
            "topology.kubernetes.io/zone": "us-west-2c"
        },
        "ownerReferences": [
            {
                "apiVersion": "karpenter.sh/v1alpha5",
                "blockOwnerDeletion": true,
                "kind": "Machine",
                "name": "core-karpenter-ethos-core-karpenter-worker0-fmf9j",
                "uid": "8be00f56-ad06-433a-b959-48a1df5044a5"
            }
        ]
    }
}

Also, could you provide logs that capture the full node creation lifecycle (i.e. from when the node is created to when it is initialized)? I'm specifically looking for an error containing "syncing node labels".
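
The search for such an event could be scripted along these lines. This is a rough sketch, assuming the audit log is available as one JSON event per line and was recorded at a level that includes `requestObject` (Request or RequestResponse); the function name `registration_patches` and the sample events are hypothetical:

```python
import json

REGISTERED = "karpenter.sh/registered"


def registration_patches(audit_lines, node_name):
    """Return the label maps of node patch requests that set the registered label."""
    found = []
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if event.get("verb") != "patch" or ref.get("resource") != "nodes":
            continue
        if ref.get("name") != node_name:
            continue
        request = event.get("requestObject") or {}
        labels = (request.get("metadata") or {}).get("labels") or {}
        if REGISTERED in labels:
            found.append(labels)
    return found


# Two hypothetical events: an annotation-only patch and a registration patch.
sample_lines = [
    json.dumps({
        "verb": "patch",
        "objectRef": {"resource": "nodes", "name": "node-a"},
        "requestObject": {"metadata": {"annotations": {"example/key": "v"}}},
    }),
    json.dumps({
        "verb": "patch",
        "objectRef": {"resource": "nodes", "name": "node-a"},
        "requestObject": {"metadata": {"labels": {
            "karpenter.sh/registered": "true",
            "karpenter.sh/capacity-type": "on-demand",
        }}},
    }),
]

matches = registration_patches(sample_lines, "node-a")
print(len(matches))
```

Each returned label map can then be compared against the Machine's labels to see whether the request itself already lacked them.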

jmdeal commented 10 months ago

On second thought, if possible, could we also get all of the audit logs for a node that doesn't have the labels? I'd like to rule out any other process updating or overwriting the labels.
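
One way to check for other writers is to group every write to the node by who sent it and which label keys the request carried. The sketch below is an illustration under the same assumptions as before (one JSON audit event per line, `requestObject` recorded); `label_writers` and the sample events are hypothetical names and data:

```python
import json
from collections import defaultdict

WRITE_VERBS = {"create", "update", "patch"}


def label_writers(audit_lines, node_name):
    """Map (username, userAgent, verb) to the label keys each writer sent for a node."""
    writers = defaultdict(set)
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if ref.get("resource") != "nodes" or ref.get("name") != node_name:
            continue
        if event.get("verb") not in WRITE_VERBS:
            continue
        request = event.get("requestObject") or {}
        labels = (request.get("metadata") or {}).get("labels") or {}
        key = (
            (event.get("user") or {}).get("username", ""),
            event.get("userAgent", ""),
            event.get("verb"),
        )
        writers[key].update(labels)
    return {key: sorted(keys) for key, keys in writers.items()}


# Hypothetical events for one node: a kubelet update and a controller patch.
sample_events = [
    json.dumps({
        "verb": "update",
        "user": {"username": "system:node:node-a"},
        "userAgent": "kubelet",
        "objectRef": {"resource": "nodes", "name": "node-a"},
        "requestObject": {"metadata": {"labels": {"kubernetes.io/arch": "arm64"}}},
    }),
    json.dumps({
        "verb": "patch",
        "user": {"username": "controller"},
        "userAgent": "karpenter",
        "objectRef": {"resource": "nodes", "name": "node-a"},
        "requestObject": {"metadata": {"labels": {"karpenter.sh/registered": "true"}}},
    }),
]

summary = label_writers(sample_events, "node-a")
print(summary)
```

For `update` requests the `requestObject` is the full Node, so a writer whose label set lacks keys that an earlier request added is a candidate for having overwritten them.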

tmoreadobe commented 10 months ago

With that search I only see these two events:

{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "RequestResponse",
    "auditID": "55e362c9-a442-4a06-a900-a0b7ee036ff3",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal?fieldManager=kubectl-annotate",
    "verb": "patch",
    "user": {
        "username": "system:node:ip-10-10-11-197.ec2.internal",
        "uid": "aws-iam-authenticator:258057316678:AROATYFLKFFDDT3RGILBU",
        "groups": [
            "system:bootstrappers",
            "system:nodes",
            "system:authenticated"
        ],
        "extra": {
            "accessKeyId": [
                "XXXXXXXXXXXXX"
            ],
            "arn": [
                "arn:aws:sts::258057316678:assumed-role/tusharmoretest-sbx-va6-k8s-network-IAMRoleEKSWorkerRole-Fr5Zy96hnbUA/i-xxxxxxxxxx"
            ],
            "canonicalArn": [
                "arn:aws:iam::258057316678:role/tusharmoretest-sbx-va6-k8s-network-xxxxxxxxx"
            ],
            "principalId": [
                "xxxxxxxxxxxxxxxx"
            ],
            "sessionName": [
                "xxxxxxxxxxxx"
            ]
        }
    },
    "sourceIPs": [
        "10.10.11.197"
    ],
    "userAgent": "kubectl/v1.26.9 (linux/arm64) kubernetes/d1483fd",
    "objectRef": {
        "resource": "nodes",
        "name": "ip-10-10-11-197.ec2.internal",
        "apiVersion": "v1"
    },
    "responseStatus": {
        "metadata": {},
        "code": 200
    },
    "requestObject": {
        "metadata": {
            "annotations": {
                "io.cilium.network.ipv4-pod-cidr": "xxxxxx/24"
            }
        }
    },
    "responseObject": {
        "kind": "Node",
        "apiVersion": "v1",
        "metadata": {
            "name": "ip-10-10-11-197.ec2.internal",
            "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
            "resourceVersion": "110690",
            "creationTimestamp": "2023-11-07T02:54:21Z",
            "labels": {
                "beta.kubernetes.io/arch": "arm64",
                "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                "beta.kubernetes.io/os": "linux",
                "failure-domain.beta.kubernetes.io/region": "us-east-1",
                "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                "karpenter.sh/registered": "true",
                "kubernetes.io/arch": "arm64",
                "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                "kubernetes.io/os": "linux",
                "node-role.kubernetes.io/worker": "",
                "node.kubernetes.io/container-runtime": "cri-o",
                "node.kubernetes.io/ethos-workload.arm64": "true",
                "node.kubernetes.io/instance-lifecycle": "normal",
                "node.kubernetes.io/instance-type": "c6g.16xlarge",
                "node.kubernetes.io/role": "worker",
                "topology.kubernetes.io/region": "us-east-1",
                "topology.kubernetes.io/zone": "us-east-1a"
            },
            "annotations": {
                "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                "io.cilium.network.ipv4-pod-cidr": "xxxxxxxxxx",
                "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                "karpenter.sh/managed-by": "karpenter-tusharmore",
                "karpenter.sh/provisioner-hash": "5643188492113361487",
                "node.alpha.kubernetes.io/ttl": "0",
                "volumes.kubernetes.io/controller-managed-attach-detach": "true"
            },
            "ownerReferences": [
                {
                    "apiVersion": "karpenter.sh/v1alpha5",
                    "kind": "Machine",
                    "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                    "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                    "blockOwnerDeletion": true
                }
            ],
            "finalizers": [
                "karpenter.sh/termination"
            ],
            "managedFields": [
                {
                    "manager": "aws-cloud-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:k8s.io/cloud-provider-aws": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kube-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:node.alpha.kubernetes.io/ttl": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                ".": {},
                                "f:alpha.kubernetes.io/provided-node-ip": {},
                                "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                            },
                            "f:labels": {
                                ".": {},
                                "f:beta.kubernetes.io/arch": {},
                                "f:beta.kubernetes.io/instance-type": {},
                                "f:beta.kubernetes.io/os": {},
                                "f:failure-domain.beta.kubernetes.io/region": {},
                                "f:failure-domain.beta.kubernetes.io/zone": {},
                                "f:kubernetes.io/arch": {},
                                "f:kubernetes.io/hostname": {},
                                "f:kubernetes.io/os": {},
                                "f:node.kubernetes.io/container-runtime": {},
                                "f:node.kubernetes.io/ethos-workload.arm64": {},
                                "f:node.kubernetes.io/instance-lifecycle": {},
                                "f:node.kubernetes.io/instance-type": {},
                                "f:node.kubernetes.io/role": {},
                                "f:topology.kubernetes.io/region": {},
                                "f:topology.kubernetes.io/zone": {}
                            }
                        },
                        "f:spec": {
                            "f:providerID": {},
                            "f:taints": {}
                        }
                    }
                },
                {
                    "manager": "label-maker",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:node-role.kubernetes.io/worker": {}
                            }
                        }
                    }
                },
                {
                    "manager": "karpenter",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:22Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                "f:karpenter.sh/managed-by": {},
                                "f:karpenter.sh/provisioner-hash": {}
                            },
                            "f:finalizers": {
                                ".": {},
                                "v:\"karpenter.sh/termination\"": {}
                            },
                            "f:labels": {
                                "f:karpenter.sh/registered": {}
                            },
                            "f:ownerReferences": {
                                ".": {},
                                "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:42Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:status": {
                            "f:conditions": {
                                "k:{\"type\":\"DiskPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"MemoryPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"PIDPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"Ready\"}": {
                                    "f:lastHeartbeatTime": {},
                                    "f:message": {}
                                }
                            }
                        }
                    },
                    "subresource": "status"
                },
                {
                    "manager": "kubectl-annotate",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:48Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:io.cilium.network.ipv4-pod-cidr": {}
                            }
                        }
                    }
                }
            ]
        },
        "spec": {
            "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
            "taints": [
                {
                    "key": "ethos.corp.adobe.com/ethos-workload",
                    "value": "arm64",
                    "effect": "NoSchedule"
                },
                {
                    "key": "node.kubernetes.io/not-ready",
                    "effect": "NoSchedule"
                }
            ]
        },
        "status": {
            "capacity": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "64",
                "ephemeral-storage": "194314260Ki",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "129565136Ki",
                "pods": "112"
            },
            "allocatable": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "62770m",
                "ephemeral-storage": "176798320344",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "126481872Ki",
                "pods": "112"
            },
            "conditions": [
                {
                    "type": "MemoryPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:42Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientMemory",
                    "message": "kubelet has sufficient memory available"
                },
                {
                    "type": "DiskPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:42Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasNoDiskPressure",
                    "message": "kubelet has no disk pressure"
                },
                {
                    "type": "PIDPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:42Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientPID",
                    "message": "kubelet has sufficient PID available"
                },
                {
                    "type": "Ready",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:42Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletNotReady",
                    "message": "container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?"
                }
            ],
            "addresses": [
                {
                    "type": "InternalIP",
                    "address": "10.10.11.197"
                },
                {
                    "type": "Hostname",
                    "address": "ip-10-10-11-197.ec2.internal"
                },
                {
                    "type": "InternalDNS",
                    "address": "ip-10-10-11-197.ec2.internal"
                }
            ],
            "daemonEndpoints": {
                "kubeletEndpoint": {
                    "Port": 10250
                }
            },
            "nodeInfo": {
                "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                "kernelVersion": "5.15.119-flatcar",
                "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                "containerRuntimeVersion": "cri-o://1.26.4",
                "kubeletVersion": "v1.26.9",
                "kubeProxyVersion": "v1.26.9",
                "operatingSystem": "linux",
                "architecture": "arm64"
            }
        }
    },
    "requestReceivedTimestamp": "2023-11-07T02:54:48.140075Z",
    "stageTimestamp": "2023-11-07T02:54:48.152676Z",
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": ""
    }
}
{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "RequestResponse",
    "auditID": "75ff5971-51c7-40f1-8f63-9ccd7c6aca2a",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal/status",
    "verb": "patch",
    "user": {
        "username": "system:serviceaccount:kube-system:cilium",
        "uid": "4151ab92-523b-4112-9e8c-fb8dbcf3471b",
        "groups": [
            "system:serviceaccounts",
            "system:serviceaccounts:kube-system",
            "system:authenticated"
        ],
        "extra": {
            "authentication.kubernetes.io/pod-name": [
                "cilium-cb87q"
            ],
            "authentication.kubernetes.io/pod-uid": [
                "7123437a-92c0-47d0-868c-c420b37f93cc"
            ]
        }
    },
    "sourceIPs": [
        "10.10.11.197"
    ],
    "userAgent": "cilium-agent/1.12.14-cee.1 e38bfc58 2023-10-10T18:21:16+00:00 go version go1.20.8 linux/arm64",
    "objectRef": {
        "resource": "nodes",
        "name": "ip-10-10-11-197.ec2.internal",
        "apiVersion": "v1",
        "subresource": "status"
    },
    "responseStatus": {
        "metadata": {},
        "code": 200
    },
    "requestObject": {
        "metadata": {
            "annotations": {
                "io.cilium.network.ipv4-cilium-host": "xxxxxxxx",
                "io.cilium.network.ipv4-pod-cidr": "xxxxxxxxxx"
            }
        }
    },
    "responseObject": {
        "kind": "Node",
        "apiVersion": "v1",
        "metadata": {
            "name": "ip-10-10-11-197.ec2.internal",
            "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
            "resourceVersion": "110888",
            "creationTimestamp": "2023-11-07T02:54:21Z",
            "labels": {
                "beta.kubernetes.io/arch": "arm64",
                "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                "beta.kubernetes.io/os": "linux",
                "failure-domain.beta.kubernetes.io/region": "us-east-1",
                "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                "karpenter.sh/initialized": "true",
                "karpenter.sh/registered": "true",
                "kubernetes.io/arch": "arm64",
                "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                "kubernetes.io/os": "linux",
                "node-role.kubernetes.io/worker": "",
                "node.kubernetes.io/container-runtime": "cri-o",
                "node.kubernetes.io/ethos-workload.arm64": "true",
                "node.kubernetes.io/instance-lifecycle": "normal",
                "node.kubernetes.io/instance-type": "c6g.16xlarge",
                "node.kubernetes.io/role": "worker",
                "topology.kubernetes.io/region": "us-east-1",
                "topology.kubernetes.io/zone": "us-east-1a"
            },
            "annotations": {
                "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                "io.cilium.network.ipv4-cilium-host": "xxxxxxxxxx",
                "io.cilium.network.ipv4-pod-cidr": "xxxxxxxxxxxx",
                "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                "karpenter.sh/managed-by": "karpenter-tusharmore",
                "karpenter.sh/provisioner-hash": "5643188492113361487",
                "node.alpha.kubernetes.io/ttl": "0",
                "volumes.kubernetes.io/controller-managed-attach-detach": "true"
            },
            "ownerReferences": [
                {
                    "apiVersion": "karpenter.sh/v1alpha5",
                    "kind": "Machine",
                    "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                    "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                    "blockOwnerDeletion": true
                }
            ],
            "finalizers": [
                "karpenter.sh/termination"
            ],
            "managedFields": [
                {
                    "manager": "aws-cloud-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:k8s.io/cloud-provider-aws": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                ".": {},
                                "f:alpha.kubernetes.io/provided-node-ip": {},
                                "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                            },
                            "f:labels": {
                                ".": {},
                                "f:beta.kubernetes.io/arch": {},
                                "f:beta.kubernetes.io/instance-type": {},
                                "f:beta.kubernetes.io/os": {},
                                "f:failure-domain.beta.kubernetes.io/region": {},
                                "f:failure-domain.beta.kubernetes.io/zone": {},
                                "f:kubernetes.io/arch": {},
                                "f:kubernetes.io/hostname": {},
                                "f:kubernetes.io/os": {},
                                "f:node.kubernetes.io/container-runtime": {},
                                "f:node.kubernetes.io/ethos-workload.arm64": {},
                                "f:node.kubernetes.io/instance-lifecycle": {},
                                "f:node.kubernetes.io/instance-type": {},
                                "f:node.kubernetes.io/role": {},
                                "f:topology.kubernetes.io/region": {},
                                "f:topology.kubernetes.io/zone": {}
                            }
                        },
                        "f:spec": {
                            "f:providerID": {}
                        }
                    }
                },
                {
                    "manager": "label-maker",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:node-role.kubernetes.io/worker": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubectl-annotate",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:48Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:io.cilium.network.ipv4-pod-cidr": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kube-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:49Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:node.alpha.kubernetes.io/ttl": {}
                            }
                        },
                        "f:spec": {
                            "f:taints": {}
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:49Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:status": {
                            "f:conditions": {
                                "k:{\"type\":\"DiskPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"MemoryPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"PIDPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"Ready\"}": {
                                    "f:lastHeartbeatTime": {},
                                    "f:lastTransitionTime": {},
                                    "f:message": {},
                                    "f:reason": {},
                                    "f:status": {}
                                }
                            },
                            "f:images": {}
                        }
                    },
                    "subresource": "status"
                },
                {
                    "manager": "karpenter",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:50Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                "f:karpenter.sh/managed-by": {},
                                "f:karpenter.sh/provisioner-hash": {}
                            },
                            "f:finalizers": {
                                ".": {},
                                "v:\"karpenter.sh/termination\"": {}
                            },
                            "f:labels": {
                                "f:karpenter.sh/initialized": {},
                                "f:karpenter.sh/registered": {}
                            },
                            "f:ownerReferences": {
                                ".": {},
                                "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                            }
                        }
                    }
                },
                {
                    "manager": "cilium-agent",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:55Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:io.cilium.network.ipv4-cilium-host": {}
                            }
                        }
                    },
                    "subresource": "status"
                }
            ]
        },
        "spec": {
            "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
            "taints": [
                {
                    "key": "ethos.corp.adobe.com/ethos-workload",
                    "value": "arm64",
                    "effect": "NoSchedule"
                }
            ]
        },
        "status": {
            "capacity": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "64",
                "ephemeral-storage": "194314260Ki",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "129565136Ki",
                "pods": "112"
            },
            "allocatable": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "62770m",
                "ephemeral-storage": "176798320344",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "126481872Ki",
                "pods": "112"
            },
            "conditions": [
                {
                    "type": "MemoryPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientMemory",
                    "message": "kubelet has sufficient memory available"
                },
                {
                    "type": "DiskPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasNoDiskPressure",
                    "message": "kubelet has no disk pressure"
                },
                {
                    "type": "PIDPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientPID",
                    "message": "kubelet has sufficient PID available"
                },
                {
                    "type": "Ready",
                    "status": "True",
                    "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                    "lastTransitionTime": "2023-11-07T02:54:49Z",
                    "reason": "KubeletReady",
                    "message": "kubelet is posting ready status"
                }
            ],
            "addresses": [
                {
                    "type": "InternalIP",
                    "address": "10.10.11.197"
                },
                {
                    "type": "Hostname",
                    "address": "ip-10-10-11-197.ec2.internal"
                },
                {
                    "type": "InternalDNS",
                    "address": "ip-10-10-11-197.ec2.internal"
                }
            ],
            "daemonEndpoints": {
                "kubeletEndpoint": {
                    "Port": 10250
                }
            },
            "nodeInfo": {
                "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                "kernelVersion": "5.15.119-flatcar",
                "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                "containerRuntimeVersion": "cri-o://1.26.4",
                "kubeletVersion": "v1.26.9",
                "kubeProxyVersion": "v1.26.9",
                "operatingSystem": "linux",
                "architecture": "arm64"
            },
            "images": [
                {
                    "names": [
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/ethos/ethos-fluent-bit@sha256:4b42a45ac64af18c262485817d3e634cd078202d83ebb034e8b3c13f50906694",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/ethos/ethos-fluent-bit@sha256:9b044ba4f8c8f93f6c4068ac68ae624f3a23aba4cc9f5f3337973d5a164d5b6f",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/ethos/ethos-fluent-bit:1.9.1.2-upstream"
                    ],
                    "sizeBytes": 669628779
                },
                {
                    "names": [
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev@sha256:24d76a5d635f48f290fb1228bacbaf50d6f57911bbb1b9e4317e31483c10d518",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev@sha256:d738a5f7e4f7d8c0b761d50ff527775b5f685ba0176281d9d6569af00f7bb7bb",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev:v1.12.14-cee.1-877"
                    ],
                    "sizeBytes": 463354425
                },
                {
                    "names": [
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy@sha256:35e03cd57eaf0fa41a3e390e85faf91bedd04776be227176f24ce01779066d58",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy@sha256:d8c8e3e8fe630c3f2d84a22722d4891343196483ac4cc02c1ba9345b1bfc8a3d",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy:v1.26.9"
                    ],
                    "sizeBytes": 63511779
                },
                {
                    "names": [
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy@sha256:280f2d0f44eb284ff58a2c41c59204f3ac288cb4d96e5dac48e3026fc6d431e9",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy@sha256:c022c72c9de525ad5683f58831b5e12eaf885906e6e225183193c1c42db896a2",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy:v1.12.8"
                    ],
                    "sizeBytes": 49445029
                },
                {
                    "names": [
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script@sha256:454aa0877745e5cca6fcb76aee4192b695f7a8cb28bac4c2ad0faa392e421df4",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script@sha256:e1d442546e868db1a3289166c14011e0dbd32115b338b963e56f830972bc22a2",
                        "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script:62093c5c233ea914bfa26a10ba41f8780d9b737f"
                    ],
                    "sizeBytes": 25322153
                },
                {
                    "names": [
                        "registry.k8s.io/pause@sha256:3ec98b8452dc8ae265a6917dfb81587ac78849e520d5dbba6de524851d20eca6",
                        "registry.k8s.io/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097",
                        "registry.k8s.io/pause:3.9"
                    ],
                    "sizeBytes": 520014
                }
            ]
        }
    },
    "requestReceivedTimestamp": "2023-11-07T02:54:55.986502Z",
    "stageTimestamp": "2023-11-07T02:54:56.006166Z",
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"cilium\" of ClusterRole \"cilium\" to ServiceAccount \"cilium/kube-system\""
    }
}

jmdeal commented 10 months ago

Hmm, there definitely should be other events captured for the node. We can see quite a few more reflected in the node's managed fields. Also, it doesn't look like the first event you posted was one of these two. When you run the query, what are you searching for?
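
One way to cross-check the audit events against the managed fields is to list, per field manager, which label keys that manager owns. The following is a minimal Python sketch, assuming the Node object is available as a dict (for example, the `responseObject` of one of the audit events above). The function name `labels_by_manager` and the trimmed `sample_node` are hypothetical and only illustrate the shape of the data.

```python
from collections import defaultdict


def labels_by_manager(node):
    """Map each field manager to the sorted label keys it owns on a Node dict."""
    owned = defaultdict(set)
    for entry in node.get("metadata", {}).get("managedFields", []):
        fields = entry.get("fieldsV1", {})
        # Status-only entries have no "f:metadata", so this falls back to {}.
        labels = fields.get("f:metadata", {}).get("f:labels", {})
        for key in labels:
            # Owned keys look like "f:karpenter.sh/registered";
            # the "." key marks the label map itself and is skipped.
            if key.startswith("f:"):
                owned[entry.get("manager", "")].add(key[2:])
    return {manager: sorted(keys) for manager, keys in owned.items()}


# Trimmed sample based on the managedFields shown in the events above.
sample_node = {
    "metadata": {
        "managedFields": [
            {
                "manager": "karpenter",
                "fieldsV1": {
                    "f:metadata": {
                        "f:labels": {
                            "f:karpenter.sh/initialized": {},
                            "f:karpenter.sh/registered": {},
                        }
                    }
                },
            },
            {
                "manager": "label-maker",
                "fieldsV1": {
                    "f:metadata": {
                        "f:labels": {"f:node-role.kubernetes.io/worker": {}}
                    }
                },
            },
        ]
    }
}

print(labels_by_manager(sample_node))
```

For the node in this thread, the `karpenter` manager owns only `karpenter.sh/initialized` and `karpenter.sh/registered`, which is consistent with `karpenter.sh/capacity-type` never having been written.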

tmoreadobe commented 10 months ago

I am searching for karpenter.sh/registered and only see two events.

tmoreadobe commented 10 months ago

Do you have a sample query that I can run for this particular node, ip-10-10-11-197.ec2.internal, to get the data you want? Searching by hostname and IP returns a lot of data that is not relevant.

jmdeal commented 10 months ago

I searched for any patch events on the node by the Karpenter user agent.
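
A filter along those lines could be sketched as follows in Python, assuming the audit log has been exported as a stream of concatenated JSON objects like the ones pasted in this thread. The helper names, the `"karpenter"` substring match on `userAgent`, and the user agent string in the second sample event are assumptions made for illustration; the actual Karpenter user agent does not appear in the events above.

```python
import json


def iter_json_objects(text):
    """Yield each top-level JSON object from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def node_patches_by_agent(events, node_name, agent_substring="karpenter"):
    """Keep patch events on the given node whose userAgent contains the substring."""
    matches = []
    for event in events:
        ref = event.get("objectRef", {})
        if (
            event.get("verb") == "patch"
            and ref.get("resource") == "nodes"
            and ref.get("name") == node_name
            and agent_substring in event.get("userAgent", "").lower()
        ):
            matches.append(event)
    return matches


# Two trimmed events: the first mirrors the cilium patch above; the second
# uses a hypothetical Karpenter user agent string and auditID.
sample_log = """
{"auditID": "75ff5971-51c7-40f1-8f63-9ccd7c6aca2a", "verb": "patch",
 "userAgent": "cilium-agent/1.12.14-cee.1",
 "objectRef": {"resource": "nodes", "name": "ip-10-10-11-197.ec2.internal"}}
{"auditID": "hypothetical-id", "verb": "patch",
 "userAgent": "karpenter/v0.31.0",
 "objectRef": {"resource": "nodes", "name": "ip-10-10-11-197.ec2.internal"}}
"""

events = list(iter_json_objects(sample_log))
matches = node_patches_by_agent(events, "ip-10-10-11-197.ec2.internal")
for event in matches:
    print(event["auditID"], event["userAgent"])
```

Filtering on `verb`, `objectRef`, and `userAgent` rather than on a free-text search for the hostname keeps unrelated events, such as pod or lease updates that merely mention the node, out of the results.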

tmoreadobe commented 10 months ago

Here are all the events matching patch and karpenter:

{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "RequestResponse",
    "auditID": "6d5a6a66-30a9-4adb-a677-176345500b8c",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
    "verb": "patch",
    "user": {
        "username": "system:serviceaccount:kube-system:tagging-controller",
        "uid": "58c3d345-fa12-4205-b7eb-b7ed4d9e3670",
        "groups": [
            "system:serviceaccounts",
            "system:serviceaccounts:kube-system",
            "system:authenticated"
        ]
    },
    "sourceIPs": [
        "172.16.102.100"
    ],
    "userAgent": "aws-cloud-controller-manager/v0.0.0 (linux/amd64) kubernetes/$Format/system:serviceaccount:kube-system:tagging-controller",
    "objectRef": {
        "resource": "nodes",
        "name": "ip-10-10-11-197.ec2.internal",
        "apiVersion": "v1"
    },
    "responseStatus": {
        "metadata": {},
        "code": 200
    },
    "requestObject": {
        "metadata": {
            "labels": {
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a"
            }
        }
    },
    "responseObject": {
        "kind": "Node",
        "apiVersion": "v1",
        "metadata": {
            "name": "ip-10-10-11-197.ec2.internal",
            "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
            "resourceVersion": "109598",
            "creationTimestamp": "2023-11-07T02:54:21Z",
            "labels": {
                "beta.kubernetes.io/arch": "arm64",
                "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                "beta.kubernetes.io/os": "linux",
                "failure-domain.beta.kubernetes.io/region": "us-east-1",
                "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                "kubernetes.io/arch": "arm64",
                "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                "kubernetes.io/os": "linux",
                "node-role.kubernetes.io/worker": "",
                "node.kubernetes.io/container-runtime": "cri-o",
                "node.kubernetes.io/ethos-workload.arm64": "true",
                "node.kubernetes.io/instance-lifecycle": "normal",
                "node.kubernetes.io/instance-type": "c6g.16xlarge",
                "node.kubernetes.io/role": "worker",
                "topology.kubernetes.io/region": "us-east-1",
                "topology.kubernetes.io/zone": "us-east-1a"
            },
            "annotations": {
                "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                "karpenter.sh/managed-by": "karpenter-tmore",
                "karpenter.sh/provisioner-hash": "5643188492113361487",
                "node.alpha.kubernetes.io/ttl": "0",
                "volumes.kubernetes.io/controller-managed-attach-detach": "true"
            },
            "ownerReferences": [
                {
                    "apiVersion": "karpenter.sh/v1alpha5",
                    "kind": "Machine",
                    "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                    "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                    "blockOwnerDeletion": true
                }
            ],
            "finalizers": [
                "karpenter.sh/termination"
            ],
            "managedFields": [
                {
                    "manager": "aws-cloud-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:k8s.io/cloud-provider-aws": {}
                            }
                        }
                    }
                },
                {
                    "manager": "karpenter",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                "f:karpenter.sh/managed-by": {},
                                "f:karpenter.sh/provisioner-hash": {}
                            },
                            "f:finalizers": {
                                ".": {},
                                "v:\"karpenter.sh/termination\"": {}
                            },
                            "f:ownerReferences": {
                                ".": {},
                                "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kube-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:node.alpha.kubernetes.io/ttl": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                ".": {},
                                "f:alpha.kubernetes.io/provided-node-ip": {},
                                "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                            },
                            "f:labels": {
                                ".": {},
                                "f:beta.kubernetes.io/arch": {},
                                "f:beta.kubernetes.io/instance-type": {},
                                "f:beta.kubernetes.io/os": {},
                                "f:failure-domain.beta.kubernetes.io/region": {},
                                "f:failure-domain.beta.kubernetes.io/zone": {},
                                "f:kubernetes.io/arch": {},
                                "f:kubernetes.io/hostname": {},
                                "f:kubernetes.io/os": {},
                                "f:node.kubernetes.io/container-runtime": {},
                                "f:node.kubernetes.io/ethos-workload.arm64": {},
                                "f:node.kubernetes.io/instance-lifecycle": {},
                                "f:node.kubernetes.io/instance-type": {},
                                "f:node.kubernetes.io/role": {},
                                "f:topology.kubernetes.io/region": {},
                                "f:topology.kubernetes.io/zone": {}
                            }
                        },
                        "f:spec": {
                            "f:providerID": {},
                            "f:taints": {}
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:status": {
                            "f:conditions": {
                                "k:{\"type\":\"DiskPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"MemoryPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"PIDPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"Ready\"}": {
                                    "f:lastHeartbeatTime": {}
                                }
                            }
                        }
                    },
                    "subresource": "status"
                },
                {
                    "manager": "label-maker",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:node-role.kubernetes.io/worker": {}
                            }
                        }
                    }
                }
            ]
        },
        "spec": {
            "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
            "taints": [
                {
                    "key": "ethos.corp.adobe.com/ethos-workload",
                    "value": "arm64",
                    "effect": "NoSchedule"
                },
                {
                    "key": "node.kubernetes.io/not-ready",
                    "effect": "NoSchedule"
                }
            ]
        },
        "status": {
            "capacity": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "64",
                "ephemeral-storage": "194314260Ki",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "129565136Ki",
                "pods": "112"
            },
            "allocatable": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "62770m",
                "ephemeral-storage": "176798320344",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "126481872Ki",
                "pods": "112"
            },
            "conditions": [
                {
                    "type": "MemoryPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientMemory",
                    "message": "kubelet has sufficient memory available"
                },
                {
                    "type": "DiskPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasNoDiskPressure",
                    "message": "kubelet has no disk pressure"
                },
                {
                    "type": "PIDPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientPID",
                    "message": "kubelet has sufficient PID available"
                },
                {
                    "type": "Ready",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletNotReady",
                    "message": "[container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?, CSINode is not yet initialized]"
                }
            ],
            "addresses": [
                {
                    "type": "InternalIP",
                    "address": "10.10.11.197"
                },
                {
                    "type": "Hostname",
                    "address": "ip-10-10-11-197.ec2.internal"
                },
                {
                    "type": "InternalDNS",
                    "address": "ip-10-10-11-197.ec2.internal"
                }
            ],
            "daemonEndpoints": {
                "kubeletEndpoint": {
                    "Port": 10250
                }
            },
            "nodeInfo": {
                "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                "kernelVersion": "5.15.119-flatcar",
                "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                "containerRuntimeVersion": "cri-o://1.26.4",
                "kubeletVersion": "v1.26.9",
                "kubeProxyVersion": "v1.26.9",
                "operatingSystem": "linux",
                "architecture": "arm64"
            }
        }
    },
    "requestReceivedTimestamp": "2023-11-07T02:54:21.971826Z",
    "stageTimestamp": "2023-11-07T02:54:22.017060Z",
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"eks:tagging-controller\" of ClusterRole \"eks:tagging-controller\" to ServiceAccount \"tagging-controller/kube-system\""
    }
}
{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "RequestResponse",
    "auditID": "b235a977-5288-435e-8b55-fa7f4a19684e",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
    "verb": "patch",
    "user": {
        "username": "system:serviceaccount:kube-system:tagging-controller",
        "uid": "58c3d345-fa12-4205-b7eb-b7ed4d9e3670",
        "groups": [
            "system:serviceaccounts",
            "system:serviceaccounts:kube-system",
            "system:authenticated"
        ]
    },
    "sourceIPs": [
        "172.16.102.100"
    ],
    "userAgent": "aws-cloud-controller-manager/v0.0.0 (linux/amd64) kubernetes/$Format/system:serviceaccount:kube-system:tagging-controller",
    "objectRef": {
        "resource": "nodes",
        "name": "ip-10-10-11-197.ec2.internal",
        "apiVersion": "v1"
    },
    "responseStatus": {
        "metadata": {},
        "code": 200
    },
    "requestObject": {},
    "responseObject": {
        "kind": "Node",
        "apiVersion": "v1",
        "metadata": {
            "name": "ip-10-10-11-197.ec2.internal",
            "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
            "resourceVersion": "109598",
            "creationTimestamp": "2023-11-07T02:54:21Z",
            "labels": {
                "beta.kubernetes.io/arch": "arm64",
                "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                "beta.kubernetes.io/os": "linux",
                "failure-domain.beta.kubernetes.io/region": "us-east-1",
                "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                "kubernetes.io/arch": "arm64",
                "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                "kubernetes.io/os": "linux",
                "node-role.kubernetes.io/worker": "",
                "node.kubernetes.io/container-runtime": "cri-o",
                "node.kubernetes.io/ethos-workload.arm64": "true",
                "node.kubernetes.io/instance-lifecycle": "normal",
                "node.kubernetes.io/instance-type": "c6g.16xlarge",
                "node.kubernetes.io/role": "worker",
                "topology.kubernetes.io/region": "us-east-1",
                "topology.kubernetes.io/zone": "us-east-1a"
            },
            "annotations": {
                "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                "karpenter.sh/managed-by": "karpenter-tmore",
                "karpenter.sh/provisioner-hash": "5643188492113361487",
                "node.alpha.kubernetes.io/ttl": "0",
                "volumes.kubernetes.io/controller-managed-attach-detach": "true"
            },
            "ownerReferences": [
                {
                    "apiVersion": "karpenter.sh/v1alpha5",
                    "kind": "Machine",
                    "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                    "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                    "blockOwnerDeletion": true
                }
            ],
            "finalizers": [
                "karpenter.sh/termination"
            ],
            "managedFields": [
                {
                    "manager": "aws-cloud-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:k8s.io/cloud-provider-aws": {}
                            }
                        }
                    }
                },
                {
                    "manager": "karpenter",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                "f:karpenter.sh/managed-by": {},
                                "f:karpenter.sh/provisioner-hash": {}
                            },
                            "f:finalizers": {
                                ".": {},
                                "v:\"karpenter.sh/termination\"": {}
                            },
                            "f:ownerReferences": {
                                ".": {},
                                "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kube-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:node.alpha.kubernetes.io/ttl": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                ".": {},
                                "f:alpha.kubernetes.io/provided-node-ip": {},
                                "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                            },
                            "f:labels": {
                                ".": {},
                                "f:beta.kubernetes.io/arch": {},
                                "f:beta.kubernetes.io/instance-type": {},
                                "f:beta.kubernetes.io/os": {},
                                "f:failure-domain.beta.kubernetes.io/region": {},
                                "f:failure-domain.beta.kubernetes.io/zone": {},
                                "f:kubernetes.io/arch": {},
                                "f:kubernetes.io/hostname": {},
                                "f:kubernetes.io/os": {},
                                "f:node.kubernetes.io/container-runtime": {},
                                "f:node.kubernetes.io/ethos-workload.arm64": {},
                                "f:node.kubernetes.io/instance-lifecycle": {},
                                "f:node.kubernetes.io/instance-type": {},
                                "f:node.kubernetes.io/role": {},
                                "f:topology.kubernetes.io/region": {},
                                "f:topology.kubernetes.io/zone": {}
                            }
                        },
                        "f:spec": {
                            "f:providerID": {},
                            "f:taints": {}
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:status": {
                            "f:conditions": {
                                "k:{\"type\":\"DiskPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"MemoryPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"PIDPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"Ready\"}": {
                                    "f:lastHeartbeatTime": {}
                                }
                            }
                        }
                    },
                    "subresource": "status"
                },
                {
                    "manager": "label-maker",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:node-role.kubernetes.io/worker": {}
                            }
                        }
                    }
                }
            ]
        },
        "spec": {
            "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
            "taints": [
                {
                    "key": "ethos.corp.adobe.com/ethos-workload",
                    "value": "arm64",
                    "effect": "NoSchedule"
                },
                {
                    "key": "node.kubernetes.io/not-ready",
                    "effect": "NoSchedule"
                }
            ]
        },
        "status": {
            "capacity": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "64",
                "ephemeral-storage": "194314260Ki",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "129565136Ki",
                "pods": "112"
            },
            "allocatable": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "62770m",
                "ephemeral-storage": "176798320344",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "126481872Ki",
                "pods": "112"
            },
            "conditions": [
                {
                    "type": "MemoryPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientMemory",
                    "message": "kubelet has sufficient memory available"
                },
                {
                    "type": "DiskPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasNoDiskPressure",
                    "message": "kubelet has no disk pressure"
                },
                {
                    "type": "PIDPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientPID",
                    "message": "kubelet has sufficient PID available"
                },
                {
                    "type": "Ready",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletNotReady",
                    "message": "[container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?, CSINode is not yet initialized]"
                }
            ],
            "addresses": [
                {
                    "type": "InternalIP",
                    "address": "10.10.11.197"
                },
                {
                    "type": "Hostname",
                    "address": "ip-10-10-11-197.ec2.internal"
                },
                {
                    "type": "InternalDNS",
                    "address": "ip-10-10-11-197.ec2.internal"
                }
            ],
            "daemonEndpoints": {
                "kubeletEndpoint": {
                    "Port": 10250
                }
            },
            "nodeInfo": {
                "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                "kernelVersion": "5.15.119-flatcar",
                "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                "containerRuntimeVersion": "cri-o://1.26.4",
                "kubeletVersion": "v1.26.9",
                "kubeProxyVersion": "v1.26.9",
                "operatingSystem": "linux",
                "architecture": "arm64"
            }
        }
    },
    "requestReceivedTimestamp": "2023-11-07T02:54:22.172853Z",
    "stageTimestamp": "2023-11-07T02:54:22.180438Z",
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"eks:tagging-controller\" of ClusterRole \"eks:tagging-controller\" to ServiceAccount \"tagging-controller/kube-system\""
    }
}
{
    "kind": "Event",
    "apiVersion": "audit.k8s.io/v1",
    "level": "RequestResponse",
    "auditID": "d826d4c1-5690-4bf3-b498-0143b666e3eb",
    "stage": "ResponseComplete",
    "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
    "verb": "patch",
    "user": {
        "username": "system:serviceaccount:kube-system:tagging-controller",
        "uid": "58c3d345-fa12-4205-b7eb-b7ed4d9e3670",
        "groups": [
            "system:serviceaccounts",
            "system:serviceaccounts:kube-system",
            "system:authenticated"
        ]
    },
    "sourceIPs": [
        "172.16.102.100"
    ],
    "userAgent": "aws-cloud-controller-manager/v0.0.0 (linux/amd64) kubernetes/$Format/system:serviceaccount:kube-system:tagging-controller",
    "objectRef": {
        "resource": "nodes",
        "name": "ip-10-10-11-197.ec2.internal",
        "apiVersion": "v1"
    },
    "responseStatus": {
        "metadata": {},
        "code": 200
    },
    "requestObject": {},
    "responseObject": {
        "kind": "Node",
        "apiVersion": "v1",
        "metadata": {
            "name": "ip-10-10-11-197.ec2.internal",
            "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
            "resourceVersion": "109598",
            "creationTimestamp": "2023-11-07T02:54:21Z",
            "labels": {
                "beta.kubernetes.io/arch": "arm64",
                "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                "beta.kubernetes.io/os": "linux",
                "failure-domain.beta.kubernetes.io/region": "us-east-1",
                "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                "kubernetes.io/arch": "arm64",
                "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                "kubernetes.io/os": "linux",
                "node-role.kubernetes.io/worker": "",
                "node.kubernetes.io/container-runtime": "cri-o",
                "node.kubernetes.io/ethos-workload.arm64": "true",
                "node.kubernetes.io/instance-lifecycle": "normal",
                "node.kubernetes.io/instance-type": "c6g.16xlarge",
                "node.kubernetes.io/role": "worker",
                "topology.kubernetes.io/region": "us-east-1",
                "topology.kubernetes.io/zone": "us-east-1a"
            },
            "annotations": {
                "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                "karpenter.sh/managed-by": "karpenter-tmore",
                "karpenter.sh/provisioner-hash": "5643188492113361487",
                "node.alpha.kubernetes.io/ttl": "0",
                "volumes.kubernetes.io/controller-managed-attach-detach": "true"
            },
            "ownerReferences": [
                {
                    "apiVersion": "karpenter.sh/v1alpha5",
                    "kind": "Machine",
                    "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                    "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                    "blockOwnerDeletion": true
                }
            ],
            "finalizers": [
                "karpenter.sh/termination"
            ],
            "managedFields": [
                {
                    "manager": "aws-cloud-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:k8s.io/cloud-provider-aws": {}
                            }
                        }
                    }
                },
                {
                    "manager": "karpenter",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                "f:karpenter.sh/managed-by": {},
                                "f:karpenter.sh/provisioner-hash": {}
                            },
                            "f:finalizers": {
                                ".": {},
                                "v:\"karpenter.sh/termination\"": {}
                            },
                            "f:ownerReferences": {
                                ".": {},
                                "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kube-controller-manager",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                "f:node.alpha.kubernetes.io/ttl": {}
                            }
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:annotations": {
                                ".": {},
                                "f:alpha.kubernetes.io/provided-node-ip": {},
                                "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                            },
                            "f:labels": {
                                ".": {},
                                "f:beta.kubernetes.io/arch": {},
                                "f:beta.kubernetes.io/instance-type": {},
                                "f:beta.kubernetes.io/os": {},
                                "f:failure-domain.beta.kubernetes.io/region": {},
                                "f:failure-domain.beta.kubernetes.io/zone": {},
                                "f:kubernetes.io/arch": {},
                                "f:kubernetes.io/hostname": {},
                                "f:kubernetes.io/os": {},
                                "f:node.kubernetes.io/container-runtime": {},
                                "f:node.kubernetes.io/ethos-workload.arm64": {},
                                "f:node.kubernetes.io/instance-lifecycle": {},
                                "f:node.kubernetes.io/instance-type": {},
                                "f:node.kubernetes.io/role": {},
                                "f:topology.kubernetes.io/region": {},
                                "f:topology.kubernetes.io/zone": {}
                            }
                        },
                        "f:spec": {
                            "f:providerID": {},
                            "f:taints": {}
                        }
                    }
                },
                {
                    "manager": "kubelet",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:status": {
                            "f:conditions": {
                                "k:{\"type\":\"DiskPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"MemoryPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"PIDPressure\"}": {
                                    "f:lastHeartbeatTime": {}
                                },
                                "k:{\"type\":\"Ready\"}": {
                                    "f:lastHeartbeatTime": {}
                                }
                            }
                        }
                    },
                    "subresource": "status"
                },
                {
                    "manager": "label-maker",
                    "operation": "Update",
                    "apiVersion": "v1",
                    "time": "2023-11-07T02:54:21Z",
                    "fieldsType": "FieldsV1",
                    "fieldsV1": {
                        "f:metadata": {
                            "f:labels": {
                                "f:node-role.kubernetes.io/worker": {}
                            }
                        }
                    }
                }
            ]
        },
        "spec": {
            "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
            "taints": [
                {
                    "key": "ethos.corp.adobe.com/ethos-workload",
                    "value": "arm64",
                    "effect": "NoSchedule"
                },
                {
                    "key": "node.kubernetes.io/not-ready",
                    "effect": "NoSchedule"
                }
            ]
        },
        "status": {
            "capacity": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "64",
                "ephemeral-storage": "194314260Ki",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "129565136Ki",
                "pods": "112"
            },
            "allocatable": {
                "attachable-volumes-aws-ebs": "39",
                "cpu": "62770m",
                "ephemeral-storage": "176798320344",
                "hugepages-1Gi": "0",
                "hugepages-2Mi": "0",
                "hugepages-32Mi": "0",
                "hugepages-64Ki": "0",
                "memory": "126481872Ki",
                "pods": "112"
            },
            "conditions": [
                {
                    "type": "MemoryPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientMemory",
                    "message": "kubelet has sufficient memory available"
                },
                {
                    "type": "DiskPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasNoDiskPressure",
                    "message": "kubelet has no disk pressure"
                },
                {
                    "type": "PIDPressure",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletHasSufficientPID",
                    "message": "kubelet has sufficient PID available"
                },
                {
                    "type": "Ready",
                    "status": "False",
                    "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                    "lastTransitionTime": "2023-11-07T02:54:18Z",
                    "reason": "KubeletNotReady",
                    "message": "[container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?, CSINode is not yet initialized]"
                }
            ],
            "addresses": [
                {
                    "type": "InternalIP",
                    "address": "10.10.11.197"
                },
                {
                    "type": "Hostname",
                    "address": "ip-10-10-11-197.ec2.internal"
                },
                {
                    "type": "InternalDNS",
                    "address": "ip-10-10-11-197.ec2.internal"
                }
            ],
            "daemonEndpoints": {
                "kubeletEndpoint": {
                    "Port": 10250
                }
            },
            "nodeInfo": {
                "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                "kernelVersion": "5.15.119-flatcar",
                "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                "containerRuntimeVersion": "cri-o://1.26.4",
                "kubeletVersion": "v1.26.9",
                "kubeProxyVersion": "v1.26.9",
                "operatingSystem": "linux",
                "architecture": "arm64"
            }
        }
    },
    "requestReceivedTimestamp": "2023-11-07T02:54:22.307670Z",
    "stageTimestamp": "2023-11-07T02:54:22.316479Z",
    "annotations": {
        "authorization.k8s.io/decision": "allow",
        "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"eks:tagging-controller\" of ClusterRole \"eks:tagging-controller\" to ServiceAccount \"tagging-controller/kube-system\""
    }
}

These events occurred before the two events above.

jmdeal commented 10 months ago

None of these are from the Karpenter user agent; they're from the aws-cloud-controller-manager. Are you querying these from CloudWatch or some other system? If it's CloudWatch, what query are you using?

tmoreadobe commented 10 months ago

I am using the query "patch karpenter", and yes, it is CloudWatch.

jmdeal commented 10 months ago

This is the query I used to grab all of the patch events from Karpenter:

fields @timestamp, @message
| filter objectRef.name = "<node name>"
| filter verb = "patch"
| filter userAgent = "karpenter"
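
If the query results are exported as JSON, the same filters can be applied offline to see which labels Karpenter actually sent in its patch requests. This is only a minimal Python sketch, not part of Karpenter or CloudWatch tooling: the function name `karpenter_patched_labels` is hypothetical, and it assumes each result row has `@timestamp` and `@message` fields, where `@message` is the audit event either as a JSON string or as an already-parsed object.

```python
import json


def karpenter_patched_labels(results, node_name):
    """Collect the labels Karpenter sent in patch requests for one node.

    `results` is assumed to be a list of exported rows, each with
    "@timestamp" and "@message" (the Kubernetes audit event).
    Returns {label_key: [(timestamp, value), ...]}.
    """
    patched = {}
    for row in results:
        msg = row.get("@message") or {}
        if isinstance(msg, str):
            # Some exports keep the audit event as a raw JSON string.
            msg = json.loads(msg)

        # Same filters as the Logs Insights query above.
        if msg.get("verb") != "patch" or msg.get("userAgent") != "karpenter":
            continue
        if (msg.get("objectRef") or {}).get("name") != node_name:
            continue

        # requestObject holds only what the client asked to change.
        metadata = (msg.get("requestObject") or {}).get("metadata") or {}
        for key, value in (metadata.get("labels") or {}).items():
            patched.setdefault(key, []).append((row.get("@timestamp"), value))
    return patched


# Tiny hand-written sample shaped like an audit event.
sample = [
    {
        "@timestamp": "2023-11-07 02:54:50.497",
        "@message": {
            "verb": "patch",
            "userAgent": "karpenter",
            "objectRef": {"resource": "nodes", "name": "ip-10-10-11-197.ec2.internal"},
            "requestObject": {"metadata": {"labels": {"karpenter.sh/initialized": "true"}}},
        },
    }
]

found = karpenter_patched_labels(sample, "ip-10-10-11-197.ec2.internal")
print(sorted(found))
print("karpenter.sh/capacity-type" in found)
```

Looking at `requestObject` rather than `responseObject` separates what Karpenter tried to set from what ended up on the node after other controllers wrote to it.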

tmoreadobe commented 10 months ago

Great. Here you go:

[
    {
        "@timestamp": "2023-11-07 02:54:50.497",
        "@message": {
            "kind": "Event",
            "apiVersion": "audit.k8s.io/v1",
            "level": "RequestResponse",
            "auditID": "d26342c4-a8c3-400d-8cd2-1fa46b2ae2f1",
            "stage": "ResponseComplete",
            "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
            "verb": "patch",
            "user": {
                "username": "system:serviceaccount:karpenter:karpenter",
                "uid": "5472440a-8b46-43e0-9569-2b9e1c5e4114",
                "groups": [
                    "system:serviceaccounts",
                    "system:serviceaccounts:karpenter",
                    "system:authenticated"
                ],
                "extra": {
                    "authentication.kubernetes.io/pod-name": [
                        "core-karpenter-95c7b5f6b-4phmq"
                    ],
                    "authentication.kubernetes.io/pod-uid": [
                        "d0dbe52d-bb3a-42ac-a039-7044e671045b"
                    ]
                }
            },
            "sourceIPs": [
                "XXXXXXXX"
            ],
            "userAgent": "karpenter",
            "objectRef": {
                "resource": "nodes",
                "name": "ip-10-10-11-197.ec2.internal",
                "apiVersion": "v1"
            },
            "responseStatus": {
                "metadata": {},
                "code": 200
            },
            "requestObject": {
                "metadata": {
                    "labels": {
                        "karpenter.sh/initialized": "true"
                    }
                }
            },
            "responseObject": {
                "kind": "Node",
                "apiVersion": "v1",
                "metadata": {
                    "name": "ip-10-10-11-197.ec2.internal",
                    "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
                    "resourceVersion": "110787",
                    "creationTimestamp": "2023-11-07T02:54:21Z",
                    "labels": {
                        "beta.kubernetes.io/arch": "arm64",
                        "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                        "beta.kubernetes.io/os": "linux",
                        "failure-domain.beta.kubernetes.io/region": "us-east-1",
                        "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                        "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                        "karpenter.sh/initialized": "true",
                        "karpenter.sh/registered": "true",
                        "kubernetes.io/arch": "arm64",
                        "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                        "kubernetes.io/os": "linux",
                        "node-role.kubernetes.io/worker": "",
                        "node.kubernetes.io/container-runtime": "cri-o",
                        "node.kubernetes.io/ethos-workload.arm64": "true",
                        "node.kubernetes.io/instance-lifecycle": "normal",
                        "node.kubernetes.io/instance-type": "c6g.16xlarge",
                        "node.kubernetes.io/role": "worker",
                        "topology.kubernetes.io/region": "us-east-1",
                        "topology.kubernetes.io/zone": "us-east-1a"
                    },
                    "annotations": {
                        "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                        "io.cilium.network.ipv4-pod-cidr": "XXXXXXXXX/24",
                        "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                        "karpenter.sh/managed-by": "karpenter-tmore",
                        "karpenter.sh/provisioner-hash": "5643188492113361487",
                        "node.alpha.kubernetes.io/ttl": "0",
                        "volumes.kubernetes.io/controller-managed-attach-detach": "true"
                    },
                    "ownerReferences": [
                        {
                            "apiVersion": "karpenter.sh/v1alpha5",
                            "kind": "Machine",
                            "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                            "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                            "blockOwnerDeletion": true
                        }
                    ],
                    "finalizers": [
                        "karpenter.sh/termination"
                    ],
                    "managedFields": [
                        {
                            "manager": "aws-cloud-controller-manager",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:labels": {
                                        "f:k8s.io/cloud-provider-aws": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kubelet",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        ".": {},
                                        "f:alpha.kubernetes.io/provided-node-ip": {},
                                        "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                                    },
                                    "f:labels": {
                                        ".": {},
                                        "f:beta.kubernetes.io/arch": {},
                                        "f:beta.kubernetes.io/instance-type": {},
                                        "f:beta.kubernetes.io/os": {},
                                        "f:failure-domain.beta.kubernetes.io/region": {},
                                        "f:failure-domain.beta.kubernetes.io/zone": {},
                                        "f:kubernetes.io/arch": {},
                                        "f:kubernetes.io/hostname": {},
                                        "f:kubernetes.io/os": {},
                                        "f:node.kubernetes.io/container-runtime": {},
                                        "f:node.kubernetes.io/ethos-workload.arm64": {},
                                        "f:node.kubernetes.io/instance-lifecycle": {},
                                        "f:node.kubernetes.io/instance-type": {},
                                        "f:node.kubernetes.io/role": {},
                                        "f:topology.kubernetes.io/region": {},
                                        "f:topology.kubernetes.io/zone": {}
                                    }
                                },
                                "f:spec": {
                                    "f:providerID": {}
                                }
                            }
                        },
                        {
                            "manager": "label-maker",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:labels": {
                                        "f:node-role.kubernetes.io/worker": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kubectl-annotate",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:48Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:io.cilium.network.ipv4-pod-cidr": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kube-controller-manager",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:49Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:node.alpha.kubernetes.io/ttl": {}
                                    }
                                },
                                "f:spec": {
                                    "f:taints": {}
                                }
                            }
                        },
                        {
                            "manager": "kubelet",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:49Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:status": {
                                    "f:conditions": {
                                        "k:{\"type\":\"DiskPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"MemoryPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"PIDPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"Ready\"}": {
                                            "f:lastHeartbeatTime": {},
                                            "f:lastTransitionTime": {},
                                            "f:message": {},
                                            "f:reason": {},
                                            "f:status": {}
                                        }
                                    },
                                    "f:images": {}
                                }
                            },
                            "subresource": "status"
                        },
                        {
                            "manager": "karpenter",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:50Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                        "f:karpenter.sh/managed-by": {},
                                        "f:karpenter.sh/provisioner-hash": {}
                                    },
                                    "f:finalizers": {
                                        ".": {},
                                        "v:\"karpenter.sh/termination\"": {}
                                    },
                                    "f:labels": {
                                        "f:karpenter.sh/initialized": {},
                                        "f:karpenter.sh/registered": {}
                                    },
                                    "f:ownerReferences": {
                                        ".": {},
                                        "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                                    }
                                }
                            }
                        }
                    ]
                },
                "spec": {
                    "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
                    "taints": [
                        {
                            "key": "ethos.corp.adobe.com/ethos-workload",
                            "value": "arm64",
                            "effect": "NoSchedule"
                        }
                    ]
                },
                "status": {
                    "capacity": {
                        "attachable-volumes-aws-ebs": "39",
                        "cpu": "64",
                        "ephemeral-storage": "194314260Ki",
                        "hugepages-1Gi": "0",
                        "hugepages-2Mi": "0",
                        "hugepages-32Mi": "0",
                        "hugepages-64Ki": "0",
                        "memory": "129565136Ki",
                        "pods": "112"
                    },
                    "allocatable": {
                        "attachable-volumes-aws-ebs": "39",
                        "cpu": "62770m",
                        "ephemeral-storage": "176798320344",
                        "hugepages-1Gi": "0",
                        "hugepages-2Mi": "0",
                        "hugepages-32Mi": "0",
                        "hugepages-64Ki": "0",
                        "memory": "126481872Ki",
                        "pods": "112"
                    },
                    "conditions": [
                        {
                            "type": "MemoryPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasSufficientMemory",
                            "message": "kubelet has sufficient memory available"
                        },
                        {
                            "type": "DiskPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasNoDiskPressure",
                            "message": "kubelet has no disk pressure"
                        },
                        {
                            "type": "PIDPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasSufficientPID",
                            "message": "kubelet has sufficient PID available"
                        },
                        {
                            "type": "Ready",
                            "status": "True",
                            "lastHeartbeatTime": "2023-11-07T02:54:49Z",
                            "lastTransitionTime": "2023-11-07T02:54:49Z",
                            "reason": "KubeletReady",
                            "message": "kubelet is posting ready status"
                        }
                    ],
                    "addresses": [
                        {
                            "type": "InternalIP",
                            "address": "10.10.11.197"
                        },
                        {
                            "type": "Hostname",
                            "address": "ip-10-10-11-197.ec2.internal"
                        },
                        {
                            "type": "InternalDNS",
                            "address": "ip-10-10-11-197.ec2.internal"
                        }
                    ],
                    "daemonEndpoints": {
                        "kubeletEndpoint": {
                            "Port": 10250
                        }
                    },
                    "nodeInfo": {
                        "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                        "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                        "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                        "kernelVersion": "5.15.119-flatcar",
                        "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                        "containerRuntimeVersion": "cri-o://1.26.4",
                        "kubeletVersion": "v1.26.9",
                        "kubeProxyVersion": "v1.26.9",
                        "operatingSystem": "linux",
                        "architecture": "arm64"
                    },
                    "images": [
                        {
                            "names": [
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/ethos/ethos-fluent-bit@sha256:4b42a45ac64af18c262485817d3e634cd078202d83ebb034e8b3c13f50906694",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/ethos/ethos-fluent-bit@sha256:9b044ba4f8c8f93f6c4068ac68ae624f3a23aba4cc9f5f3337973d5a164d5b6f",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/ethos/ethos-fluent-bit:1.9.1.2-upstream"
                            ],
                            "sizeBytes": 669628779
                        },
                        {
                            "names": [
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev@sha256:24d76a5d635f48f290fb1228bacbaf50d6f57911bbb1b9e4317e31483c10d518",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev@sha256:d738a5f7e4f7d8c0b761d50ff527775b5f685ba0176281d9d6569af00f7bb7bb",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent-dev/cilium-dev:v1.12.14-cee.1-877"
                            ],
                            "sizeBytes": 463354425
                        },
                        {
                            "names": [
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy@sha256:35e03cd57eaf0fa41a3e390e85faf91bedd04776be227176f24ce01779066d58",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy@sha256:d8c8e3e8fe630c3f2d84a22722d4891343196483ac4cc02c1ba9345b1bfc8a3d",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/kubernetes/kube-proxy:v1.26.9"
                            ],
                            "sizeBytes": 63511779
                        },
                        {
                            "names": [
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy@sha256:280f2d0f44eb284ff58a2c41c59204f3ac288cb4d96e5dac48e3026fc6d431e9",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy@sha256:c022c72c9de525ad5683f58831b5e12eaf885906e6e225183193c1c42db896a2",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/isovalent/cilium-dnsproxy:v1.12.8"
                            ],
                            "sizeBytes": 49445029
                        },
                        {
                            "names": [
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script@sha256:454aa0877745e5cca6fcb76aee4192b695f7a8cb28bac4c2ad0faa392e421df4",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script@sha256:e1d442546e868db1a3289166c14011e0dbd32115b338b963e56f830972bc22a2",
                                "xxxxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/ethos/66131456-5bcb-11e9-8647-d663bd873d93/cilium/startup-script:62093c5c233ea914bfa26a10ba41f8780d9b737f"
                            ],
                            "sizeBytes": 25322153
                        },
                        {
                            "names": [
                                "registry.k8s.io/pause@sha256:3ec98b8452dc8ae265a6917dfb81587ac78849e520d5dbba6de524851d20eca6",
                                "registry.k8s.io/pause@sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097",
                                "registry.k8s.io/pause:3.9"
                            ],
                            "sizeBytes": 520014
                        }
                    ]
                }
            },
            "requestReceivedTimestamp": "2023-11-07T02:54:50.415229Z",
            "stageTimestamp": "2023-11-07T02:54:50.435761Z",
            "annotations": {
                "authorization.k8s.io/decision": "allow",
                "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"core-karpenter-core\" of ClusterRole \"core-karpenter-core\" to ServiceAccount \"karpenter/karpenter\""
            }
        }
    },
    {
        "@timestamp": "2023-11-07 02:54:23.120",
        "@message": {
            "kind": "Event",
            "apiVersion": "audit.k8s.io/v1",
            "level": "RequestResponse",
            "auditID": "7aa2340f-44c0-40d3-ae15-3858937efa8a",
            "stage": "ResponseComplete",
            "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
            "verb": "patch",
            "user": {
                "username": "system:serviceaccount:karpenter:karpenter",
                "uid": "5472440a-8b46-43e0-9569-2b9e1c5e4114",
                "groups": [
                    "system:serviceaccounts",
                    "system:serviceaccounts:karpenter",
                    "system:authenticated"
                ],
                "extra": {
                    "authentication.kubernetes.io/pod-name": [
                        "core-karpenter-95c7b5f6b-p6mm4"
                    ],
                    "authentication.kubernetes.io/pod-uid": [
                        "4847f6fa-1e23-4e54-a0ac-f9ec554772dc"
                    ]
                }
            },
            "sourceIPs": [
                "xxxxxxxx"
            ],
            "userAgent": "karpenter",
            "objectRef": {
                "resource": "nodes",
                "name": "ip-10-10-11-197.ec2.internal",
                "apiVersion": "v1"
            },
            "responseStatus": {
                "metadata": {},
                "code": 200
            },
            "requestObject": {
                "metadata": {
                    "labels": {
                        "karpenter.sh/registered": "true"
                    }
                }
            },
            "responseObject": {
                "kind": "Node",
                "apiVersion": "v1",
                "metadata": {
                    "name": "ip-10-10-11-197.ec2.internal",
                    "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
                    "resourceVersion": "109623",
                    "creationTimestamp": "2023-11-07T02:54:21Z",
                    "labels": {
                        "beta.kubernetes.io/arch": "arm64",
                        "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                        "beta.kubernetes.io/os": "linux",
                        "failure-domain.beta.kubernetes.io/region": "us-east-1",
                        "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                        "k8s.io/cloud-provider-aws": "6d2511cd5c3086d8e96592c5e706198a",
                        "karpenter.sh/registered": "true",
                        "kubernetes.io/arch": "arm64",
                        "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                        "kubernetes.io/os": "linux",
                        "node-role.kubernetes.io/worker": "",
                        "node.kubernetes.io/container-runtime": "cri-o",
                        "node.kubernetes.io/ethos-workload.arm64": "true",
                        "node.kubernetes.io/instance-lifecycle": "normal",
                        "node.kubernetes.io/instance-type": "c6g.16xlarge",
                        "node.kubernetes.io/role": "worker",
                        "topology.kubernetes.io/region": "us-east-1",
                        "topology.kubernetes.io/zone": "us-east-1a"
                    },
                    "annotations": {
                        "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                        "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                        "karpenter.sh/managed-by": "karpenter-tmore",
                        "karpenter.sh/provisioner-hash": "5643188492113361487",
                        "node.alpha.kubernetes.io/ttl": "0",
                        "volumes.kubernetes.io/controller-managed-attach-detach": "true"
                    },
                    "ownerReferences": [
                        {
                            "apiVersion": "karpenter.sh/v1alpha5",
                            "kind": "Machine",
                            "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                            "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                            "blockOwnerDeletion": true
                        }
                    ],
                    "finalizers": [
                        "karpenter.sh/termination"
                    ],
                    "managedFields": [
                        {
                            "manager": "aws-cloud-controller-manager",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:labels": {
                                        "f:k8s.io/cloud-provider-aws": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kube-controller-manager",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:node.alpha.kubernetes.io/ttl": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kubelet",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        ".": {},
                                        "f:alpha.kubernetes.io/provided-node-ip": {},
                                        "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                                    },
                                    "f:labels": {
                                        ".": {},
                                        "f:beta.kubernetes.io/arch": {},
                                        "f:beta.kubernetes.io/instance-type": {},
                                        "f:beta.kubernetes.io/os": {},
                                        "f:failure-domain.beta.kubernetes.io/region": {},
                                        "f:failure-domain.beta.kubernetes.io/zone": {},
                                        "f:kubernetes.io/arch": {},
                                        "f:kubernetes.io/hostname": {},
                                        "f:kubernetes.io/os": {},
                                        "f:node.kubernetes.io/container-runtime": {},
                                        "f:node.kubernetes.io/ethos-workload.arm64": {},
                                        "f:node.kubernetes.io/instance-lifecycle": {},
                                        "f:node.kubernetes.io/instance-type": {},
                                        "f:node.kubernetes.io/role": {},
                                        "f:topology.kubernetes.io/region": {},
                                        "f:topology.kubernetes.io/zone": {}
                                    }
                                },
                                "f:spec": {
                                    "f:providerID": {},
                                    "f:taints": {}
                                }
                            }
                        },
                        {
                            "manager": "kubelet",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:status": {
                                    "f:conditions": {
                                        "k:{\"type\":\"DiskPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"MemoryPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"PIDPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"Ready\"}": {
                                            "f:lastHeartbeatTime": {}
                                        }
                                    }
                                }
                            },
                            "subresource": "status"
                        },
                        {
                            "manager": "label-maker",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:labels": {
                                        "f:node-role.kubernetes.io/worker": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "karpenter",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:22Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                        "f:karpenter.sh/managed-by": {},
                                        "f:karpenter.sh/provisioner-hash": {}
                                    },
                                    "f:finalizers": {
                                        ".": {},
                                        "v:\"karpenter.sh/termination\"": {}
                                    },
                                    "f:labels": {
                                        "f:karpenter.sh/registered": {}
                                    },
                                    "f:ownerReferences": {
                                        ".": {},
                                        "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                                    }
                                }
                            }
                        }
                    ]
                },
                "spec": {
                    "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
                    "taints": [
                        {
                            "key": "ethos.corp.adobe.com/ethos-workload",
                            "value": "arm64",
                            "effect": "NoSchedule"
                        },
                        {
                            "key": "node.kubernetes.io/not-ready",
                            "effect": "NoSchedule"
                        }
                    ]
                },
                "status": {
                    "capacity": {
                        "attachable-volumes-aws-ebs": "39",
                        "cpu": "64",
                        "ephemeral-storage": "194314260Ki",
                        "hugepages-1Gi": "0",
                        "hugepages-2Mi": "0",
                        "hugepages-32Mi": "0",
                        "hugepages-64Ki": "0",
                        "memory": "129565136Ki",
                        "pods": "112"
                    },
                    "allocatable": {
                        "attachable-volumes-aws-ebs": "39",
                        "cpu": "62770m",
                        "ephemeral-storage": "176798320344",
                        "hugepages-1Gi": "0",
                        "hugepages-2Mi": "0",
                        "hugepages-32Mi": "0",
                        "hugepages-64Ki": "0",
                        "memory": "126481872Ki",
                        "pods": "112"
                    },
                    "conditions": [
                        {
                            "type": "MemoryPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasSufficientMemory",
                            "message": "kubelet has sufficient memory available"
                        },
                        {
                            "type": "DiskPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasNoDiskPressure",
                            "message": "kubelet has no disk pressure"
                        },
                        {
                            "type": "PIDPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasSufficientPID",
                            "message": "kubelet has sufficient PID available"
                        },
                        {
                            "type": "Ready",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletNotReady",
                            "message": "[container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?, CSINode is not yet initialized]"
                        }
                    ],
                    "addresses": [
                        {
                            "type": "InternalIP",
                            "address": "10.10.11.197"
                        },
                        {
                            "type": "Hostname",
                            "address": "ip-10-10-11-197.ec2.internal"
                        },
                        {
                            "type": "InternalDNS",
                            "address": "ip-10-10-11-197.ec2.internal"
                        }
                    ],
                    "daemonEndpoints": {
                        "kubeletEndpoint": {
                            "Port": 10250
                        }
                    },
                    "nodeInfo": {
                        "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                        "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                        "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                        "kernelVersion": "5.15.119-flatcar",
                        "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                        "containerRuntimeVersion": "cri-o://1.26.4",
                        "kubeletVersion": "v1.26.9",
                        "kubeProxyVersion": "v1.26.9",
                        "operatingSystem": "linux",
                        "architecture": "arm64"
                    }
                }
            },
            "requestReceivedTimestamp": "2023-11-07T02:54:22.957610Z",
            "stageTimestamp": "2023-11-07T02:54:22.981954Z",
            "annotations": {
                "authorization.k8s.io/decision": "allow",
                "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"core-karpenter-core\" of ClusterRole \"core-karpenter-core\" to ServiceAccount \"karpenter/karpenter\""
            }
        }
    },
    {
        "@timestamp": "2023-11-07 02:54:21.883",
        "@message": {
            "kind": "Event",
            "apiVersion": "audit.k8s.io/v1",
            "level": "RequestResponse",
            "auditID": "b8975178-7d38-49a8-8666-3819336ccd5b",
            "stage": "ResponseComplete",
            "requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
            "verb": "patch",
            "user": {
                "username": "system:serviceaccount:karpenter:karpenter",
                "uid": "5472440a-8b46-43e0-9569-2b9e1c5e4114",
                "groups": [
                    "system:serviceaccounts",
                    "system:serviceaccounts:karpenter",
                    "system:authenticated"
                ],
                "extra": {
                    "authentication.kubernetes.io/pod-name": [
                        "core-karpenter-95c7b5f6b-p6mm4"
                    ],
                    "authentication.kubernetes.io/pod-uid": [
                        "4847f6fa-1e23-4e54-a0ac-f9ec554772dc"
                    ]
                }
            },
            "sourceIPs": [
                "xxxxxxxx"
            ],
            "userAgent": "karpenter",
            "objectRef": {
                "resource": "nodes",
                "name": "ip-10-10-11-197.ec2.internal",
                "apiVersion": "v1"
            },
            "responseStatus": {
                "metadata": {},
                "code": 200
            },
            "requestObject": {
                "metadata": {
                    "annotations": {
                        "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                        "karpenter.sh/managed-by": "karpenter-tmore",
                        "karpenter.sh/provisioner-hash": "5643188492113361487"
                    },
                    "finalizers": [
                        "karpenter.sh/termination"
                    ],
                    "labels": {
                        "ethos.adobe.net/node-templateVersion": "8dff9db0cf7279d779d8c7e3d1737b07bd54803e65357871d9fe09d441bc12",
                        "karpenter.k8s.aws/instance-category": "c",
                        "karpenter.k8s.aws/instance-cpu": "64",
                        "karpenter.k8s.aws/instance-encryption-in-transit-supported": "false",
                        "karpenter.k8s.aws/instance-family": "c6g",
                        "karpenter.k8s.aws/instance-generation": "6",
                        "karpenter.k8s.aws/instance-hypervisor": "nitro",
                        "karpenter.k8s.aws/instance-memory": "131072",
                        "karpenter.k8s.aws/instance-network-bandwidth": "25000",
                        "karpenter.k8s.aws/instance-pods": "737",
                        "karpenter.k8s.aws/instance-size": "16xlarge",
                        "karpenter.sh/capacity-type": "on-demand",
                        "karpenter.sh/provisioner-name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge",
                        "karpenter.sh/registered": "true"
                    },
                    "ownerReferences": [
                        {
                            "apiVersion": "karpenter.sh/v1alpha5",
                            "blockOwnerDeletion": true,
                            "kind": "Machine",
                            "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                            "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46"
                        }
                    ]
                }
            },
            "responseObject": {
                "kind": "Node",
                "apiVersion": "v1",
                "metadata": {
                    "name": "ip-10-10-11-197.ec2.internal",
                    "uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
                    "resourceVersion": "109580",
                    "creationTimestamp": "2023-11-07T02:54:21Z",
                    "labels": {
                        "beta.kubernetes.io/arch": "arm64",
                        "beta.kubernetes.io/instance-type": "c6g.16xlarge",
                        "beta.kubernetes.io/os": "linux",
                        "ethos.adobe.net/node-templateVersion": "8dff9db0cf7279d779d8c7e3d1737b07bd54803e65357871d9fe09d441bc12",
                        "failure-domain.beta.kubernetes.io/region": "us-east-1",
                        "failure-domain.beta.kubernetes.io/zone": "us-east-1a",
                        "karpenter.k8s.aws/instance-category": "c",
                        "karpenter.k8s.aws/instance-cpu": "64",
                        "karpenter.k8s.aws/instance-encryption-in-transit-supported": "false",
                        "karpenter.k8s.aws/instance-family": "c6g",
                        "karpenter.k8s.aws/instance-generation": "6",
                        "karpenter.k8s.aws/instance-hypervisor": "nitro",
                        "karpenter.k8s.aws/instance-memory": "131072",
                        "karpenter.k8s.aws/instance-network-bandwidth": "25000",
                        "karpenter.k8s.aws/instance-pods": "737",
                        "karpenter.k8s.aws/instance-size": "16xlarge",
                        "karpenter.sh/capacity-type": "on-demand",
                        "karpenter.sh/provisioner-name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge",
                        "karpenter.sh/registered": "true",
                        "kubernetes.io/arch": "arm64",
                        "kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
                        "kubernetes.io/os": "linux",
                        "node.kubernetes.io/container-runtime": "cri-o",
                        "node.kubernetes.io/ethos-workload.arm64": "true",
                        "node.kubernetes.io/instance-lifecycle": "normal",
                        "node.kubernetes.io/instance-type": "c6g.16xlarge",
                        "node.kubernetes.io/role": "worker",
                        "topology.kubernetes.io/region": "us-east-1",
                        "topology.kubernetes.io/zone": "us-east-1a"
                    },
                    "annotations": {
                        "alpha.kubernetes.io/provided-node-ip": "10.10.11.197",
                        "karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
                        "karpenter.sh/managed-by": "karpenter-tmore",
                        "karpenter.sh/provisioner-hash": "5643188492113361487",
                        "node.alpha.kubernetes.io/ttl": "0",
                        "volumes.kubernetes.io/controller-managed-attach-detach": "true"
                    },
                    "ownerReferences": [
                        {
                            "apiVersion": "karpenter.sh/v1alpha5",
                            "kind": "Machine",
                            "name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge-zht4r",
                            "uid": "e287ccbe-a808-4a5f-ae38-d03d5d7efc46",
                            "blockOwnerDeletion": true
                        }
                    ],
                    "finalizers": [
                        "karpenter.sh/termination"
                    ],
                    "managedFields": [
                        {
                            "manager": "karpenter",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:karpenter.k8s.aws/nodetemplate-hash": {},
                                        "f:karpenter.sh/managed-by": {},
                                        "f:karpenter.sh/provisioner-hash": {}
                                    },
                                    "f:finalizers": {
                                        ".": {},
                                        "v:\"karpenter.sh/termination\"": {}
                                    },
                                    "f:labels": {
                                        "f:ethos.adobe.net/node-templateVersion": {},
                                        "f:karpenter.k8s.aws/instance-category": {},
                                        "f:karpenter.k8s.aws/instance-cpu": {},
                                        "f:karpenter.k8s.aws/instance-encryption-in-transit-supported": {},
                                        "f:karpenter.k8s.aws/instance-family": {},
                                        "f:karpenter.k8s.aws/instance-generation": {},
                                        "f:karpenter.k8s.aws/instance-hypervisor": {},
                                        "f:karpenter.k8s.aws/instance-memory": {},
                                        "f:karpenter.k8s.aws/instance-network-bandwidth": {},
                                        "f:karpenter.k8s.aws/instance-pods": {},
                                        "f:karpenter.k8s.aws/instance-size": {},
                                        "f:karpenter.sh/capacity-type": {},
                                        "f:karpenter.sh/provisioner-name": {},
                                        "f:karpenter.sh/registered": {}
                                    },
                                    "f:ownerReferences": {
                                        ".": {},
                                        "k:{\"uid\":\"e287ccbe-a808-4a5f-ae38-d03d5d7efc46\"}": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kube-controller-manager",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        "f:node.alpha.kubernetes.io/ttl": {}
                                    }
                                }
                            }
                        },
                        {
                            "manager": "kubelet",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:metadata": {
                                    "f:annotations": {
                                        ".": {},
                                        "f:alpha.kubernetes.io/provided-node-ip": {},
                                        "f:volumes.kubernetes.io/controller-managed-attach-detach": {}
                                    },
                                    "f:labels": {
                                        ".": {},
                                        "f:beta.kubernetes.io/arch": {},
                                        "f:beta.kubernetes.io/instance-type": {},
                                        "f:beta.kubernetes.io/os": {},
                                        "f:failure-domain.beta.kubernetes.io/region": {},
                                        "f:failure-domain.beta.kubernetes.io/zone": {},
                                        "f:kubernetes.io/arch": {},
                                        "f:kubernetes.io/hostname": {},
                                        "f:kubernetes.io/os": {},
                                        "f:node.kubernetes.io/container-runtime": {},
                                        "f:node.kubernetes.io/ethos-workload.arm64": {},
                                        "f:node.kubernetes.io/instance-lifecycle": {},
                                        "f:node.kubernetes.io/instance-type": {},
                                        "f:node.kubernetes.io/role": {},
                                        "f:topology.kubernetes.io/region": {},
                                        "f:topology.kubernetes.io/zone": {}
                                    }
                                },
                                "f:spec": {
                                    "f:providerID": {},
                                    "f:taints": {}
                                }
                            }
                        },
                        {
                            "manager": "kubelet",
                            "operation": "Update",
                            "apiVersion": "v1",
                            "time": "2023-11-07T02:54:21Z",
                            "fieldsType": "FieldsV1",
                            "fieldsV1": {
                                "f:status": {
                                    "f:conditions": {
                                        "k:{\"type\":\"DiskPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"MemoryPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"PIDPressure\"}": {
                                            "f:lastHeartbeatTime": {}
                                        },
                                        "k:{\"type\":\"Ready\"}": {
                                            "f:lastHeartbeatTime": {}
                                        }
                                    }
                                }
                            },
                            "subresource": "status"
                        }
                    ]
                },
                "spec": {
                    "providerID": "aws:///us-east-1a/i-00b70341f59a5bf3b",
                    "taints": [
                        {
                            "key": "ethos.corp.adobe.com/ethos-workload",
                            "value": "arm64",
                            "effect": "NoSchedule"
                        },
                        {
                            "key": "node.kubernetes.io/not-ready",
                            "effect": "NoSchedule"
                        }
                    ]
                },
                "status": {
                    "capacity": {
                        "attachable-volumes-aws-ebs": "39",
                        "cpu": "64",
                        "ephemeral-storage": "194314260Ki",
                        "hugepages-1Gi": "0",
                        "hugepages-2Mi": "0",
                        "hugepages-32Mi": "0",
                        "hugepages-64Ki": "0",
                        "memory": "129565136Ki",
                        "pods": "112"
                    },
                    "allocatable": {
                        "attachable-volumes-aws-ebs": "39",
                        "cpu": "62770m",
                        "ephemeral-storage": "176798320344",
                        "hugepages-1Gi": "0",
                        "hugepages-2Mi": "0",
                        "hugepages-32Mi": "0",
                        "hugepages-64Ki": "0",
                        "memory": "126481872Ki",
                        "pods": "112"
                    },
                    "conditions": [
                        {
                            "type": "MemoryPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasSufficientMemory",
                            "message": "kubelet has sufficient memory available"
                        },
                        {
                            "type": "DiskPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasNoDiskPressure",
                            "message": "kubelet has no disk pressure"
                        },
                        {
                            "type": "PIDPressure",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletHasSufficientPID",
                            "message": "kubelet has sufficient PID available"
                        },
                        {
                            "type": "Ready",
                            "status": "False",
                            "lastHeartbeatTime": "2023-11-07T02:54:21Z",
                            "lastTransitionTime": "2023-11-07T02:54:18Z",
                            "reason": "KubeletNotReady",
                            "message": "[container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: No CNI configuration file in /etc/cni/net.d/. Has your network provider started?, CSINode is not yet initialized]"
                        }
                    ],
                    "addresses": [
                        {
                            "type": "InternalIP",
                            "address": "10.10.11.197"
                        },
                        {
                            "type": "Hostname",
                            "address": "ip-10-10-11-197.ec2.internal"
                        },
                        {
                            "type": "InternalDNS",
                            "address": "ip-10-10-11-197.ec2.internal"
                        }
                    ],
                    "daemonEndpoints": {
                        "kubeletEndpoint": {
                            "Port": 10250
                        }
                    },
                    "nodeInfo": {
                        "machineID": "ec27cd6a9976e04725d174a44773b3fb",
                        "systemUUID": "ec27cd6a-9976-e047-25d1-74a44773b3fb",
                        "bootID": "ccd82a39-d07b-47f5-88f8-cfbab2da5a51",
                        "kernelVersion": "5.15.119-flatcar",
                        "osImage": "Flatcar Container Linux by Kinvolk 3510.2.5 (Oklo)",
                        "containerRuntimeVersion": "cri-o://1.26.4",
                        "kubeletVersion": "v1.26.9",
                        "kubeProxyVersion": "v1.26.9",
                        "operatingSystem": "linux",
                        "architecture": "arm64"
                    }
                }
            },
            "requestReceivedTimestamp": "2023-11-07T02:54:21.798359Z",
            "stageTimestamp": "2023-11-07T02:54:21.842653Z",
            "annotations": {
                "authorization.k8s.io/decision": "allow",
                "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"core-karpenter-core\" of ClusterRole \"core-karpenter-core\" to ServiceAccount \"karpenter/karpenter\""
            }
        }
    }
]
jmdeal commented 10 months ago

Interesting, so the karpenter.sh/capacity-type label was on the node but was later removed (along with the karpenter.k8s.aws/* labels). Could you check the audit log for any other processes that may be removing these labels? Alternatively, you could open a ticket with EKS including your cluster name and account ID so we can investigate directly, assuming this is an EKS cluster.

tmoreadobe commented 10 months ago

Yes this is an EKS cluster. Let me open a ticket with AWS.

jmdeal commented 10 months ago

I spoke with the support team and it sounds like you were able to track down the problem to another process overwriting the node's labels. Have you had any luck in validating this?

tmoreadobe commented 10 months ago

Apologies, I haven't been able to validate this since I am working on something but I will update here by end of today.

tmoreadobe commented 9 months ago

We can close this issue. However, it seems that the way labels are applied changed in versions 0.28 and later, since Karpenter prior to 0.28 was able to work alongside the label-maker pods.

jmdeal commented 9 months ago

The big change for v0.28 was Karpenter's node ownership model with the introduction of the Machine resource. There shouldn't be any difference in what tags are applied (or the order) but they are patched onto the node after it's created rather than being present at creation (at least if you're using a custom AMI).

jonathan-innis commented 9 months ago

overwriting the node's labels

@jmdeal We should think about using SSA (look into it, it's pretty cool and a fun read) so that we can avoid other controllers overwriting our labels without getting conflict errors.

tmoreadobe commented 8 months ago

@jonathan-innis are there any plans to have server side patching for the labels anytime soon? Thanks!

jonathan-innis commented 8 months ago

are there any plans to have server side patching for the labels anytime soon? Thanks

Thinking more about this, I was trying to reason about how SSA-based applies would help here. Did we ever track down the reason why there was a race between Karpenter applying the labels and the other controller? Since Karpenter is applying labels to a map, even without SSA, assuming the other controller is using a Patch call, they shouldn't conflict with each other (unless the other controller is messing with the label keys that Karpenter is managing).

However, if the other controller is performing an Update call, this should use optimistic locking, assuming you aren't just updating the resource version and forcing the override and the update.

Do you know which one the other controller was using that was causing the label to disappear?

tmoreadobe commented 7 months ago

I believe the other controller is doing a patch call. What happens is that when Karpenter wins the race and applies its labels, the other controller is unaware of the labels Karpenter has applied; it just adds its own labels and the Karpenter labels drop out of the picture. The only solution might have been for one of the controllers to do server-side apply, and since the other controller is EOL, patching the code with a busted build process is a hell of a task.

jonathan-innis commented 7 months ago

The PATCH operation in Kubernetes should work like a standard JSON merge patch where it will only produce a change request for the fields that it's actually modifying. When it comes to labels, those fields have different keys, so JSON merge patching should be able to reason about the fact that one call is updating one set of labels and the other is updating another.
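To make the point about disjoint label keys concrete, here is a minimal in-memory sketch of JSON merge patch semantics (RFC 7386). It does not talk to an API server; the `json_merge_patch` helper and the two example patch bodies are written for illustration only, with label keys borrowed from the logs in this thread.

```python
import copy


def json_merge_patch(target, patch):
    """Apply an RFC 7386 JSON merge patch and return the new document."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target outright.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in a merge patch means "delete this key".
            result.pop(key, None)
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result


node = {"metadata": {"labels": {"kubernetes.io/os": "linux"}}}

# Two clients patch different keys of the same labels map.
karpenter_patch = {"metadata": {"labels": {"karpenter.sh/capacity-type": "on-demand"}}}
other_patch = {"metadata": {"labels": {"node.kubernetes.io/role": "worker"}}}

node = json_merge_patch(node, karpenter_patch)
node = json_merge_patch(node, other_patch)
labels = node["metadata"]["labels"]
print(sorted(labels))
```

Because each patch names only the keys it wants to change, the second patch leaves the key written by the first one in place.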

You really only run into race conflicts with K8s objects when:

  1. You have two clients making updates to the same field (this is what SSA was built to fix to some degree)
  2. You have a client that is performing a full update and is not respecting the resource version changes that happened between its GET/PUT operations
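A rough simulation of the second case, assuming a toy in-memory store in place of the API server (`FakeNodeStore` and `Conflict` are hypothetical names, not Kubernetes client types). It shows a stale full update being rejected by the resource version check, and then a full update that carries a fresh resource version but a label set built independently of what is on the object.

```python
import copy


class Conflict(Exception):
    """Stands in for the API server's 409 Conflict response."""


class FakeNodeStore:
    """Hypothetical in-memory stand-in for a single Node object."""

    def __init__(self, labels):
        self._node = {"resourceVersion": 1, "labels": dict(labels)}

    def get(self):
        return copy.deepcopy(self._node)

    def patch_labels(self, labels):
        # Merge-style patch: only the named keys change.
        self._node["labels"].update(labels)
        self._node["resourceVersion"] += 1

    def update(self, node):
        # Full replacement, rejected if the caller's copy is stale.
        if node["resourceVersion"] != self._node["resourceVersion"]:
            raise Conflict("the object has been modified")
        self._node = {
            "resourceVersion": node["resourceVersion"] + 1,
            "labels": dict(node["labels"]),
        }


store = FakeNodeStore({"kubernetes.io/os": "linux"})

stale = store.get()  # client A reads the node
store.patch_labels({"karpenter.sh/capacity-type": "on-demand"})  # client B patches
stale["labels"]["node.kubernetes.io/role"] = "worker"
try:
    store.update(stale)  # client A writes back its stale copy
    rejected = False
except Conflict:
    rejected = True

# Client A retries with a fresh resourceVersion but replaces the labels map
# with a set it computed on its own, ignoring what is currently on the node.
fresh = store.get()
fresh["labels"] = {"kubernetes.io/os": "linux", "node.kubernetes.io/role": "worker"}
store.update(fresh)
final_labels = store.get()["labels"]
print(rejected, sorted(final_labels))
```

In this toy model the optimistic lock does its job on the first attempt, but the second write still removes the label added by the other client, because a full update replaces the whole map.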

From all of the queries up above, I noticed that we didn't ever take a look at UPDATE events. Having those would give us a clear picture here. Given the fact that we can't see the controller that was responsible for removing the labels from the patch events, I suspect that that controller is doing an UPDATE, and seeing the request object that it's sending could help determine whether that controller's logic is incorrect.

since the other controller is EOL, patching the code with busted build process is hell of a task

If the other controller is doing an UPDATE and overwriting the resource version, even a SSA won't help here unfortunately.

tmoreadobe commented 7 months ago

@jonathan-innis Yes, I have the logs. This is from AWS Support, who diagnosed it as follows:

1. The label-maker pod tries to update the node's labels but fails with "Operation cannot be fulfilled on nodes \"ip-10-10-11-197.ec2.internal\": the object has been modified; please apply your changes to the latest version and try again":

"@timestamp": "2023-11-07 02:54:21.882",
"@message": {
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "b1a45439-4ddc-4012-8af4-7380fc2fe505",
"stage": "ResponseComplete",
"requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
"verb": "update",
...
"user": {
"username": "system:serviceaccount:kube-system:label-maker",
...
"responseStatus": {
"metadata": {},
"status": "Failure",
"message": "Operation cannot be fulfilled on nodes \"ip-10-10-11-197.ec2.internal\": the object has been modified; please apply your changes to the latest version and try again",
"reason": "Conflict",
"details": {
"name": "ip-10-10-11-197.ec2.internal",
"kind": "nodes"
},
"code": 409
},
...
"requestObject": {
"kind": "Node",
"apiVersion": "v1",
"metadata": {
"name": "ip-10-10-11-197.ec2.internal",
"uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
"resourceVersion": "109570",
"creationTimestamp": "2023-11-07T02:54:21Z",
"labels": {
"beta.kubernetes.io/arch": "arm64",
"beta.kubernetes.io/instance-type": "c6g.16xlarge",
"beta.kubernetes.io/os": "linux",
"failure-domain.beta.kubernetes.io/region": "us-east-1",
"failure-domain.beta.kubernetes.io/zone": "us-east-1a",
"kubernetes.io/arch": "arm64",
"kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
"kubernetes.io/os": "linux",
"node-role.kubernetes.io/worker": "",
"node.kubernetes.io/container-runtime": "cri-o",
"node.kubernetes.io/ethos-workload.arm64": "true",
"node.kubernetes.io/instance-lifecycle": "normal",
"node.kubernetes.io/instance-type": "c6g.16xlarge",
"node.kubernetes.io/role": "worker",
"topology.kubernetes.io/region": "us-east-1",
"topology.kubernetes.io/zone": "us-east-1a"
}
...

2. Karpenter patches the labels on the node:

"@timestamp": "2023-11-07 02:54:21.883",
"@message": {
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "b8975178-7d38-49a8-8666-3819336ccd5b",
"stage": "ResponseComplete",
"requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
"verb": "patch",
"user": {
"username": "system:serviceaccount:karpenter:karpenter",
...
"responseStatus": {
"metadata": {},
"code": 200
},
"requestObject": {
"metadata": {
"annotations": {
"karpenter.k8s.aws/nodetemplate-hash": "9451254827736767782",
"karpenter.sh/managed-by": "karpenter-tmore",
"karpenter.sh/provisioner-hash": "5643188492113361487"
},
"finalizers": [
"karpenter.sh/termination"
],
"labels": {
"ethos.adobe.net/node-templateVersion": "8dff9db0cf7279d779d8c7e3d1737b07bd54803e65357871d9fe09d441bc12",
"karpenter.k8s.aws/instance-category": "c",
"karpenter.k8s.aws/instance-cpu": "64",
"karpenter.k8s.aws/instance-encryption-in-transit-supported": "false",
"karpenter.k8s.aws/instance-family": "c6g",
"karpenter.k8s.aws/instance-generation": "6",
"karpenter.k8s.aws/instance-hypervisor": "nitro",
"karpenter.k8s.aws/instance-memory": "131072",
"karpenter.k8s.aws/instance-network-bandwidth": "25000",
"karpenter.k8s.aws/instance-pods": "737",
"karpenter.k8s.aws/instance-size": "16xlarge",
"karpenter.sh/capacity-type": "on-demand",
"karpenter.sh/provisioner-name": "core-karpenter-ethos-core-karpenter-armc6g4xlarge",
"karpenter.sh/registered": "true"
},
...

3. Then the label-maker pod updates the node again shortly after, but without the labels from Karpenter:

"@timestamp": "2023-11-07 02:54:22.439",
"@message": {
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "bd7356c9-afa7-4e9c-8a76-24142503acdc",
"stage": "ResponseComplete",
"requestURI": "/api/v1/nodes/ip-10-10-11-197.ec2.internal",
"verb": "update",
"user": {
"username": "system:serviceaccount:kube-system:label-maker",
…
"responseStatus": {
"metadata": {},
"code": 200
},
"requestObject": {
"kind": "Node",
"apiVersion": "v1",
"metadata": {
"name": "ip-10-10-11-197.ec2.internal",
"uid": "5da78df1-3a7b-4186-8773-e3a22e695735",
"resourceVersion": "109580",
"creationTimestamp": "2023-11-07T02:54:21Z",
"labels": {
"beta.kubernetes.io/arch": "arm64",
"beta.kubernetes.io/instance-type": "c6g.16xlarge",
"beta.kubernetes.io/os": "linux",
"failure-domain.beta.kubernetes.io/region": "us-east-1",
"failure-domain.beta.kubernetes.io/zone": "us-east-1a",
"kubernetes.io/arch": "arm64",
"kubernetes.io/hostname": "ip-10-10-11-197.ec2.internal",
"kubernetes.io/os": "linux",
"node-role.kubernetes.io/worker": "",
"node.kubernetes.io/container-runtime": "cri-o",
"node.kubernetes.io/ethos-workload.arm64": "true",
"node.kubernetes.io/instance-lifecycle": "normal",
"node.kubernetes.io/instance-type": "c6g.16xlarge",
"node.kubernetes.io/role": "worker",
"topology.kubernetes.io/region": "us-east-1",
"topology.kubernetes.io/zone": "us-east-1a"
},
tmoreadobe commented 7 months ago

This has been verified once the deployment for label-maker has been scaled down, karpenter labels have been applied just fine.

jonathan-innis commented 7 months ago

This has been verified once the deployment for label-maker has been scaled down, karpenter labels have been applied just fine

If label-maker is OSS, it would be nice to link to the portion where it is patching details onto the nodes to see exactly what logic they have implemented. Either way, it sounds like even SSA wouldn't have mattered here. I think it still makes sense for us to do it for other reasons but I don't think anything could have stopped this issue on the Karpenter-side if label-maker is doing an UPDATE while not respecting the resource version.
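For contrast, a sketch of what a label-preserving read-modify-write loop could look like on the other controller's side. This is not label-maker's actual code; `FakeNodeStore`, `Conflict`, `racing_writes`, and `set_labels_with_retry` are all hypothetical and only simulate the API server's resource version check in memory.

```python
import copy


class Conflict(Exception):
    """Stands in for the API server's 409 Conflict response."""


class FakeNodeStore:
    """Hypothetical in-memory stand-in for a single Node object."""

    def __init__(self, labels):
        self._node = {"resourceVersion": 1, "labels": dict(labels)}
        # Label dicts that another writer lands just before our next update.
        self.racing_writes = []

    def get(self):
        return copy.deepcopy(self._node)

    def update(self, node):
        if self.racing_writes:
            self._node["labels"].update(self.racing_writes.pop(0))
            self._node["resourceVersion"] += 1
        if node["resourceVersion"] != self._node["resourceVersion"]:
            raise Conflict("the object has been modified")
        self._node = {
            "resourceVersion": node["resourceVersion"] + 1,
            "labels": dict(node["labels"]),
        }


def set_labels_with_retry(store, desired, max_attempts=5):
    """Re-read on every attempt and merge our keys into the current labels."""
    for _ in range(max_attempts):
        node = store.get()
        node["labels"].update(desired)  # keep labels owned by other writers
        try:
            store.update(node)
            return node["labels"]
        except Conflict:
            continue  # someone else wrote first; start again from a fresh read
    raise Conflict("gave up after %d attempts" % max_attempts)


store = FakeNodeStore({"kubernetes.io/os": "linux"})
store.racing_writes.append({"karpenter.sh/capacity-type": "on-demand"})
result = set_labels_with_retry(store, {"node.kubernetes.io/role": "worker"})
print(sorted(result))
```

The first attempt hits the simulated conflict, and the retry starts from the freshly read labels, so the key written by the racing client survives the final update.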

tmoreadobe commented 7 months ago

Yes, let me add here what label-maker is doing: https://github.com/tmoreadobe/label-maker/blob/master/pkg/controller/labelmaker/labelmaker_controller.go#L107-L136