aws / karpenter-provider-aws

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.
https://karpenter.sh
Apache License 2.0

Report no error when EKS Pod Identity agent missing #6359

Closed adaam closed 2 weeks ago

adaam commented 3 weeks ago

Description

Observed Behavior: When the Pod Identity agent is missing from the EKS cluster, Karpenter (with logLevel debug) only logs "discovered karpenter version" and then nothing else. It hangs for some time and then calls exit(1).

Expected Behavior: Looking at the code, it should report an error from the EC2 API connectivity check. When I test Pod Identity with the same service account and curl http://169.254.170.23/v1/credentials, the request just hangs. Could this be an issue with a missing timeout?

Reproduction Steps (Please include YAML):

  1. Use an EKS cluster without the Pod Identity agent (a fresh install where the agent was installed and then removed)
  2. Install the Karpenter AWS resources (I'm using Terraform, like here)
  3. Install Karpenter with Helm

Versions:
    • Chart Version: 0.37.0, 0.36.1
    • Kubernetes Version (kubectl version): 1.30
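
The curl test described above hangs because nothing bounds how long the client waits. A minimal sketch of the same probe with an explicit timeout follows. The function name `probe_endpoint`, the 2-second default, and the returned status strings are illustrative assumptions; this is not how Karpenter or the AWS SDK performs the check, and any authorization header the endpoint may require is left to the caller.

```python
import socket
import urllib.error
import urllib.request


def probe_endpoint(url, timeout=2.0, headers=None):
    """Return a short status string instead of blocking forever on a silent endpoint.

    Hypothetical helper: the name and the return values are illustrative only.
    """
    request = urllib.request.Request(url, headers=headers or {})
    # Skip any configured HTTP proxy so link-local addresses are contacted directly.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(request, timeout=timeout) as response:
            return f"http {response.status}"
    except urllib.error.HTTPError as err:
        # The endpoint answered, but with an error status (for example 401 or 404).
        return f"http {err.code}"
    except urllib.error.URLError as err:
        # Connection-phase failures are wrapped in URLError.
        if isinstance(err.reason, socket.timeout):
            return "timeout"
        return f"unreachable: {err.reason}"
    except socket.timeout:
        # A timeout while waiting for the response is raised unwrapped.
        return "timeout"
    except OSError as err:
        return f"unreachable: {err}"
```

Calling `probe_endpoint("http://169.254.170.23/v1/credentials")` from inside the pod would then return a status string after at most roughly the timeout for each network phase, rather than hanging the way the unbounded curl did.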
jmdeal commented 2 weeks ago

I can't seem to reproduce this. Removing the Pod Identity agent in a 1.30 cluster results in the following error when I restart my controller:

{"level":"DEBUG","time":"2024-06-14T22:33:24.704Z","logger":"controller","message":"discovered karpenter version","commit":"f7225ed-dirty","version":"0.37.0-15-gf7225edf"}
{"level":"ERROR","time":"2024-06-14T22:33:25.001Z","logger":"controller","message":"ec2 api connectivity check failed","commit":"f7225ed-dirty","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}

Is this consistently reproducible? The only times Karpenter explicitly calls exit(1) are preceded by a log line.
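
Since an explicit exit(1) should be preceded by a log line, one way to check captured controller output is to filter the JSON log lines for ERROR entries. The sketch below does that against the two lines quoted above. The helper name `error_entries` is an assumption made for illustration, and the only log fields it relies on are the ones visible in those lines.

```python
import json

# The two controller log lines quoted in this comment, kept raw so the JSON escapes survive.
SAMPLE_LOG = r"""
{"level":"DEBUG","time":"2024-06-14T22:33:24.704Z","logger":"controller","message":"discovered karpenter version","commit":"f7225ed-dirty","version":"0.37.0-15-gf7225edf"}
{"level":"ERROR","time":"2024-06-14T22:33:25.001Z","logger":"controller","message":"ec2 api connectivity check failed","commit":"f7225ed-dirty","error":"NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"}
"""


def error_entries(log_text):
    """Return the parsed JSON log records whose level is ERROR (hypothetical helper)."""
    entries = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON log line.
            continue
        if isinstance(record, dict) and record.get("level") == "ERROR":
            entries.append(record)
    return entries


for entry in error_entries(SAMPLE_LOG):
    print(entry["time"], entry["message"])
```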

adaam commented 2 weeks ago

@jmdeal You are right. I tested it by removing the Pod Identity agent (which was originally installed), and it does show the same log you listed. I tried again today in another cluster that needs Karpenter. I installed it with no Pod Identity agent, and the issue did happen.

{"level":"DEBUG","time":"2024-06-15T06:17:00.576Z","logger":"controller","message":"discovered karpenter version","commit":"490ef94","version":"0.37.0"}
{"level":"DEBUG","time":"2024-06-15T06:19:01.034Z","logger":"controller","message":"discovered karpenter version","commit":"490ef94","version":"0.37.0"}
{"level":"DEBUG","time":"2024-06-15T06:21:01.555Z","logger":"controller","message":"discovered karpenter version","commit":"490ef94","version":"0.37.0"}
adaam commented 2 weeks ago

Sorry, after checking the log of the previous container, it does show the error there. I will close this ticket since it's working correctly.
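
For anyone who hits the same confusion: after a restart, the current container's log only contains the startup line, and the error sits in the previous container's log. A minimal sketch of fetching that log with `kubectl logs --previous` is shown below. The helper names are hypothetical, and the pod name, namespace, and container are placeholders to replace with your own values.

```python
import subprocess


def previous_logs_command(pod, namespace, container=None):
    """Build the kubectl arguments for reading a pod's previous container log."""
    command = ["kubectl", "logs", "--previous", "--namespace", namespace, pod]
    if container:
        command += ["--container", container]
    return command


def fetch_previous_logs(pod, namespace, container=None, timeout=30):
    """Run kubectl and return the previous container's log text.

    Assumes kubectl is installed and configured for the target cluster.
    """
    result = subprocess.run(
        previous_logs_command(pod, namespace, container),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if kubectl exits non-zero
    )
    return result.stdout
```

For example, `fetch_previous_logs("karpenter-xxxxx", "kube-system")` (both values are placeholders) would return the log of the container instance that exited, which is where the connectivity error ended up in this case.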