cilium / cilium-cli

CLI to install, manage & troubleshoot Kubernetes clusters running Cilium
https://cilium.io
Apache License 2.0

CI: GKE test failing due to Cilium test pods getting evicted #2594

Open giorio94 opened 2 weeks ago

giorio94 commented 2 weeks ago

CI failure

Hit on https://github.com/cilium/cilium-cli/pull/2591
Link: https://github.com/cilium/cilium-cli/actions/runs/9478248176/job/26139294496?pr=2591
Sysdump: cilium-sysdump-out.zip

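From the sysdump: echo-other-node-5d67f9786b-k86xh was evicted from node gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd because the node was low on ephemeral-storage, and its replacement (echo-other-node-5d67f9786b-wgklc) is still Pending with no node assigned. Pod listing followed by the relevant events: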

NAMESPACE     NAME                                                             READY   STATUS      RESTARTS   AGE     IP             NODE                                                  NOMINATED NODE   READINESS GATES
cilium-test   client-69748f45d8-5mj4r                                          1/1     Running     0          5m27s   10.84.97.177   gke-cilium-cilium-cli-94-default-pool-d56030b3-cdp6   <none>           <none>
cilium-test   client2-ccd7b8bdf-n7ffh                                          1/1     Running     0          5m27s   10.84.97.55    gke-cilium-cilium-cli-94-default-pool-d56030b3-cdp6   <none>           <none>
cilium-test   client3-868f7b8f6b-t2vvx                                         1/1     Running     0          5m26s   10.84.96.185   gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd   <none>           <none>
cilium-test   echo-other-node-5d67f9786b-k86xh                                 0/2     Completed   0          5m26s   <none>         gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd   <none>           <none>
cilium-test   echo-other-node-5d67f9786b-wgklc                                 0/2     Pending     0          5m7s    <none>         <none>                                                <none>           <none>
cilium-test   echo-same-node-6698bd45b-4t7bj                                   2/2     Running     0          5m27s   10.84.97.84    gke-cilium-cilium-cli-94-default-pool-d56030b3-cdp6   <none>           <none>
cilium-test   host-netns-9jl8x                                                 1/1     Running     0          5m26s   10.84.89.195   gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd   <none>           <none>
cilium-test   host-netns-pgjn4                                                 1/1     Running     0          5m26s   10.84.89.196   gke-cilium-cilium-cli-94-default-pool-d56030b3-cdp6   <none>           <none>
- count: 1
  eventTime: null
  firstTimestamp: "2024-06-12T16:18:23Z"
  involvedObject:
    apiVersion: v1
    kind: Pod
    name: echo-other-node-5d67f9786b-k86xh
    namespace: cilium-test
    resourceVersion: "3439"
    uid: 09fd1523-f1e9-449e-8b11-b625c48c13e6
  lastTimestamp: "2024-06-12T16:18:23Z"
  message: 'The node was low on resource: ephemeral-storage. Threshold quantity: 609489314,
    available: 496400Ki. '
  metadata:
    creationTimestamp: "2024-06-12T16:18:24Z"
    name: echo-other-node-5d67f9786b-k86xh.17d84e139dba3e2e
    namespace: cilium-test
  reason: Evicted
  reportingComponent: kubelet
  reportingInstance: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd
  source:
    component: kubelet
    host: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd
  type: Warning

- count: 2
  eventTime: null
  firstTimestamp: "2024-06-12T16:18:23Z"
  involvedObject:
    kind: Node
    name: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd
    uid: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd
  lastTimestamp: "2024-06-12T16:18:32Z"
  message: Attempting to reclaim ephemeral-storage
  metadata:
    name: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd.17d84e139809e0d2
    namespace: default
    resourceVersion: "736"
    uid: f51f5c3b-3fc4-4275-b451-d1f156a93d91
  reason: EvictionThresholdMet
  reportingComponent: kubelet
  reportingInstance: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd
  source:
    component: kubelet
    host: gke-cilium-cilium-cli-94-default-pool-d56030b3-l3rd
  type: Warning
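
The eviction message mixes units (the threshold is a plain byte count, the available amount is in Ki), so the gap is not obvious at a glance. A minimal sketch to compare the two values, assuming the usual Kubernetes convention that `Ki` means 1024 bytes; the helper name `quantity_to_bytes` is made up for illustration and only handles plain integers and binary suffixes:

```python
# Binary suffixes used by Kubernetes resource quantities (subset only).
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def quantity_to_bytes(quantity: str) -> int:
    """Convert a plain integer or binary-suffixed quantity string to bytes.

    Hypothetical helper for this issue; it does not cover decimal
    suffixes (k, M, G) or fractional values.
    """
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


# Values copied from the Evicted event message above.
threshold = quantity_to_bytes("609489314")
available = quantity_to_bytes("496400Ki")
shortfall = threshold - available

print(f"threshold: {threshold} bytes")
print(f"available: {available} bytes")
print(f"shortfall: {shortfall} bytes")
print(f"below threshold: {available < threshold}")
```

So the node had roughly 508 MB of ephemeral storage available against a threshold of roughly 609 MB, i.e. about 101 MB short when the kubelet started evicting.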