SAP-archive / karydia

Kubernetes Security Walnut

Pod Deletion Gets Stuck #250

Closed: Neumann-Nils closed this issue 4 years ago

Neumann-Nils commented 4 years ago

Description

With Karydia running in the cluster, pod deletion can get stuck: the pod stays in the Terminating state indefinitely. Scaling Karydia down to 0 replicas, for example, allows the pending deletion to complete.

Steps to reproduce

  1. Create a pod in a namespace other than default (e.g. test) on a fresh cluster without Karydia running
  2. Install Karydia
  3. Delete the pod (a minimal command sequence is sketched after this list)
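
A minimal sketch of these steps, assuming the busybox pod from the output below; the Karydia install step itself is omitted, see the repository README for the supported install method:

$ kubectl create namespace test
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
  namespace: test
spec:
  containers:
  - name: busybox
    image: busybox
    args: ["sleep", "1000000"]
EOF
$ # ... install Karydia (see README) ...
$ kubectl delete pod busybox-sleep -n test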

The same behavior can be observed when Karydia is first scaled down to 0 replicas and later scaled back up to e.g. 1 replica (in practice this can happen when e.g. hibernating a cluster).
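
A sketch of that scale-down/scale-up cycle, assuming Karydia runs as a Deployment named karydia in a karydia namespace (the actual name and namespace depend on how it was installed):

$ kubectl scale deployment karydia --replicas=0 -n karydia
$ kubectl scale deployment karydia --replicas=1 -n karydia
$ kubectl delete pod busybox-sleep -n test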

Expected behavior

When I delete a pod, it should be deleted within an acceptable timeframe (roughly its termination grace period, 30s here) and should not get stuck in the Terminating state.
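
One way to verify this, assuming the pod above, is to issue the delete without waiting and then wait for the pod to disappear with a timeout a bit above the grace period:

$ kubectl delete pod busybox-sleep -n test --wait=false
$ kubectl wait --for=delete pod/busybox-sleep -n test --timeout=60s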

Logs / console output / screenshots / affected lines of code

$ kubectl delete pod busybox-sleep -n test
pod "busybox-sleep" deleted
$ kubectl get pods -n test
NAME            READY   STATUS        RESTARTS   AGE
busybox-sleep   1/1     Terminating   0          2m
$ kubectl describe pod busybox-sleep -n test
Name:                      busybox-sleep
Namespace:                 test
Priority:                  0
Node:                      ip-10-250-5-86.eu-central-1.compute.internal/10.250.5.86
Start Time:                Thu, 23 Jan 2020 10:15:38 +0100
Labels:                    <none>
Annotations:               cni.projectcalico.org/podIP: 100.96.0.55/32
                           kubectl.kubernetes.io/last-applied-configuration:
                             {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"busybox-sleep","namespace":"test"},"spec":{"containers":[{"args":["sl...
                           kubernetes.io/psp: gardener.kube-system.calico
Status:                    Terminating (lasts 29s)
Termination Grace Period:  30s
IP:                        100.96.0.55
IPs:
  IP:  100.96.0.55
Containers:
  busybox:
    Container ID:  docker://93d00d63bbed2e5a727205e3a7639192daaf4ab7ed801496f57d5aa53e5352ff
    Image:         busybox
    Image ID:      docker-pullable://busybox@sha256:6915be4043561d64e0ab0f8f098dc2ac48e077fe23f488ac24b665166898115a
    Port:          <none>
    Host Port:     <none>
    Args:
      sleep
      1000000
    State:          Running
      Started:      Thu, 23 Jan 2020 10:15:40 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:         <none>
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:            <none>
QoS Class:          BestEffort
Node-Selectors:     <none>
Tolerations:        node.kubernetes.io/not-ready:NoExecute for 300s
                    node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age        From                                                   Message
  ----    ------     ----       ----                                                   -------
  Normal  Scheduled  <unknown>  default-scheduler                                      Successfully assigned test/busybox-sleep to ip-10-250-5-86.eu-central-1.compute.internal
  Normal  Pulling    2m23s      kubelet, ip-10-250-5-86.eu-central-1.compute.internal  Pulling image "busybox"
  Normal  Pulled     2m22s      kubelet, ip-10-250-5-86.eu-central-1.compute.internal  Successfully pulled image "busybox"
  Normal  Created    2m22s      kubelet, ip-10-250-5-86.eu-central-1.compute.internal  Created container busybox
  Normal  Started    2m22s      kubelet, ip-10-250-5-86.eu-central-1.compute.internal  Started container busybox
  Normal  Killing    59s        kubelet, ip-10-250-5-86.eu-central-1.compute.internal  Stopping container busybox
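
For anyone debugging this, a few commands that may help narrow down what is blocking the deletion; the pod's finalizers and the registered webhook configurations are the usual suspects (the karydia label selector and namespace below are assumptions):

$ kubectl get pod busybox-sleep -n test -o jsonpath='{.metadata.finalizers}'
$ kubectl get mutatingwebhookconfigurations,validatingwebhookconfigurations
$ kubectl logs -l app=karydia -n karydia   # label and namespace are assumptions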

Environment