oomichi / try-kubernetes


Need some event which shows container exited with non-zero return code #129

Closed oomichi closed 10 months ago

oomichi commented 1 year ago

All events are defined on https://github.com/kubernetes/kubernetes/blob/0e077bb7ac898555b7bb968fee8115aa738bde34/pkg/kubelet/events/event.go

oomichi commented 1 year ago

Normal deployment events:

4m37s       Normal   Pulling                   pod/nginx-deployment-7759cfdc55-2xczh    Pulling image "nginx:1.7.9"
4m21s       Normal   Pulled                    pod/nginx-deployment-7759cfdc55-2xczh    Successfully pulled image "nginx:1.7.9" in 16.244892095s
4m21s       Normal   Created                   pod/nginx-deployment-7759cfdc55-2xczh    Created container nginx
4m20s       Normal   Started                   pod/nginx-deployment-7759cfdc55-2xczh    Started container nginx
4m38s       Normal   SuccessfulCreate          replicaset/nginx-deployment-7759cfdc55   Created pod: nginx-deployment-7759cfdc55-2xczh
4m38s       Normal   ScalingReplicaSet         deployment/nginx-deployment              Scaled up replica set nginx-deployment-7759cfdc55 to 1

oomichi commented 1 year ago

Pod which exits after 10 seconds:

$ cat pod-busybox.yaml 
apiVersion: v1
kind: Pod
metadata:
  name: pod-busybox
spec:
  containers:
  - name: mycontainer
    image: busybox
    command: ["sleep", "10"]

and the events are:

20s         Normal    Pulling                   pod/pod-busybox                          Pulling image "busybox"
95s         Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 6.943284524s
18s         Normal    Created                   pod/pod-busybox                          Created container mycontainer
18s         Normal    Started                   pod/pod-busybox                          Started container mycontainer
83s         Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 2.106825944s
8s          Warning   BackOff                   pod/pod-busybox                          Back-off restarting failed container
57s         Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 2.273737356s
18s         Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 2.099255744s

oomichi commented 1 year ago

By default, the output of kubectl get events is not sorted by timestamp, so --sort-by needs to be specified:

$ kubectl get events --sort-by=.lastTimestamp
LAST SEEN   TYPE      REASON                    OBJECT                                   MESSAGE
7m33s       Normal    Scheduled                 pod/pod-busybox                          Successfully assigned default/pod-busybox to kind-control-plane
7m25s       Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 6.943284524s
7m13s       Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 2.106825944s
6m47s       Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 2.273737356s
6m8s        Normal    Started                   pod/pod-busybox                          Started container mycontainer
6m8s        Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 2.099255744s
5m17s       Normal    Pulling                   pod/pod-busybox                          Pulling image "busybox"
5m15s       Normal    Created                   pod/pod-busybox                          Created container mycontainer
5m15s       Normal    Pulled                    pod/pod-busybox                          Successfully pulled image "busybox" in 1.969457928s
2m20s       Warning   BackOff                   pod/pod-busybox                          Back-off restarting failed container
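
To illustrate the ordering issue, here is a minimal sketch that sorts a few rows of the unsorted listing from the earlier comment by their LAST SEEN age. The helper name age_to_seconds is hypothetical, and the parser only assumes the d/h/m/s age format seen in the output above; it is not how kubectl itself sorts.

```python
import re

def age_to_seconds(age: str) -> int:
    """Convert a kubectl-style age such as '7m33s' or '95s' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    # Reject anything that is not a plain sequence of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(n) * units[u] for n, u in parts)

# (LAST SEEN, REASON) pairs copied from the unsorted pod-busybox listing.
events = [
    ("20s", "Pulling"),
    ("95s", "Pulled"),
    ("18s", "Created"),
    ("18s", "Started"),
    ("83s", "Pulled"),
    ("8s", "BackOff"),
]

# Oldest first, which is the order --sort-by=.lastTimestamp produced above.
ordered = sorted(events, key=lambda e: age_to_seconds(e[0]), reverse=True)
for age, reason in ordered:
    print(f"{age:<8}{reason}")
```

With the rows in this order, the BackOff warning ends up last, after the Created and Started events it follows.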

oomichi commented 1 year ago

  1. Pulling: events.PullingImage(Pulling) is recorded in EnsureImageExists() of image_manager.go
  2. Pulled: events.PulledImage(Pulled) is recorded in EnsureImageExists() of image_manager.go
  3. Created: events.CreatedContainer(Created) is recorded in startContainer() of kuberuntime_container.go

Call sequence:

startContainer() in kuberuntime/kuberuntime_container.go

  1. Call m.imagePuller.EnsureImageExists()
  2. -> events.PullingImage(Pulling) is recorded in EnsureImageExists()
  3. -> events.PulledImage(Pulled) is recorded in EnsureImageExists()
  4. events.CreatedContainer(Created) is recorded

oomichi commented 1 year ago

kubelet creates a container in a pod with the following steps in SyncPod():

// SyncPod syncs the running pod into the desired pod by executing following steps:
//
//  1. Compute sandbox and container changes.
//  2. Kill pod sandbox if necessary.
//  3. Kill any containers that should not be running.
//  4. Create sandbox if necessary.
//  5. Create ephemeral containers.
//  6. Create init containers.
//  7. Resize running containers (if InPlacePodVerticalScaling==true)
//  8. Create normal containers.

and events are recorded only when a container is created, through the call to startContainer().

oomichi commented 1 year ago

Creating k8s env from Kubespray..

oomichi commented 1 year ago

When killing a container in a pod, the kubelet log (journalctl -u kubelet) contains:

Mar 06 02:39:25 k8s-kubespray kubelet[749]: I0306 02:39:25.544143     749 generic.go:296] "Generic (PLEG): container finished" podID=7fa1d0ed-6457-4bb9-98e5-0c005f88536e containerID="4d6461bb7e043106281c1bcbeaab26f76b7e861ff6b8e59516d0e53da9a34a1a" exitCode=137
Mar 06 02:39:25 k8s-kubespray kubelet[749]: I0306 02:39:25.544175     749 kubelet.go:2134] "SyncLoop (PLEG): event for pod" pod="default/pod-busybox" event=&{ID:7fa1d0ed-6457-4bb9-98e5-0c005f88536e Type:ContainerDied Data:4d6461bb7e043106281c1bcbeaab26f76b7e861ff6b8e59516d0e53da9a34a1a}
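
Since the exit code currently shows up only in the kubelet log, a small sketch of pulling it out of the "container finished" line may help. It assumes the key=value layout seen in the log line above; the function name parse_container_finished is hypothetical, and the format may differ between kubelet versions.

```python
import re

# klog-style key=value pairs; values are either "quoted" or bare tokens.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')

def parse_container_finished(line: str):
    """Return podID, containerID and exitCode from a PLEG
    'container finished' line, or None if the line does not match."""
    if "container finished" not in line:
        return None
    fields = {key: value.strip('"') for key, value in FIELD.findall(line)}
    if "exitCode" not in fields:
        return None
    return {
        "podID": fields.get("podID"),
        "containerID": fields.get("containerID"),
        "exitCode": int(fields["exitCode"]),
    }

# The first log line quoted above.
line = (
    'Mar 06 02:39:25 k8s-kubespray kubelet[749]: I0306 02:39:25.544143     749 generic.go:296] '
    '"Generic (PLEG): container finished" podID=7fa1d0ed-6457-4bb9-98e5-0c005f88536e '
    'containerID="4d6461bb7e043106281c1bcbeaab26f76b7e861ff6b8e59516d0e53da9a34a1a" exitCode=137'
)
info = parse_container_finished(line)
print(info)
```

Scraping the journal like this works for debugging on a node, but it is not visible through kubectl get events, which is the gap this issue is about.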

oomichi commented 1 year ago

Does containerStatuses.state.terminated.reason show when an OOMKilled happens? Do we need to add more info to it to know more details?
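
For reference, a rough sketch of reading the terminated state from a pod object such as the output of kubectl get pod -o json. The status fragment below is a hypothetical example written for illustration, not captured from the cluster; the field names (containerStatuses, state, lastState, terminated, exitCode, reason) are the ones in the Pod status API.

```python
def terminated_containers(pod: dict):
    """Yield (name, exitCode, reason) for each container whose current
    or previous state is 'terminated'."""
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # A restarting container reports the exit under lastState,
        # while state has already moved on to waiting or running.
        for key in ("state", "lastState"):
            term = cs.get(key, {}).get("terminated")
            if term:
                yield cs["name"], term.get("exitCode"), term.get("reason")
                break

# Hypothetical status fragment, shaped like `kubectl get pod -o json` output.
pod = {
    "status": {
        "containerStatuses": [
            {
                "name": "mycontainer",
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
                "lastState": {"terminated": {"exitCode": 137, "reason": "OOMKilled"}},
            }
        ]
    }
}
found = list(terminated_containers(pod))
print(found)
```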

oomichi commented 1 year ago

It is fine to keep OOMKilled as is. I just need to concentrate on recording non-zero exit codes in Events, with some details.
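
As a starting point for what "some details" could look like, here is a sketch of composing an event message from an exit code. Everything in it is an assumption for discussion: the function name describe_exit, the message wording, and the small signal table are hypothetical and not part of kubelet. It relies on the common convention that an exit code above 128 means the process was terminated by signal (code - 128), which matches the exitCode=137 (SIGKILL) seen in the kubelet log.

```python
# Common Linux signal numbers; a hypothetical, deliberately short table.
SIGNALS = {2: "SIGINT", 6: "SIGABRT", 9: "SIGKILL", 11: "SIGSEGV", 15: "SIGTERM"}

def describe_exit(container: str, exit_code: int):
    """Return a message for a non-zero exit code, or None for a clean exit."""
    if exit_code == 0:
        return None
    detail = ""
    if exit_code > 128:
        # Convention: 128 + N means the process was terminated by signal N.
        signum = exit_code - 128
        name = SIGNALS.get(signum, f"signal {signum}")
        detail = f" (terminated by {name})"
    return f"Container {container} exited with code {exit_code}{detail}"

print(describe_exit("mycontainer", 137))
print(describe_exit("mycontainer", 1))
```

A clean exit returns None so that no extra event would be emitted for containers that finish normally.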