kubernetes-sigs / image-builder

Tools for building Kubernetes disk images
https://image-builder.sigs.k8s.io/
Apache License 2.0

Protect containerd processes from getting oomkilled #112

Closed dims closed 4 years ago

dims commented 4 years ago

The Kubernetes kubelet's dockershim sets oom_score_adj for the docker processes to -999 to protect them from getting killed: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/container_manager_linux.go#L774-L796

However, for other CRIs like containerd, the kubelet does not know the names of the processes or their PIDs, and hence does NOT set oom_score_adj: https://github.com/kubernetes/kubernetes/issues/86420
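For readers unfamiliar with the mechanism: on Linux, oom_score_adj is a per-process value exposed under procfs, so "setting" it amounts to writing an integer to `/proc/<pid>/oom_score_adj`. The sketch below is only an illustration of that mechanism, not the kubelet's or containerd's code. The function names and the `proc_root` parameter are hypothetical, added so the example can be tried against a scratch directory.

```python
from pathlib import Path

# Range accepted by the kernel for oom_score_adj.
OOM_SCORE_ADJ_MIN = -1000
OOM_SCORE_ADJ_MAX = 1000


def oom_score_adj_path(pid: int, proc_root: str = "/proc") -> Path:
    """Return the procfs file holding the OOM score adjustment for pid."""
    return Path(proc_root) / str(pid) / "oom_score_adj"


def read_oom_score_adj(pid: int, proc_root: str = "/proc") -> int:
    """Read the current adjustment for pid as an integer."""
    return int(oom_score_adj_path(pid, proc_root).read_text().strip())


def write_oom_score_adj(pid: int, value: int, proc_root: str = "/proc") -> None:
    """Write a new adjustment for pid, rejecting out-of-range values."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be in [-1000, 1000], got {value}")
    oom_score_adj_path(pid, proc_root).write_text(f"{value}\n")
```

On a real host, lowering the value for another process typically requires root, and the PID of the containerd process has to be looked up first, which is exactly the information the kubelet lacks here.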

The guidance from the containerd folks is for packagers/admins to do this themselves: https://github.com/containerd/containerd/issues/3901

Since we ship containerd by default and we install containerd in all our images, we should set this ourselves by default in image-builder itself.

One pattern for setting this with Ansible is (found quickly with a Google search, as I don't know much about Ansible, so there may be other patterns): https://chuckyz.wordpress.com/2016/12/28/centos-7-disabling-oomkiller-for-a-process/

Let's please do this!

detiber commented 4 years ago

We're already talking about adding a systemd drop-in file for containerd to help with issues we are seeing around bootstrapping race conditions. It seems like it would be relatively simple to also add:

[Service]
OOMScoreAdjust=-999

to that drop-in file as well.

@figo @akutz thoughts?

detiber commented 4 years ago

kubernetes-sigs/cluster-api#1714 is the race condition issue I was referencing

dims commented 4 years ago

@detiber works for me!

akutz commented 4 years ago

Hi @figo,

For your convenience, here's the documentation for OOMScoreAdjust:

Sets the adjustment value for the Linux kernel's Out-Of-Memory (OOM) killer score for executed processes. Takes an integer between -1000 (to disable OOM killing of processes of this unit) and 1000 (to make killing of processes of this unit under memory pressure very likely). See proc.txt for details. If not specified defaults to the OOM score adjustment level of the service manager itself, which is normally at 0.

Use the OOMPolicy= setting of service units to configure how the service manager shall react to the kernel OOM killer terminating a process of the service. See systemd.service(5) for details.
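To make the documented range concrete, here is a minimal sketch that renders a `[Service]` drop-in with `OOMScoreAdjust=` and writes it into a unit's `.d` directory. It assumes the standard systemd drop-in layout (`/etc/systemd/system/<unit>.d/*.conf`). The file name `oom-score-adj.conf` and both function names are hypothetical, and this is not necessarily how image-builder implements it.

```python
from pathlib import Path


def render_oom_dropin(score: int = -999) -> str:
    """Render a systemd drop-in that sets OOMScoreAdjust for a service."""
    # systemd accepts integers between -1000 and 1000 for this setting.
    if not -1000 <= score <= 1000:
        raise ValueError(f"OOMScoreAdjust must be in [-1000, 1000], got {score}")
    return f"[Service]\nOOMScoreAdjust={score}\n"


def write_oom_dropin(
    unit: str = "containerd.service",
    name: str = "oom-score-adj.conf",  # hypothetical file name
    score: int = -999,
    systemd_dir: str = "/etc/systemd/system",
) -> Path:
    """Write the drop-in to <systemd_dir>/<unit>.d/<name> and return its path."""
    dropin_dir = Path(systemd_dir) / f"{unit}.d"
    dropin_dir.mkdir(parents=True, exist_ok=True)
    path = dropin_dir / name
    path.write_text(render_oom_dropin(score))
    return path
```

After such a file is in place, systemd needs a `systemctl daemon-reload` and a restart of the service before the new adjustment applies to the running processes.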

akutz commented 4 years ago

Hi all,

Please see https://github.com/kubernetes-sigs/cluster-api/issues/1714#issuecomment-567942842 for the approach I suggested to @figo. I'd like to keep the two configurations defined in distinct drop-ins.

figo commented 4 years ago

Issue should be addressed with https://github.com/kubernetes-sigs/image-builder/pull/113, thanks

/close

k8s-ci-robot commented 4 years ago

@figo: Closing this issue.
