bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

Allow customizing the parameters in the kube_reserve_memory() function #4187

Open · diranged opened 2 months ago

diranged commented 2 months ago

What I'd like:

We have a tendency to over-subscribe our nodes by allowing pods to set higher `limits.memory` than their `requests.memory`. We expect/hope that processes inside containers will be OOMKilled before critical system processes (like kubelet), but our experience shows that kubelet often gets OOMKilled first.

I know that we can hard-code a new memory reservation for the kubelet process, but what I really want is to be able to alter the parameters that go into the kube_reserve_memory() function: https://github.com/bottlerocket-os/bottlerocket/blob/a63007cfbd44beea85596fe7f8eac98643cd3d7d/sources/api/schnauzer/src/helpers.rs#L1079-L1081

I would like parameters that let us change the 11 and 255 numbers while retaining the general dynamic calculation. That would allow us to tune the reservation to fit our environment without losing its dynamic, per-pod nature.
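
To make that concrete, here's a rough sketch of the kind of parameterization I have in mind. The function and parameter names below are placeholders I made up, not the actual Bottlerocket code; the only numbers taken from the current behavior are the 11 MiB/pod and 255 MiB constants.

```rust
/// Sketch only: a parameterized version of the reservation calculation.
/// `mib_per_pod` and `base_mib` are hypothetical setting names.
fn kube_reserve_memory_mib(max_num_pods: u64, mib_per_pod: u64, base_mib: u64) -> u64 {
    // Today the helper effectively computes: max_num_pods * 11 + 255 (MiB).
    // Exposing the two constants keeps the dynamic, per-pod scaling intact
    // while letting operators tune the slope and the fixed base reservation.
    max_num_pods * mib_per_pod + base_mib
}

fn main() {
    // With the current defaults (11 MiB per pod, 255 MiB base):
    assert_eq!(kube_reserve_memory_mib(110, 11, 255), 1465);
    // A hypothetical, more conservative tuning for an over-subscribed node:
    assert_eq!(kube_reserve_memory_mib(110, 20, 512), 2712);
}
```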

Any alternatives you've considered:

I have considered doing this in a bootstrap container, but I'd rather not build all of that when a few parameters added here would make this easy.

stevehipwell commented 1 month ago

@diranged this looks to be a duplicate of https://github.com/bottlerocket-os/bottlerocket/issues/1721.

diranged commented 1 month ago

It's not really the same, because that issue would let us change max_num_pods but not the fundamental calculation of how much memory is reserved per pod. I want the 11 and 255 values to be customizable.

stevehipwell commented 1 month ago

@diranged #1721 covers more than the title suggests, but I can see how this change could be related or could be independent. I think the main thing here is that API changes for kube-reserved should consider all of the requested functionality before being implemented.

Also, I'm interested to know whether you've tried setting max pods to 110 (the upstream Kubernetes default maximum) and then defining a fixed kube-reserved value to resolve your issue?
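
For reference, rough arithmetic only (not Bottlerocket code): comparing the dynamic 11 MiB/pod + 255 MiB formula described above against one hypothetical fixed kube-reserved value at a few example max-pods settings, to show where a fixed value would over- or under-reserve.

```rust
// Rough arithmetic: dynamic 11 MiB/pod + 255 MiB formula vs. a fixed value.
// The fixed value and the example pod counts are arbitrary illustrations.
fn main() {
    let fixed_reservation_mib: u64 = 1465; // hypothetical fixed kube-reserved memory
    for max_pods in [29u64, 58, 110] {
        let dynamic_mib = max_pods * 11 + 255;
        println!("max-pods = {max_pods:3}: dynamic = {dynamic_mib} MiB, fixed = {fixed_reservation_mib} MiB");
    }
}
```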