newrelic / nri-kubernetes

New Relic integration for Kubernetes
https://docs.newrelic.com/docs/integrations/kubernetes-integration/get-started/introduction-kubernetes-integration
Apache License 2.0

High Memory Usage on Node that Kube-State-Metrics is deployed to #31

Closed: mitchellmaler closed this 2 years ago

mitchellmaler commented 4 years ago

Description

We are deploying NRI using the NRI Bundle Helm chart, which deploys kube-state-metrics and NRI with nri-kubernetes. By default the chart has the memory limit set to 300Mi, which we have increased to 500Mi. Unfortunately even that isn't enough: the NRI pod running on the node where kube-state-metrics is scheduled keeps using much more memory and ends up getting OOM killed.
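
For reference, a minimal sketch of how we bump that limit, assuming it is exposed through the newrelic-infrastructure subchart's resources values; the exact key path may differ per chart version, so check helm show values newrelic/nri-bundle first:

# Illustrative only: raise the daemonset memory limit via the nri-bundle chart.
helm upgrade --install newrelic-bundle newrelic/nri-bundle \
  --namespace newrelic \
  --reuse-values \
  --set newrelic-infrastructure.resources.limits.memory=500Mi \
  --set newrelic-infrastructure.resources.requests.memory=150Mi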

You can see here the difference between two running instances, where the high-memory pod is on the node that kube-state-metrics is running on.

newrelic-bundle-newrelic-infrastructure-l2x86               4m           27Mi
newrelic-bundle-newrelic-infrastructure-l8lkv               435m         444Mi

Here you can see they are running on the same node and the NRI pod keeps getting OOM killed.

newrelic-bundle-newrelic-infrastructure-l8lkv               1/1     Running     1349       6d      10.182.2.31       stg-kw3-c1-09   <none>           <none>
newrelic-bundle-kube-state-metrics-6bdb969776-zrmwd         1/1     Running     0          6d      192.168.108.12    stg-kw3-c1-09   <none>           <none>

Expected Behavior

The integration should not use this much memory, or there should be a different way to run the nri and kube-state-metrics pods on targeted nodes, as sketched below. (This would mean an issue on the chart repo, so if it ends up just needing a chart change rather than an arch change, I can log an issue there.)
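
A hypothetical sketch of the "targeted nodes" idea: label a node and pin kube-state-metrics to it via the subchart's nodeSelector, so it is at least predictable which NRI pod ends up scraping KSM. The label name is made up and the key path should be verified against the chart's values:

# Hypothetical workaround, not something the chart documents for this purpose.
kubectl label node stg-kw3-c1-09 ksm-node=true
helm upgrade --install newrelic-bundle newrelic/nri-bundle \
  --namespace newrelic \
  --reuse-values \
  --set-string kube-state-metrics.nodeSelector.ksm-node=true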

Your Environment

Image: newrelic/infrastructure-k8s:1.26.1
Nodes: 52
K8s Version: v1.17.9

burdzwastaken commented 3 years ago

We are seeing the exact same behaviour across our fleet. In larger clusters, the node that has kube-state-metrics scheduled on it causes the nri-kubernetes integration to OOM consistently. Every other pod in the DaemonSet has no issues and does not get restarted continuously. We are not seeing this in smaller clusters of around 10-15 nodes.

k exec -it newrelic-infra-8cq2m -- /var/db/newrelic-infra/newrelic-integrations/bin/nri-kubernetes -verbose
WARN[0000] Environment variable NRIA_CACHE_PATH is not set, using default /tmp/nri-kubernetes.json 
WARN[0000] Cache file (/tmp/nri-kubernetes.json) is older than 1m0s, skipping loading from disk. 
DEBU[0000] Integration "com.newrelic.kubernetes" with version 2.0.0 started 
DEBU[0000] Found cached copy of "defaultNetworkInterface" with value 'eth0' stored at 2021-02-10 15:13:50 +0000 UTC 
DEBU[0000] Found cached copy of "kubelet-client" stored at 2021-02-10 15:13:50 +0000 UTC 
DEBU[0000] Kubelet node IP = 10.218.66.75               
DEBU[0000] Discovering KSM using static endpoint (KUBE_STATE_METRICS_URL) 
DEBU[0000] Found cached copy of "ksm-client" stored at 2021-02-10 15:13:52 +0000 UTC 
DEBU[0000] KSM Node = 10.218.66.75                      
DEBU[0000] Running job: kube-state-metrics              
DEBU[0000] Calling kube-state-metrics endpoint: http://kube-state-metrics.kube-system.svc.cluster.local:8080/metrics 
command terminated with exit code 137

Our environment:

cluster1: 56 nodes
cluster2: 71 nodes
cluster3: 346 nodes

Version: newrelic/infrastructure-k8s:2.0.0@sha256:4d7cbb16f41c1fcee85697faabdbecf31c8c78673ae24aaedd612b250680fb01
K8s version: 1.17.9

Happy to provide more information to debug this issue.

paologallinaharbur commented 2 years ago

The new architecture aims to solve this issue. Basically, "KSM scraping" will be extracted into a separate component; this way we can size its memory and CPU requests as needed without increasing them for all instances.

Later on, if still needed, we can focus on optimizing it further.
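
As a rough sketch, assuming the component split lands in the nri-bundle chart as separate ksm/kubelet/controlPlane sections (the key paths and values below are illustrative; check helm show values for the released chart), per-component sizing would look something like:

# Illustrative only: give the KSM scraper more memory than the per-node pods.
helm upgrade --install newrelic-bundle newrelic/nri-bundle \
  --namespace newrelic \
  --reuse-values \
  --set newrelic-infrastructure.ksm.resources.limits.memory=850Mi \
  --set newrelic-infrastructure.kubelet.resources.limits.memory=300Mi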

joshsleeper commented 2 years ago

is there a merged PR that can be seen where this is implemented, or has a release been cut with the change?

imo it'd be nice for issues like this to stay open until the feature/fix is actually available to folks.

paologallinaharbur commented 2 years ago

Sure, my bad, we can keep this open; I closed it thinking it was a stale issue.


Basically, all the work regarding the nri-kubernetes binary is going on in the "develop" branch. Since there are a lot of changes in the codebase, it is a bit difficult to point to the exact commit implementing this feature, considering that the whole architecture changed.

It is easier to check how it is going to be deployed. Here you can see that the component scraping KSM is now separate and aims to do only that. Control plane and kubelet scraping are now separate as well, allowing resources to be set as needed.

The code is almost ready, but you are 100% right: it is not released yet. I will keep this open until there is public documentation showing the new features and everything is public.

Moreover, nri-kubernetes will now be the main process, so in case of OOMs it will be much easier to detect and solve the issue.
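
(Not specific to the new version, just generic kubectl: a quick way to confirm the restarts are OOM kills is to check the container's last state, using the affected pod name from above.)

# Check whether the co-located pod's container was last killed for OOM.
kubectl describe pod newrelic-bundle-newrelic-infrastructure-l8lkv \
  | grep -A 5 "Last State"
# Expect "Reason: OOMKilled" and "Exit Code: 137" if memory is the cause.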

joshsleeper commented 2 years ago

that all looks and sounds fantastic, thanks a ton for everyone's hard work on the arch refactor!

paologallinaharbur commented 2 years ago

Closing this; you can check all the info in the official docs, together with a migration guide!

joshsleeper commented 2 years ago

That sounds awesome, thanks a ton for the follow up!

paologallinaharbur commented 2 years ago

We are currently in beta, but soon we are going GA :)