Open attlee-wang opened 1 year ago
I also noticed issue #3014, but after investigating, the cause here appears to be different.
I added logging around housekeeping() and found that housekeeping() is started far more often than it is stopped, even though there are only a few containers on my node.
After adding more logs to the code, I found that lastWatched is always false for the affected containers, so the stop signal is never sent to their housekeeping() goroutines. I have not yet found out why lastWatched is always false for those containers. @bobbypage @iwankgb
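To make the observation above concrete, here is a minimal sketch of how a delete path gated on lastWatched can skip the stop signal. Everything in it is hypothetical (the `watcher` type, `addWatch`, `removeWatch`, `simulate`, and the directory paths are illustrative names, not cadvisor's actual code), and it assumes that a container may be watched through more than one directory and that stop is only sent once the last one is removed. It shows one possible way lastWatched could stay false, not a confirmed diagnosis.

```go
package main

import "fmt"

// watcher is a hypothetical stand-in for the bookkeeping that tracks,
// per container, which directories are still being watched.
type watcher struct {
	dirs map[string]map[string]bool
}

func newWatcher() *watcher {
	return &watcher{dirs: map[string]map[string]bool{}}
}

// addWatch records that dir is being watched on behalf of container.
func (w *watcher) addWatch(container, dir string) {
	if w.dirs[container] == nil {
		w.dirs[container] = map[string]bool{}
	}
	w.dirs[container][dir] = true
}

// removeWatch drops dir and reports whether it was the last watched
// directory of the container.
func (w *watcher) removeWatch(container, dir string) (lastWatched bool) {
	set, ok := w.dirs[container]
	if !ok || !set[dir] {
		return false
	}
	delete(set, dir)
	if len(set) > 0 {
		return false
	}
	delete(w.dirs, container)
	return true
}

// simulate watches one container through the given directories, replays
// the delete events, and reports whether a stop would have been sent.
// Assumption: stop is sent only when removeWatch returns lastWatched == true.
func simulate(watched, deleted []string) (stopSent bool) {
	w := newWatcher()
	for _, d := range watched {
		w.addWatch("c1", d)
	}
	for _, d := range deleted {
		if w.removeWatch("c1", d) {
			stopSent = true
		}
	}
	return stopSent
}

func main() {
	watched := []string{"/cpu/c1", "/memory/c1"}
	// Every watched directory gets a delete event: stop is sent.
	fmt.Println(simulate(watched, []string{"/cpu/c1", "/memory/c1"})) // true
	// One delete event is missing: lastWatched never becomes true.
	fmt.Println(simulate(watched, []string{"/cpu/c1"})) // false
}
```

If the real code behaves like this sketch, logging the set of still-watched directories at each delete event would show which entry is never removed.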
Kubernetes v1.18.5 is using cadvisor v0.35.0.
@pacoxu is this version of Kubernetes still supported?
What happened?
In my production Kubernetes cluster, kubelet memory usage keeps growing and eventually exceeds 50G, at which point many low-priority processes are killed to reclaim memory.
I observed that the number of kubelet goroutines increases in step with the memory.
Using Go pprof, I found that many goroutines are sitting in the housekeeping logic.
The housekeeping() goroutines have only one exit point, which is receiving a message on the c.stop channel (<-c.stop). I did not find anything unusual in the kubelet log, so why do the housekeeping() goroutines keep growing?
What did you expect to happen?
The housekeeping() goroutines should exit normally. I would also like to find out why they currently cannot exit.
How can we reproduce it (as minimally and precisely as possible)?
I don't know how to reproduce it. Restarting the kubelet makes the problem disappear.
Anything else we need to know?
No response
Kubernetes version
Cloud provider
OS version
Install tools
Container runtime (CRI) and version (if applicable)
Related plugins (CNI, CSI, ...) and versions (if applicable)