kubernetes / ingress-nginx

Ingress NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0

Memory Issues with ingress-nginx Helm Chart Version 4.11.2 #11987

Open pguptajsq opened 1 month ago

pguptajsq commented 1 month ago

We are experiencing significant memory issues after upgrading to the ingress-nginx Helm chart version 4.11.2. Memory usage has increased substantially, leading to performance degradation and instability in our applications. We see the following event:

Process nginx (pid: 3860819) triggered an OOM kill on process nginx (pid: 3128340, oom_score: 2086003, oom_score_adj: 936). The process had reached 2097152 pages in size.

This OOM kill was invoked by a cgroup, containerID: 910ca10fd008793d586b37a3704bcf1dee6656d3c151fb77ce353fbc76647d68.
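As a rough sanity check on the event above, the page count can be converted to bytes. This is only a sketch: the 4 KiB page size is an assumption (the event does not state it; `os.sysconf("SC_PAGE_SIZE")` on the node would confirm it), and the function name is illustrative.

```python
# Convert the page count from the OOM event into GiB.
# ASSUMPTION: 4 KiB pages; the event itself does not state the page size.
PAGE_SIZE_BYTES = 4096


def pages_to_gib(pages: int, page_size: int = PAGE_SIZE_BYTES) -> float:
    """Return the size of `pages` memory pages in GiB."""
    return pages * page_size / 2**30


reported_pages = 2097152  # "The process had reached 2097152 pages in size"
print(f"{pages_to_gib(reported_pages):.1f} GiB")  # 8.0 GiB under the 4 KiB assumption
```

Under that assumption the figure works out to 8 GiB, which is more than the 6GB setting mentioned in this report. The event alone does not say whether the count refers to virtual or resident memory, so it should not be read as the container's actual usage without further data.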

Before the upgrade we had set memory to 4GB. After the upgrade we increased it to 6GB, but OOM kills still stopped all the nginx pods.

k8s-ci-robot commented 1 month ago

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

longwuyuan commented 1 month ago

/remove-kind
/kind support

The best course of action would be to trace the process and gather such details. Since this cannot be reproduced at will on minikube, there is no action that others can take.

What have you debugged so far? Please look at the release notes and changelog for v1.11.2 and see which of the features you use, if any, are related. There could be changes that are causing retries or zombies, so you need to trace the process in the container and on the host.
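One way to check for the zombies mentioned above is to read process states from `/proc`. The following is a minimal sketch, assuming a Linux node or debug container where Python is available (the controller image may not ship it); `parse_stat` and `count_states` are hypothetical helper names, not part of ingress-nginx.

```python
import os
from collections import Counter


def parse_stat(stat_line: str) -> tuple:
    """Return (command name, state) from the text of /proc/<pid>/stat.

    The command name sits in parentheses and may itself contain spaces or
    parentheses, so split on the LAST ')' rather than on whitespace.
    """
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return comm, state


def count_states(name: str = "nginx", proc_root: str = "/proc") -> Counter:
    """Count processes called `name` by state letter ('Z' means zombie)."""
    counts = Counter()
    if not os.path.isdir(proc_root):
        return counts  # not a Linux-style /proc; nothing to inspect
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited mid-scan or the line was malformed
        if comm == name:
            counts[state] += 1
    return counts


if __name__ == "__main__":
    print(dict(count_states()))
```

A non-zero count under `Z` would point at workers that exited without being reaped. Run on the host, the scan sees every process named `nginx` on the node, not only those of one pod.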

k8s-ci-robot commented 1 month ago

@longwuyuan: Those labels are not set on the issue: kind/support

In response to [this](https://github.com/kubernetes/ingress-nginx/issues/11987#issuecomment-2358157368):

> /remove-kind
> /kind support
>
> It would be best course of action to trace the process and such details. Since this can not be reproduced at will on minikube, there is no action that others can take.
>
> What have you debugged so far. Please look at the release notes and change log for v1.11.2 and see which of your used features, if any, are related. There could be changes that are causing retries or zombies so you need to trace the process on the container and the host.

pguptajsq commented 1 month ago

/kind support

We only have the Kubernetes events:

Process nginx (pid: 3860819) triggered an OOM kill on process nginx (pid: 3128340, oom_score: 2086003, oom_score_adj: 936). The process had reached 2097152 pages in size.

This OOM kill was invoked by a cgroup, containerID: 910ca10fd008793d586b37a3704bcf1dee6656d3c151fb77ce353fbc76647d68.

I0917 22:49:06.256254 7 sigterm.go:47] "Exiting" code=0
I0917 22:48:40.115050 7 sigterm.go:36] "Received SIGTERM, shutting down"
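If many such events need to be compared, the fields in the OOM event quoted above can be pulled out with a regular expression. This is a sketch written against the single event text in this issue; the pattern and the `parse_oom_event` name are assumptions, and other node agents or versions may word the event differently.

```python
import re
from typing import Optional

# ASSUMPTION: the wording matches the one event quoted in this issue.
OOM_EVENT = re.compile(
    r"Process (?P<trigger_name>\S+) \(pid: (?P<trigger_pid>\d+)\) "
    r"triggered an OOM kill on process (?P<victim_name>\S+) "
    r"\(pid: (?P<victim_pid>\d+), oom_score: (?P<oom_score>\d+), "
    r"oom_score_adj: (?P<oom_score_adj>-?\d+)\)"
)


def parse_oom_event(message: str) -> Optional[dict]:
    """Return the event's fields as a dict, or None if the text does not match."""
    match = OOM_EVENT.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # Everything except the two process names is numeric.
    for key, value in fields.items():
        if not key.endswith("_name"):
            fields[key] = int(value)
    return fields


event = (
    "Process nginx (pid: 3860819) triggered an OOM kill on process nginx "
    "(pid: 3128340, oom_score: 2086003, oom_score_adj: 936). "
    "The process had reached 2097152 pages in size."
)
print(parse_oom_event(event))
```

Collecting the victim PIDs and scores over time may show whether the same worker generation is killed repeatedly, though that interpretation would need more events than the one available here.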

longwuyuan commented 1 month ago

We can acknowledge that this is all you have and that you are seeking support.

But you also need to acknowledge that others need some data before they can take any action. Since this cannot be reproduced on a kind cluster or a minikube cluster, you are more or less left with tracing the process yourself, now or later. Your trace should look for signs and suspects of memory consumption, that is, in the container's OS processes and threads. Some people who faced the same issue used strace/ptrace-type tools. You can search the issues for strace, OOM, etc.
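Before reaching for strace, a lighter first step toward the data requested above is to sample resident memory per nginx process from `/proc/<pid>/status`. The sketch below assumes a Linux node or debug container with Python available; the helper names are hypothetical and this is not a tool shipped with ingress-nginx.

```python
import os


def parse_status(status_text: str) -> dict:
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def rss_kib(fields: dict) -> int:
    """Return VmRSS in KiB; kernel threads have no VmRSS, so default to 0."""
    return int(fields.get("VmRSS", "0 kB").split()[0])


def sample_rss(name: str = "nginx", proc_root: str = "/proc") -> dict:
    """Map pid -> resident set size (KiB) for every process called `name`."""
    result = {}
    if not os.path.isdir(proc_root):
        return result  # not a Linux-style /proc; nothing to sample
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                fields = parse_status(fh.read())
        except OSError:
            continue  # process exited between listdir() and open()
        if fields.get("Name") == name:
            result[int(entry)] = rss_kib(fields)
    return result


if __name__ == "__main__":
    # Largest consumers first.
    for pid, kib in sorted(sample_rss().items(), key=lambda kv: -kv[1]):
        print(f"pid {pid}: {kib / 1024:.1f} MiB resident")
```

Running this at intervals and recording the output would show whether one worker grows steadily or whether the number of workers keeps climbing, which narrows down where a deeper trace should look.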

longwuyuan commented 1 month ago

/remove-kind bug

github-actions[bot] commented 3 weeks ago

This is stale, but we won't close it automatically. Just bear in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any questions or requests to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.