/remove-kind bug
Hi, let's wait until we get some helpful information that hints at a bug. Also, please provide the information asked for in the issue template.
We have been making performance changes, and very soon we will release a build with changed controller components. In the meantime, if you test the current latest release and update the issue as per the template, it will help us get a better perspective.
/triage needs-information
Hi, I have the same issue:
nginx -s reload
temporarily solves the issue.
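For anyone else hitting this, the reload has to be run inside the controller pod itself; something along these lines (the pod name below is just an example from my cluster):
# reload nginx inside the running controller pod (pod name is an example)
kubectl exec -n ingress-nginx ingress-nginx-controller-788c5f7f88-d94pj -- nginx -s reload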
Here is my info:
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.):
NGINX Ingress controller Release: v0.47.0 Build: 7201e37633485d1f14dbe9cd7b22dd380df00a07 Repository: https://github.com/kubernetes/ingress-nginx nginx version: nginx/1.20.1
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/amd64"} Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.9-gke.1001", GitCommit:"1fe18c314ed577f6047d2712a9d1c8e498e22381", GitTreeState:"clean", BuildDate:"2021-08-23T23:06:28Z", GoVersion:"go1.15.13b5", Compiler:"gc", Platform:"linux/amd64"}
Environment:
Kernel (uname -a):
Linux ingress-nginx-controller-788c5f7f88-d94pj 5.4.120+ #1 SMP Tue Jun 22 14:53:20 PDT 2021 x86_64 Linux
Helm (helm -n ingress-nginx get values ingress-nginx):
USER-SUPPLIED VALUES:
controller:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - nginx-ingress
          topologyKey: kubernetes.io/hostname
        weight: 100
  config:
    use-gzip: true
  metrics:
    enabled: true
    serviceMonitor:
      additionalLabels:
        release: kube-prometheus-stack
      enabled: true
      namespace: monitoring
  replicaCount: 2
  resources:
    requests:
      memory: 800Mi
  service:
    externalTrafficPolicy: Local
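For context, values like the above are normally applied with something along these lines (a minimal sketch; the release name and values file name are assumptions):
# add the upstream chart repo and apply the user-supplied values
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  -f values.yaml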
kubectl describe po -n ingress-nginx ingress-nginx-controller-788c5f7f88-d94pj
Name: ingress-nginx-controller-788c5f7f88-d94pj
Namespace: ingress-nginx
Priority: 0
Node: gke-production-pool-1-66bb3111-sldn/10.132.0.4
Start Time: Sat, 18 Sep 2021 17:17:13 +0200
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/name=ingress-nginx
pod-template-hash=788c5f7f88
Annotations: kubectl.kubernetes.io/restartedAt: 2021-09-18T17:17:13+02:00
Status: Running
IP: 10.52.3.39
IPs:
IP: 10.52.3.39
Controlled By: ReplicaSet/ingress-nginx-controller-788c5f7f88
Containers:
controller:
Container ID: containerd://74fb58bce33d84fb54fb61a3a16772d6edf8858cc14a05c21d0feb79a90e8157
Image: k8s.gcr.io/ingress-nginx/controller:v0.47.0@sha256:a1e4efc107be0bb78f32eaec37bef17d7a0c81bec8066cdf2572508d21351d0b
Image ID: k8s.gcr.io/ingress-nginx/controller@sha256:a1e4efc107be0bb78f32eaec37bef17d7a0c81bec8066cdf2572508d21351d0b
Ports: 80/TCP, 443/TCP, 10254/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
State: Running
Started: Sat, 18 Sep 2021 17:17:14 +0200
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 800Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-788c5f7f88-d94pj (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from ingress-nginx-token-cn2nx (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
ingress-nginx-token-cn2nx:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-token-cn2nx
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
kubectl describe svc -n ingress-nginx ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: ingress-nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/version=0.47.0
helm.sh/chart=ingress-nginx-3.34.0
Annotations: cloud.google.com/neg: {"ingress":true}
meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Families: <none>
IP: 10.56.2.89
IPs: 10.56.2.89
LoadBalancer Ingress: xxx.xxx.xxx.xxx
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31463/TCP
Endpoints: 10.52.3.39:80,10.52.4.31:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 30186/TCP
Endpoints: 10.52.3.39:443,10.52.4.31:443
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 30802
Events: <none>
/priority critical-urgent
I will look into other possible "leaks" that may be happening.
I have received a suggestion to test building the image with BoringSSL instead of OpenSSL (for FIPS compliance, etc.); maybe we can try that as well.
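While we dig into this, a quick way to confirm whether a particular controller pod keeps growing is something like this (a sketch, assuming metrics-server is installed in the cluster):
# per-container memory usage of the controller pods
kubectl top pod -n ingress-nginx --containers
# or sample it over time
watch -n 60 kubectl top pod -n ingress-nginx --containers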
I have the same memory leak issue with the latest version:
bash-5.1$ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.0.2
Build: 2b8ed4511af75a7c41e52726b0644d600fc7961b
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.19.9
-------------------------------------------------------------------------------
Folks,
in case I generate an image of 0.49.3 (to be released) with the OpenResty OpenSSL patch applied, would you be able to test it and provide some feedback?
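If you want to try such a test build, overriding the controller image via the chart would look roughly like this (registry, image, and tag below are placeholders until the test image is published; the digest must be cleared so the override takes effect):
helm upgrade ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --reuse-values \
  --set controller.image.registry=<test-registry> \
  --set controller.image.image=<test-image> \
  --set controller.image.tag=<test-tag> \
  --set controller.image.digest=""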
/kind bug /triage accepted
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle rotten
- Close this issue or PR with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close
@k8s-triage-robot: Closing this issue.
+1 still happening
/reopen /lifecycle frozen
@strongjz: Reopened this issue.
This issue is labeled with priority/critical-urgent but has not been updated in over 30 days, and should be re-triaged.
Critical-urgent issues must be actively worked on as someone's top priority right now.
You can:
- Confirm that this issue is still relevant with /triage accepted (org members only)
- Deprioritize it with /priority {important-soon, important-longterm, backlog}
- Close this issue with /close
For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
/close
@rikatz: Closing this issue.
NGINX Ingress controller version (exec into the pod and run nginx-ingress-controller --version.): 0.44.0 & 0.49.0
Kubernetes version (use kubectl version): 1.18.8
Environment:
Kernel (uname -a): 4.19.91-23.al7.x86_64
What happened:
We've encountered a memory issue in both 0.44.0 and 0.49.0. Some of the ingress pods show high memory usage, while others stay at a normal level.
We ran some diagnostics on the pod, and they show that one of the nginx workers has accumulated a large amount of memory.
The incoming traffic is balanced, about 100 requests per second, and the connection counts across pods are of the same order of magnitude (from 10k+ to 100k+).
We then used pmap -x <pid> to get details of the memory; there were lots of tiny anon blocks in the memory map. We made a coredump and took a look at this memory area, and most of its content seems to be related to TLS certs. We also ran memleak on the process, with this result:
Here are more samples: m.log
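For reference, the rough sequence of commands was something like this (a sketch, run on the node that hosts the leaking pod; the PID is a placeholder, and memleak is the bcc-tools script whose path varies by distro):
# on the node, find the host PID of the leaking nginx worker
ps aux | grep "nginx: worker"
# dump its memory map to see the many small anon mappings
pmap -x <worker-pid>
# sample outstanding allocations with bcc's memleak, e.g. every 30s
/usr/share/bcc/tools/memleak -p <worker-pid> 30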
Finally, we moved the cert to the load balancer provided by the cloud, and it's working fine now, but we still have no clue why this happens.
The leak happens in nginx, on connections with TLS. We tried rebuilding the image to upgrade the libraries to the newest versions (for openssl, 1.1.1l-r0), but that didn't help.
What you expected to happen:
no memory leak with TLS
How to reproduce it:
I have no idea what makes the issue happen, and I can't reproduce it on another cluster.
Anything else we need to know:
So far, we haven't hit this issue with 0.30.0 (openssl 1.1.1d-r3); I don't know whether it's a problem in a newer openssl.
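If it helps to compare builds, the OpenSSL version an image was compiled against shows up in the nginx build info, e.g. (the pod name is a placeholder):
# prints the "built with OpenSSL ..." line from nginx -V
kubectl exec -n ingress-nginx <controller-pod> -- nginx -V 2>&1 | grep -i openssl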
/kind bug