kubernetes / ingress-nginx

Ingress-NGINX Controller for Kubernetes
https://kubernetes.github.io/ingress-nginx/
Apache License 2.0
16.98k stars 8.14k forks source link

Random backend reloads - "unable to decode an event from the watch stream: http2: client connection lost" #10022

Open florinciubotariu opened 1 year ago

florinciubotariu commented 1 year ago

What happened: Out of the sudden, NGINX randomly reloads, causing problems on one app which has permanent WebSocket connections with multiple clients. This is happening rarely (1 - 2 times / week).

What you expected to happen: NGINX should not reload if no changes were made.

NGINX Ingress controller version:

NGINX Ingress controller Release: v1.7.0 Build: 72ff21ed9e26cb969052c753633049ba8a87ecf9 Repository: https://github.com/kubernetes/ingress-nginx nginx version: nginx/1.21.6

Environment:

Logs from controller 1:

W0528 20:46:52.489468       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:52.489566       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:52.489568       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Ingress ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:52.489597       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:52.489642       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:52.489678       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.IngressClass ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
E0528 20:46:52.489673       8 leaderelection.go:330] error retrieving resource lock ingress/ingress-nginx-leader: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/ingress/leases/ingress-nginx-leader": http2: client connection lost
W0528 20:46:53.332639       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:53.332764       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:53.332879       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.667130       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.667284       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.667371       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.667450       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.667738       8 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 21
W0528 20:46:56.667781       8 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 59860
I0528 20:46:56.667988       8 controller.go:189] "Configuration changes detected, backend reload required"
I0528 20:46:56.877235       8 controller.go:206] "Backend successfully reloaded"
I0528 20:46:56.878791       8 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"ingress-nginx-controller-7f9857f796-5xzs7", UID:"2157f534-de7d-4695-8c2a-71b32a6ae509", APIVersion:"v1", ResourceVersion:"18839937", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0528 20:46:59.999852       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:47:00.000297       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:47:00.000885       8 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 21
W0528 20:47:00.000961       8 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 59860
I0528 20:47:03.334027       8 controller.go:189] "Configuration changes detected, backend reload required"
I0528 20:47:03.655772       8 controller.go:206] "Backend successfully reloaded"
I0528 20:47:03.660141       8 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"ingress-nginx-controller-7f9857f796-5xzs7", UID:"2157f534-de7d-4695-8c2a-71b32a6ae509", APIVersion:"v1", ResourceVersion:"18839937", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration

Logs from controller 2:

W0528 20:46:49.182708       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Ingress ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:49.182838       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:49.182916       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:49.183001       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:49.183043       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:49.183071       8 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.IngressClass ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
E0528 20:46:49.182720       8 leaderelection.go:330] error retrieving resource lock ingress/ingress-nginx-leader: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/ingress/leases/ingress-nginx-leader": http2: client connection lost
W0528 20:46:50.150400       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:53.483944       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:53.484198       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:53.484384       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.817599       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.817858       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.817928       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.817981       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.826565       8 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 21
W0528 20:46:56.826611       8 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 59860
I0528 20:46:56.826839       8 controller.go:189] "Configuration changes detected, backend reload required"
I0528 20:46:56.962074       8 controller.go:206] "Backend successfully reloaded"
I0528 20:46:56.962863       8 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"ingress-nginx-controller-7f9857f796-jb9g8", UID:"c073c0fa-16b6-4698-88bf-5c8bc3abfa8e", APIVersion:"v1", ResourceVersion:"18839949", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0528 20:47:00.151088       8 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
I0528 20:47:00.151530       8 controller.go:189] "Configuration changes detected, backend reload required"
I0528 20:47:00.287373       8 controller.go:206] "Backend successfully reloaded"
I0528 20:47:00.288077       8 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"ingress-nginx-controller-7f9857f796-jb9g8", UID:"c073c0fa-16b6-4698-88bf-5c8bc3abfa8e", APIVersion:"v1", ResourceVersion:"18839949", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration

Logs from controller 3:

W0528 20:47:01.367735       7 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:47:01.367853       7 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.IngressClass ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:47:01.367891       7 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
E0528 20:47:01.367889       7 leaderelection.go:330] error retrieving resource lock ingress/ingress-nginx-leader: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/ingress/leases/ingress-nginx-leader": http2: client connection lost
W0528 20:47:01.367928       7 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Ingress ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:47:01.367941       7 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:47:01.367963       7 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding

Logs from controller 4:

E0528 20:46:28.236799       9 leaderelection.go:330] error retrieving resource lock ingress/ingress-nginx-leader: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/ingress/leases/ingress-nginx-leader": context deadline exceeded
I0528 20:46:28.236866       9 leaderelection.go:283] failed to renew lease ingress/ingress-nginx-leader: timed out waiting for the condition
I0528 20:46:28.237646       9 leaderelection.go:248] attempting to acquire leader lease ingress/ingress-nginx-leader...
W0528 20:46:51.677916       9 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Service ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:51.678031       9 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.IngressClass ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:51.678110       9 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Secret ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
E0528 20:46:51.678171       9 leaderelection.go:330] error retrieving resource lock ingress/ingress-nginx-leader: Get "https://10.0.0.1:443/apis/coordination.k8s.io/v1/namespaces/ingress/leases/ingress-nginx-leader": http2: client connection lost
W0528 20:46:51.678210       9 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.ConfigMap ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:51.678210       9 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.EndpointSlice ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:51.678258       9 reflector.go:347] k8s.io/client-go@v0.26.3/tools/cache/reflector.go:169: watch of *v1.Ingress ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding
W0528 20:46:52.808478       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:52.808649       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.142520       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.142619       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.142768       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.142917       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:56.143177       9 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 21
W0528 20:46:56.143208       9 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 59860
I0528 20:46:56.143364       9 controller.go:189] "Configuration changes detected, backend reload required"
I0528 20:46:56.271806       9 controller.go:206] "Backend successfully reloaded"
I0528 20:46:56.272502       9 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"ingress-nginx-controller-7f9857f796-mtfgn", UID:"a6cf13c7-2253-46ae-98d7-683f150474b6", APIVersion:"v1", ResourceVersion:"18840148", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
W0528 20:46:59.475975       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:59.476056       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:59.476111       9 controller.go:1151] Service "default/hidden" does not have any active Endpoint.
W0528 20:46:59.476284       9 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 21
W0528 20:46:59.476314       9 controller.go:485] Service "default/hidden" does not have any active Endpoint for TCP port 59860
I0528 20:47:02.809099       9 controller.go:189] "Configuration changes detected, backend reload required"
I0528 20:47:02.945153       9 controller.go:206] "Backend successfully reloaded"
I0528 20:47:02.945909       9 event.go:285] Event(v1.ObjectReference{Kind:"Pod", Namespace:"ingress", Name:"ingress-nginx-controller-7f9857f796-mtfgn", UID:"a6cf13c7-2253-46ae-98d7-683f150474b6", APIVersion:"v1", ResourceVersion:"18840148", FieldPath:""}): type: 'Normal' reason: 'RELOAD' NGINX reload triggered due to a change in configuration
I0528 20:47:04.064124       9 leaderelection.go:258] successfully acquired lease ingress/ingress-nginx-leader

Thanks!

k8s-ci-robot commented 1 year ago

This issue is currently awaiting triage.

If Ingress contributors determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.
longwuyuan commented 1 year ago

/remove-kind bug

Please edit the description and kindly answer all the questions that are asked in a new issue template.

There is a error message about missing endpoint but since your issue description dos not contain answers to all the questions asked in a new issue template, there is no data to analyse that message or the random reload that you are reporting.

Ideally writing a step-by-step procedure to reproduce the problem on a kind cluster or a minikube cluster helps a lot in debugging. If not, it likely can also mean there is some cause specific to your environment and your cluster that is resulting in the issue reported.

longwuyuan commented 1 year ago

/triage needs-information

github-actions[bot] commented 12 months ago

This is stale, but we won't close it automatically, just bare in mind the maintainers may be busy with other tasks and will reach your issue ASAP. If you have any question or request to prioritize this, please reach #ingress-nginx-dev on Kubernetes Slack.

granowski-bamboo commented 1 week ago

This issue seems to still be valid. At the company I work for, we're experiencing an issue when load testing a Java API, nginx seems to just cut out just like described in opening comment. Same setup, using Kubernetes, Ingress, Nginx...