kubernetes-retired / contrib

[EOL] This is a place for various components in the Kubernetes ecosystem that aren't part of the Kubernetes core.
Apache License 2.0

ingress controller reloading backend regularly (between 30s-3m) #2923

Closed · ironslob closed this issue 5 years ago

ironslob commented 6 years ago

I've run the ingress controller with --v=2 to see what's happening, and I get the following output, which always follows the same pattern:

104.156.229.24 - [104.156.229.24] - - [25/Jun/2018:15:26:49 +0000] "GET /pro/ HTTP/1.1" 200 4180 "https://www.twigdoo.com/" "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/98 Safari/537.4 (StatusCake)" 522 1.683 [apps-twigdoo-web] 172.20.84.227:5000 16631 1.684 200 f5981fc9e19566a250f13d8fa7cbce60
104.238.159.87 - [104.238.159.87] - - [25/Jun/2018:15:26:54 +0000] "HEAD / HTTP/1.1" 307 0 "-" "updown.io daemon 2.2" 216 0.003 [apps-twigdoo-web] 172.20.127.53:5000 0 0.004 307 00d6edaaba8425444128740683ab5e51
104.238.159.87 - [104.238.159.87] - - [25/Jun/2018:15:26:54 +0000] "HEAD /pro/ HTTP/1.1" 200 0 "-" "updown.io daemon 2.2" 220 0.024 [apps-twigdoo-web] 172.20.84.227:5000 0 0.024 200 fe2bfb80727df5e8c8133d9b1842df81
138.68.24.60 - [138.68.24.60] - - [25/Jun/2018:15:26:56 +0000] "GET /services/weddings/ HTTP/1.1" 200 3919 "-" "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/98 Safari/537.4 (StatusCake)" 499 0.020 [apps-twigdoo-web] 172.20.127.53:5000 19261 0.020 200 b8a8335581c3f4938587857c06f9cc01
I0625 15:27:01.584519       6 controller.go:169] Configuration changes detected, backend reload required.
I0625 15:27:01.584543       6 util.go:68] rlimit.max=1048576
I0625 15:27:01.584568       6 nginx.go:522] Maximum number of open file descriptors: 523264
I0625 15:27:01.656091       6 nginx.go:629] NGINX configuration diff:
--- /etc/nginx/nginx.conf       2018-06-25 15:26:12.608700061 +0000
+++ /tmp/new-nginx-cfg220593655 2018-06-25 15:27:01.652822169 +0000
@@ -213,6 +213,7 @@

                server 172.20.84.227:5000 max_fails=0 fail_timeout=0;
                server 172.20.127.53:5000 max_fails=0 fail_timeout=0;
+               server 172.20.116.105:5000 max_fails=0 fail_timeout=0;

        }

I0625 15:27:01.699733       6 controller.go:179] Backend successfully reloaded.
84.201.133.36 - [84.201.133.36] - - [25/Jun/2018:15:27:04 +0000] "GET /sitemap/england/south-west/cornwall/brunnion/ HTTP/1.1" 200 2936 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)" 345 0.103 [apps-twigdoo-web] 172.20.84.227:5000 9041 0.104 200 8156bb867d872abc6ac37897a3ce3b85
I0625 15:27:04.917910       6 controller.go:169] Configuration changes detected, backend reload required.
I0625 15:27:04.917931       6 util.go:68] rlimit.max=1048576
I0625 15:27:04.917937       6 nginx.go:522] Maximum number of open file descriptors: 523264
I0625 15:27:04.965493       6 nginx.go:629] NGINX configuration diff:
--- /etc/nginx/nginx.conf       2018-06-25 15:27:01.652822169 +0000
+++ /tmp/new-nginx-cfg538364737 2018-06-25 15:27:04.960830406 +0000
@@ -211,9 +211,8 @@

                keepalive 32;

-               server 172.20.84.227:5000 max_fails=0 fail_timeout=0;
-               server 172.20.127.53:5000 max_fails=0 fail_timeout=0;
                server 172.20.116.105:5000 max_fails=0 fail_timeout=0;
+               server 172.20.84.227:5000 max_fails=0 fail_timeout=0;

        }

I0625 15:27:05.007378       6 controller.go:179] Backend successfully reloaded.

Any help on resolving this would be great, as I'm seeing regular 502 responses.
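
From the diffs above it looks like the backend list itself keeps changing: one reload adds 172.20.116.105, the next removes 172.20.127.53 and reorders the remaining servers. So the pods behind the apps-twigdoo-web upstream may be getting recreated, rescheduled, or flapping on their readiness probes, and every endpoint change forces an NGINX reload, which would line up with the 502s.

For reference, here is roughly how the verbosity flag is set on my controller Deployment (the image tag and names are from my cluster and may differ):

    spec:
      containers:
      - name: nginx-ingress-controller
        image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.15.0
        args:
        - /nginx-ingress-controller
        - --default-backend-service=$(POD_NAMESPACE)/default-http-backend
        - --v=2   # log the config diff and the reason for each reload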

ironslob commented 6 years ago

I found the upstream annotations and thought I'd try these:

+    nginx.ingress.kubernetes.io/upstream-fail-timeout: "30"
+    nginx.ingress.kubernetes.io/upstream-max-fails: "5"

but no joy; the same thing is happening:

54.172.193.230 - [54.172.193.230] - - [25/Jun/2018:15:44:09 +0000] "GET /sitemap/wales/dyfed/ceredigion-sir-ceredigion/llywernog/ HTTP/1.1" 200 2949 "-" "MauiBot (crawler.feedback+wc@gmail.com)" 324 1.872 [apps-twigdoo-web] 172.20.127.53:5000 9112 1.868 200 45f49e1b9bef9e5540bcdaf1df577264
I0625 15:44:22.533945       6 controller.go:169] Configuration changes detected, backend reload required.
I0625 15:44:22.548688       6 util.go:68] rlimit.max=1048576
I0625 15:44:22.548768       6 nginx.go:522] Maximum number of open file descriptors: 523264
I0625 15:44:22.688089       6 nginx.go:629] NGINX configuration diff:
--- /etc/nginx/nginx.conf       2018-06-25 15:43:35.579306845 +0000
+++ /tmp/new-nginx-cfg201428551 2018-06-25 15:44:22.683427020 +0000
@@ -211,9 +211,8 @@

                keepalive 32;

-               server 172.20.84.227:5000 max_fails=5 fail_timeout=30;
-               server 172.20.116.105:5000 max_fails=5 fail_timeout=30;
                server 172.20.127.53:5000 max_fails=5 fail_timeout=30;
+               server 172.20.84.227:5000 max_fails=5 fail_timeout=30;

        }

I0625 15:44:22.742475       6 controller.go:179] Backend successfully reloaded.
I0625 15:44:31.582142       6 controller.go:169] Configuration changes detected, backend reload required.
I0625 15:44:31.582167       6 util.go:68] rlimit.max=1048576
I0625 15:44:31.582173       6 nginx.go:522] Maximum number of open file descriptors: 523264
I0625 15:44:31.639561       6 nginx.go:629] NGINX configuration diff:
--- /etc/nginx/nginx.conf       2018-06-25 15:44:22.683427020 +0000
+++ /tmp/new-nginx-cfg243681809 2018-06-25 15:44:31.635449683 +0000
@@ -211,6 +211,7 @@

                keepalive 32;

+               server 172.20.116.105:5000 max_fails=5 fail_timeout=30;
                server 172.20.127.53:5000 max_fails=5 fail_timeout=30;
                server 172.20.84.227:5000 max_fails=5 fail_timeout=30;

I0625 15:44:31.678958       6 controller.go:179] Backend successfully reloaded.
54.172.193.230 - [54.172.193.230] - - [25/Jun/2018:15:44:39 +0000] "GET /sitemap/wales/dyfed/ceredigion-sir-ceredigion/maen-y-groes/ HTTP/1.1" 200 2953 "-" "MauiBot (crawler.feedback+wc@gmail.com)" 327 0.005 [apps-twigdoo-web] 172.20.127.53:5000 9123 0.004 200 daf90a3fae7caca6efaffbbaf6ed0755
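
For context, those annotations sit in the Ingress metadata; a minimal sketch of what I mean (the resource names and host are illustrative, pieced together from the logs, not my exact manifest):

    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: twigdoo-web                    # illustrative name
      annotations:
        nginx.ingress.kubernetes.io/upstream-fail-timeout: "30"
        nginx.ingress.kubernetes.io/upstream-max-fails: "5"
    spec:
      rules:
      - host: www.twigdoo.com
        http:
          paths:
          - backend:
              serviceName: twigdoo-web     # illustrative; the upstream in the logs is apps-twigdoo-web
              servicePort: 5000

The diff confirms the annotations do reach the generated config (max_fails=5 fail_timeout=30 on each server line), but as far as I can tell they only control when NGINX marks an upstream server as failed; they don't stop the controller from reloading whenever the endpoint list changes, which would explain why the behaviour is unchanged.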
fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with `/remove-lifecycle stale`. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now, please do so with `/close`.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

fejta-bot commented 5 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with `/remove-lifecycle rotten`. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now, please do so with `/close`.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten

fejta-bot commented 5 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with `/reopen`. Mark the issue as fresh with `/remove-lifecycle rotten`.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/close

k8s-ci-robot commented 5 years ago

@fejta-bot: Closing this issue.

In response to [this](https://github.com/kubernetes/contrib/issues/2923#issuecomment-441092287):

> Rotten issues close after 30d of inactivity.
> Reopen the issue with `/reopen`.
> Mark the issue as fresh with `/remove-lifecycle rotten`.
>
> Send feedback to sig-testing, kubernetes/test-infra and/or [fejta](https://github.com/fejta).
> /close

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.