Closed aupadh12 closed 2 years ago
Hello,
I also noticed that whenever I try to access the shinyproxy application, the skipper ingress pod restarts with this error:
Readiness probe failed: Get "http://100.64.144.197:9999/kube-system/healthz": read tcp 100.64.237.71:37248->100.64.144.197:9999: read: connection reset by peer
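For reference, the failing check above corresponds to a readiness probe on the skipper container roughly like the following sketch. Only the path and port are taken from the error message; the timing values are assumptions and should match whatever the actual daemonset uses:

```yaml
# Sketch of the readiness probe implied by the error above.
# path and port come from the probe URL in the error; the
# timing values below are placeholders, not the real config.
readinessProbe:
  httpGet:
    path: /kube-system/healthz
    port: 9999
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
```

Note that "connection reset by peer" on a probe usually means the skipper process died mid-request (e.g. it was OOMKilled), not that the probe itself is misconfigured.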
Hi,
I also noticed that, in the output below, the pod mounts the kube-api-access projected volume instead of the skipper ingress secret.
Environment:  <none>
Mounts:
  /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-k9rd4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-k9rd4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
I am not sure whether this is actually an issue, since the setup where there is no problem does have the skipper ingress secret mounted. The two setups are almost the same, though, so I am not sure what is missing.
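For what it's worth, a kube-api-access-* projected volume is just the default service-account token mount that newer Kubernetes versions (1.21+) inject into every pod; it does not replace an explicitly declared secret mount. If the working setup mounts a dedicated secret, that would be declared in the daemonset roughly like the sketch below. All names here are hypothetical, not taken from the actual manifests:

```yaml
# Hypothetical explicit secret mount for the skipper container;
# volume name, mount path, and secret name are illustrative only.
containers:
- name: skipper-ingress
  volumeMounts:
  - name: skipper-ingress-secret     # hypothetical volume name
    mountPath: /etc/skipper/secret   # hypothetical mount path
    readOnly: true
volumes:
- name: skipper-ingress-secret
  secret:
    secretName: skipper-ingress      # hypothetical secret name
```

Comparing the Volumes section of the working and broken pods against the daemonset spec should show whether such a volume is missing from the broken one.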
Hi
This is a surprising issue. I just did some tests with k8s 1.21 and ingress 1.1.2 and I don't experience the issue you are describing. You mention the skipper pod gets killed because it goes out of memory (OOMKilled). Typically this can have two reasons:
1. Skipper simply needs more memory than the limit configured on the pod.
2. Skipper contains a bug that makes it keep allocating memory (a memory leak).
It would be good to know the reason for the current issue (i.e. 1 or 2). However, since it seems you can easily reproduce the issue, I guess it will be reason 1. In that case you should be able to see in your monitoring (e.g. Prometheus) that the process in the pod tries to allocate more memory. Can you verify this? If this is the case, I would try launching the daemonset with a bigger memory limit and see whether the issue comes back.
Finally, you could try running the latest version of Skipper. Maybe they already solved a similar issue. (https://github.com/zalando/skipper/releases)
Hello,
I upgraded my node type to m5.8xlarge and I still see this issue. Strangely, on K8s 1.20 with ingress 1.0.0 this works fine even on an m5.4xlarge node type, so I am not sure this is related to memory. I will try the latest version then.
Hi
So indeed this is not expected behavior, and I think we should try to find the cause. The node type should not matter much here; what matters is the memory limit on the pod itself. Did you try increasing that as well? Basically, I want to find out whether Skipper just needs a bit more memory and will then work fine, or whether Skipper contains a bug where it keeps allocating memory. E.g. if you set the limit to 1G, 2G, 3G and it still crashes, it's definitely a bug in Skipper (which may or may not already be fixed).
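The experiment described above would be done in the resources section of the skipper container in the daemonset, roughly like this (the values are placeholders to step through, not a recommendation):

```yaml
# Placeholder values for the experiment above: raise the limit
# stepwise (e.g. 1Gi, 2Gi, 3Gi) and watch whether the skipper
# pod still gets OOMKilled at each step.
resources:
  requests:
    memory: 512Mi
  limits:
    memory: 2Gi
```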
Hi @LEDfan ,
I was able to find and fix the problem. Skipper ingress does not yet support the new ingressClassName syntax. I have raised a feature request (https://github.com/zalando/skipper/issues/1976) for including it. For now, I am using the ingress class annotation on K8s 1.21.5 and it's working fine.
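For anyone hitting the same problem, the workaround is to fall back to the legacy class annotation instead of the spec.ingressClassName field. A rough sketch, in which the ingress name and backend are illustrative (the host is taken from the logs above) and which assumes skipper honors the kubernetes.io/ingress.class annotation as described in zalando/skipper#1976:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: shinyproxy-ingress            # illustrative name
  annotations:
    # legacy class annotation used instead of spec.ingressClassName,
    # which skipper did not yet support at the time of this thread
    kubernetes.io/ingress.class: skipper
spec:
  rules:
  - host: shinyproxy-itx-age.xyz.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: shinyproxy          # illustrative backend service
            port:
              number: 8080
```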
Hello,
so there is a workaround for using the latest ingress syntax. The same is mentioned here: https://github.com/zalando/skipper/issues/1976
Hi Team,
I have recently upgraded my nginx ingress controller to 1.1.2, which fixes the validation webhook issue. But now, when I try to open the application, skipper ingress gets terminated with reason OOMKilled, its endpoint is deleted, and nginx returns 503/504 errors. After some time skipper starts running again, but when I try to access my application the same events repeat. This was not happening with the old nginx ingress controller. Can you please let me know what is going on here?
Below are the logs of skipper ingress:
[APP]time="2022-03-05T11:03:09Z" level=info msg="Expose metrics in codahale format"
[APP]time="2022-03-05T11:03:09Z" level=info msg="register sigterm handler"
[APP]time="2022-03-05T11:03:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
[APP]time="2022-03-05T11:03:09Z" level=info msg="Run secrets path refresher every 1m0s, but update once first"
[APP]time="2022-03-05T11:03:09Z" level=info msg="debug listener on :9922"
[APP]time="2022-03-05T11:03:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
[APP]time="2022-03-05T11:03:09Z" level=info msg="support listener on :9911"
[APP]time="2022-03-05T11:03:09Z" level=warning msg="RouteGroups CRD is not installed in the cluster.\nSee: https://opensource.zalando.com/skipper/kubernetes/routegroups/#installation"
[APP]time="2022-03-05T11:03:09Z" level=info msg="route settings, reset, route: kube_argocdargocd_app_itx_age_serverargocd_itx_age_xyz_com_argocd_app_itx_age_server: Host(/^argocd-itx-age[.]xyz[.]com$/) && PathSubtree(\"/\") -> \"http://100.64.179.131:8080\""
[APP]time="2022-03-05T11:03:09Z" level=info msg="route settings, reset, route: kube_kube_systemngingx_to_skipper_ingressshinyproxy_itx_age_xyz_com_skipper_ingress: Host(/^shinyproxy-itx-age[.]xyz[.]com$/) && PathSubtree(\"/\") -> status(502) -> inlineContent(\"no endpoints\") ->"
[APP]time="2022-03-05T11:03:09Z" level=info msg="route settings, reset, route: kube_shinyproxy sp_shinyproxy_hpc_ing_b3cf234ea2fdeff89bf8a673d5766b91d355ebb6shinyproxy_itx_age_xyz_com__sp_shinyproxy_hpc_svc_b3cf234ea2fdeff89bf8a673d5766b91d355ebb6: Host(/^shinyproxy-itx-age[.]xyz[.]com$/) && True() -> setRequestHeader(\"X-ShinyProxy-Instance\", \"b3cf234ea2fdeff89bf8a673d5766b91d355ebb6\") -> setRequestHeader(\"X-ShinyProxy-Latest-Instance\", \"b3cf234ea2fdeff89bf8a673d5766b91d355ebb6\") -> appendResponseHeader(\"Set-Cookie\", \"sp-instance=b3cf234ea2fdeff89bf8a673d5766b91d355ebb6; Secure; Path=/\") -> appendResponseHeader(\"Set-Cookie\", \"sp-latest-instance=b3cf234ea2fdeff89bf8a673d5766b91d355ebb6; Secure; Path=/\") -> \"http://100.64.242.101:8080\""
[APP]time="2022-03-05T11:03:09Z" level=info msg="route settings, reset, route: kubehealthz: Path(\"/kube-system/healthz\") && SourceFromLast(\"10.0.0.0/8\", \"192.168.0.0/16\", \"172.16.0.0/12\", \"127.0.0.1/8\", \"fd00::/8\", \"::1/128\", \"100.0.0.0/8\") -> status(200) -> disableAccessLog(200) -> "
[APP]time="2022-03-05T11:03:09Z" level=info msg="route settings received"
[APP]time="2022-03-05T11:03:09Z" level=info msg="route settings applied"
[APP]time="2022-03-05T11:03:09Z" level=info msg="proxy listener on :9999"
[APP]time="2022-03-05T11:03:09Z" level=info msg="TLS settings not found, defaulting to HTTP"
[APP]time="2022-03-05T11:04:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
[APP]time="2022-03-05T11:04:13Z" level=info msg="diff taken, inserts/updates: 1, deletes: 0"
[APP]time="2022-03-05T11:04:13Z" level=info msg="route settings, update, route: kube_kube_system ngingx_to_skipper_ingressshinyproxy_itx_age_xyz_com_____skipper_ingress: Host(/^shinyproxy-itx-age[.]xyz[.]com$/) && PathSubtree(\"/\") -> \"http://100.64.146.32:9999\""
[APP]time="2022-03-05T11:04:13Z" level=info msg="route settings received"
[APP]time="2022-03-05T11:04:13Z" level=info msg="route settings applied"
[APP]time="2022-03-05T11:05:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
[APP]time="2022-03-05T11:06:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
[APP]time="2022-03-05T11:07:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
[APP]time="2022-03-05T11:08:09Z" level=info msg="Updated secret file: /var/run/secrets/kubernetes.io/serviceaccount/token"
The logs from the nginx ingress controller:
2022/03/05 11:00:34 [error] 4912#4912: 470546 upstream prematurely closed connection while reading response header from upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://100.64.146.32:9999/", host: "shinyproxy-itx-age.xyz.com"
2022/03/05 11:00:34 [warn] 4912#4912: 470546 [lua] sticky.lua:189: balance(): failed to get new upstream; using upstream nil while connecting to upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://100.64.146.32:9999/", host: "shinyproxy-itx-age.xyz.com"
2022/03/05 11:00:34 [warn] 4912#4912: 470546 [lua] balancer.lua:335: balance(): no peer was returned, balancer: sticky_balanced while connecting to upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://100.64.146.32:9999/", host: "shinyproxy-itx-age.xyz.com"
W0305 11:00:35.201227 6 controller.go:1094] Service "kube-system/skipper-ingress" does not have any active Endpoint.
2022/03/05 11:00:39 [warn] 4912#4912: 470546 upstream server temporarily disabled while connecting to upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://0.0.0.1:80/", host: "shinyproxy-itx-age.xyz.com"
2022/03/05 11:00:39 [error] 4912#4912: 470546 upstream timed out (110: Operation timed out) while connecting to upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://0.0.0.1:80/", host: "shinyproxy-itx-age.xyz.com"
100.64.171.163 - - [05/Mar/2022:11:00:39 +0000] "GET / HTTP/2.0" 504 562 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 1076 29.094 [kube-system-skipper-ingress-http] [] 100.64.146.32:9999, 0.0.0.1:80 0, 0 24.096, 5.000 502, 504 dfcedb8379d93581c0e841fa579c6a2b
100.64.171.163 - - [05/Mar/2022:11:01:32 +0000] "GET / HTTP/2.0" 503 592 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 90 0.000 [kube-system-skipper-ingress-http] [] - - - - 6204288c167241f54933c2928bbf487c
2022/03/05 11:02:54 [error] 4912#4912: 470546 upstream prematurely closed connection while reading response header from upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://100.64.146.32:9999/", host: "shinyproxy-itx-age.xyz.com"
2022/03/05 11:02:54 [error] 4912#4912: 470546 connect() failed (111: Connection refused) while connecting to upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://100.64.146.32:9999/", host: "shinyproxy-itx-age.xyz.com"
2022/03/05 11:02:54 [error] 4912#4912: 470546 connect() failed (111: Connection refused) while connecting to upstream, client: 100.64.171.163, server: shinyproxy-itx-age.xyz.com, request: "GET / HTTP/2.0", upstream: "http://100.64.146.32:9999/", host: "shinyproxy-itx-age.xyz.com"
100.64.171.163 - - [05/Mar/2022:11:02:54 +0000] "GET / HTTP/2.0" 502 552 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36" 27 23.528 [kube-system-skipper-ingress-http] [] 100.64.146.32:9999, 100.64.146.32:9999, 100.64.146.32:9999 0, 0, 0 23.528, 0.000, 0.000 502, 502, 502 76d0d4fdbea222a7d553bfbbfb9cb855
W0305 11:02:55.288677 6 controller.go:1094] Service "kube-system/skipper-ingress" does not have any active Endpoint.
Also, my skipper ingress manifests are below:
ingress.yaml
skipper-service.yaml
skipper daemonset.yaml