Hi @ramkumarg1
When shinyproxy receives a SIGTERM signal (when the deployment is scaled down), it should gracefully terminate by stopping all application pods first. You may have to increase the grace period terminationGracePeriodSeconds in the pod spec (the default is 30s). If shinyproxy is unable to terminate within this period, it receives a SIGKILL and is terminated immediately, leaving behind orphan pods. More info here: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
Thanks @dseynaev. I changed the deployment spec to include terminationGracePeriodSeconds, but it didn't make a difference: the pod was killed immediately. Perhaps this issue is linked to https://github.com/kubernetes/kubernetes/issues/47576, where Spring Boot needs to handle SIGTERM gracefully?
spec:
  terminationGracePeriodSeconds: 180
  containers:
    - name: shinyproxy
We observe the same issue with zombie pods, and for us the termination grace period setting also does not resolve this.
I have the same issue and this is what is logged by shiny/containerproxy upon termination:
2020-01-30 10:56:56.785 INFO 1 --- [ main] e.o.c.ContainerProxyApplication : Started ContainerProxyApplication in 39.115 seconds (JVM running for 43.619)
2020-01-30 10:57:01.374 INFO 1 --- [ XNIO-2 task-1] io.undertow.servlet : Initializing Spring FrameworkServlet 'dispatcherServlet'
2020-01-30 10:57:01.375 INFO 1 --- [ XNIO-2 task-1] o.s.web.servlet.DispatcherServlet : FrameworkServlet 'dispatcherServlet': initialization started
2020-01-30 10:57:01.507 INFO 1 --- [ XNIO-2 task-1] o.s.web.servlet.DispatcherServlet : FrameworkServlet 'dispatcherServlet': initialization completed in 131 ms
2020-01-30 10:57:26.275 INFO 1 --- [ XNIO-2 task-16] e.o.containerproxy.service.UserService : User logged in [user: **]
2020-01-30 10:57:35.802 INFO 1 --- [ XNIO-2 task-3] e.o.containerproxy.service.ProxyService : Proxy activated [user: ***] [spec: insight] [id: 9274ad33-665a-4d47-bab5-6c4b39a618b8]
2020-01-30 10:59:02.376 INFO 1 --- [ Thread-2] ConfigServletWebServerApplicationContext : Closing org.springframework.boot.web.servlet.context.AnnotationConfigServletWebServerApplicationContext@2b2948e2: startup date [Thu Jan 30 10:56:24 GMT 2020]; root of context hierarchy
2020-01-30 10:59:02.377 ERROR 1 --- [pool-4-thread-1] java.io.InputStreamReader : Error while pumping stream.
java.io.EOFException: null
at okio.RealBufferedSource.require(RealBufferedSource.java:61) ~[okio-1.15.0.jar!/:na]
at okio.RealBufferedSource.readHexadecimalUnsignedLong(RealBufferedSource.java:303) ~[okio-1.15.0.jar!/:na]
at okhttp3.internal.http1.Http1Codec$ChunkedSource.readChunkSize(Http1Codec.java:469) ~[okhttp-3.12.0.jar!/:na]
at okhttp3.internal.http1.Http1Codec$ChunkedSource.read(Http1Codec.java:449) ~[okhttp-3.12.0.jar!/:na]
at okio.RealBufferedSource$1.read(RealBufferedSource.java:439) ~[okio-1.15.0.jar!/:na]
at java.io.InputStream.read(InputStream.java:101) ~[na:1.8.0_171]
at io.fabric8.kubernetes.client.utils.BlockingInputStreamPumper.run(BlockingInputStreamPumper.java:49) ~[kubernetes-client-4.2.2.jar!/:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_171]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_171]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_171]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]
2020-01-30 10:59:02.394 INFO 1 --- [ Thread-2] o.s.j.e.a.AnnotationMBeanExporter : Unregistering JMX-exposed beans on shutdown
2020-01-30 10:59:02.403 INFO 1 --- [ Thread-2] o.s.j.e.a.AnnotationMBeanExporter : Unregistering JMX-exposed beans
2020-01-30 10:59:02.514 WARN 1 --- [ Thread-2] .s.c.a.CommonAnnotationBeanPostProcessor : Invocation of destroy method failed on bean with name 'proxyService': eu.openanalytics.containerproxy.ContainerProxyException: Failed to stop container
2020-01-30 10:59:02.525 INFO 1 --- [ Thread-2] io.undertow.servlet : Destroying Spring FrameworkServlet 'dispatcherServlet'
I found a solution for this issue. This is not actually a problem in shinyproxy or containerproxy, as the Spring Boot app is correctly and gracefully shut down.
The problem is the kubectl proxy sidecar container. For Kubernetes it is not clear that containerproxy relies on the sidecar container to communicate with Kubernetes itself. So, on a new deployment, Kubernetes sends SIGTERM to both the proxy and the sidecar container in all the old pods. The sidecar container terminates immediately, and containerproxy then fails to communicate with Kubernetes.
I read that Kubernetes is about to solve these startup and shutdown dependencies in v1.18, as documented here: https://github.com/kubernetes/enhancements/issues/753 and https://banzaicloud.com/blog/k8s-sidecars/
Until then, there is a simple workaround: add the following lifecycle hook to the sidecar container:
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 5"] # wait 5 seconds to let shinyproxy remove the pods on graceful shutdown
I can confirm @fmannhardt's fix resolves this. Thank you so much!
Hi all
With recent versions of ShinyProxy (I'm not sure which version exactly, but at least ShinyProxy 2.3.1) there is no need to use a kube-proxy sidecar. ShinyProxy automatically detects the location and authentication of the Kubernetes API. Therefore I think this problem is automatically solved. Nevertheless, thank you for your time and investigation!
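In that case the sidecar can simply be dropped. A minimal sketch of such a pod spec, assuming a service account with RBAC permission to create and delete pods in the app namespace (names and tags here are assumptions, not taken from this thread):

spec:
  serviceAccountName: shinyproxy        # assumed account with rights to manage app pods
  terminationGracePeriodSeconds: 180
  containers:
    - name: shinyproxy
      image: openanalytics/shinyproxy:2.3.1
      ports:
        - containerPort: 8080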
Hi, when there is a change in application.yaml and a rolling update is performed (with replicas set to 0 and then back to 1, mainly because the new shinyproxy image needs to be downloaded from the artifactory), all the pods that were spun up by the previous shinyproxy instance get left behind as zombies.
To reproduce:
NAME                              READY   STATUS    RESTARTS   AGE
pod/shinyproxy-7f76d48c79-8x9hs   2/2     Running   0          41m

NAME                 TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/shinyproxy   NodePort   172.30.85.191                 8080:32094/TCP   40m

NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/shinyproxy   1         1         1            1           41m

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/shinyproxy-7f76d48c79   1         1         1       41m

NAME                                  HOST/PORT                                PATH   SERVICES     PORT   TERMINATION   WILDCARD
route.route.openshift.io/shinyproxy   shinyproxy-aap.apps.cpaas.service.test          shinyproxy                        None
NAME                                              READY   STATUS    RESTARTS   AGE
pod/shinyproxy-7f76d48c79-8x9hs                   2/2     Running   0          43m
pod/sp-pod-e7603441-03ba-470b-925a-22cfba1716de   1/1     Running   0          12s

NAME                 TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/shinyproxy   NodePort   172.30.85.191                 8080:32094/TCP   43m

NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/shinyproxy   1         1         1            1           43m

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/shinyproxy-7f76d48c79   1         1         1       43m

NAME                                  HOST/PORT                                PATH   SERVICES     PORT   TERMINATION   WILDCARD
route.route.openshift.io/shinyproxy   shinyproxy-aap.apps.cpaas.service.test          shinyproxy                        None
kubectl scale --replicas=0 deployment/shinyproxy
deployment.extensions/shinyproxy scaled

kubectl scale --replicas=1 deployment/shinyproxy
deployment.extensions/shinyproxy scaled
NAME                                              READY   STATUS              RESTARTS   AGE
pod/shinyproxy-7f76d48c79-l5fvw                   0/2     ContainerCreating   0          4s
pod/sp-pod-e7603441-03ba-470b-925a-22cfba1716de   1/1     Running             0          1m

NAME                 TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
service/shinyproxy   NodePort   172.30.85.191                 8080:32094/TCP   44m

NAME                         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/shinyproxy   1         1         1            0           45m

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/shinyproxy-7f76d48c79   1         1         0       45m

NAME                                  HOST/PORT                                PATH   SERVICES     PORT   TERMINATION   WILDCARD
route.route.openshift.io/shinyproxy   shinyproxy-aap.apps.cpaas.service.test          shinyproxy                        None
At this stage my web application is unresponsive; the only thing to do is to close the tab/window. The pod for the R application keeps running unless it is deleted manually.
The pod, which is still consuming resources, is no longer accessible, because the new service points to the updated deployment and the application can only be accessed through a route over the service.
It is also very difficult to identify which of the pods are stale and to delete them manually.
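As a stopgap, the orphaned app pods can at least be found by the sp-pod- name prefix visible in the output above and removed by hand. A rough sketch, assuming all ShinyProxy app pods run in the same namespace and keep that prefix:

# list candidate orphan app pods (ShinyProxy names them sp-pod-<uuid>)
kubectl get pods --no-headers | awk '$1 ~ /^sp-pod-/ {print $1}'

# delete them once you have confirmed no running ShinyProxy instance is still using them
kubectl get pods --no-headers | awk '$1 ~ /^sp-pod-/ {print $1}' | xargs -r kubectl delete pod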