Closed: o-shevchenko closed this 1 month ago
Thanks, I haven't tried it yet. I'll test it with K8s and let you know the result.
Looks like it works. At least now I can see that K8s doesn't kill it for the configured period of time.
In addition to shutdownGracePeriod = -1, I configured terminationGracePeriodSeconds to 24h (just for testing).
I also tried to adjust various confs:
```yaml
grpc:
  server:
    port: 6565
    reflection-service-enabled: true
    shutdown-grace-period: -1
    enable-keep-alive: true
    keep-alive-time: 86400
    keep-alive-timeout: 86400
    permit-keep-alive-without-calls: true
    permit-keep-alive-time: 86400
```
But after 5 minutes the app gets killed anyway. I can't find the configuration responsible for that.
```
[SpringApplicationShutdownHook] [trace_id=, span_id=] n.d.b.g.s.s.GrpcServerLifecycle : Completed gRPC server shutdown
```
Looks like it's a Spring configuration. I'll experiment with it some more.
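For reference, the `shutdown-grace-period` semantics can be modeled in pure-JDK terms. This is an illustrative sketch of the documented behavior (negative means "wait indefinitely"), not grpc-spring's actual code; the latch is a stand-in for the server's termination state:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class GracePeriodSketch {
    // Stand-in for the gRPC server: the latch opens when all calls finish.
    // Count 0 simulates a server with no in-flight calls.
    static final CountDownLatch termination = new CountDownLatch(0);

    // Negative grace period: block until every call has completed.
    // Non-negative: wait at most that many seconds, then give up.
    static boolean awaitShutdown(long gracePeriodSeconds) throws InterruptedException {
        if (gracePeriodSeconds < 0) {
            termination.await();
            return true;
        }
        return termination.await(gracePeriodSeconds, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // No in-flight calls, so the wait completes immediately.
        System.out.println(awaitShutdown(-1));
    }
}
```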
You could add a log line/debug breakpoint here:
to check whether the waiting gets interrupted somehow.
Thanks, I'm already looking into that logic. It's not easy to debug everything with K8s. I'll try to add more logs at DEBUG level, or use Telepresence or something similar, to understand why the service gets killed after 5 minutes.
Depending on your setup, debugging in K8s is easy: just expose an additional port or tunnel/port-forward(?) into the container and then connect as usual.
The connection is closed from the K8s side. When I run the server without K8s and send kill -TERM to the Java process, it waits and closes all connections properly. In K8s, the connection is closed, the service shuts down, and K8s kills the container. I need to check the ingress timeouts, or maybe adjust the keep-alive configs as well.
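If an NGINX ingress controller sits in front of the service, long-lived gRPC streams are commonly cut by its proxy timeouts rather than by the app itself. For illustration (assuming ingress-nginx; the values are examples, not a recommendation):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    # Raise the per-connection read/send timeouts so idle streams
    # are not closed by the proxy (seconds).
    nginx.ingress.kubernetes.io/proxy-read-timeout: "86400"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "86400"
```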
Thanks for the update
When running a Java process inside a Docker container, sending a SIGTERM signal (kill -TERM 1) results in immediate termination rather than a graceful shutdown. This issue does not occur when running the same Java process locally and sending the same signal.
```shell
kubectl exec -it pod_id -- /bin/bash
kill -TERM 1
```
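One classic cause worth ruling out when signals behave differently inside a container: with the shell form of ENTRYPOINT, PID 1 is `/bin/sh -c`, which does not forward SIGTERM to the Java child process. The exec form (or `exec java ...` in an entrypoint script) makes Java PID 1 so it receives the signal directly. An illustrative Dockerfile fragment (`app.jar` is a placeholder):

```dockerfile
# Shell form: PID 1 is /bin/sh -c, which does not forward SIGTERM to java.
# ENTRYPOINT java -jar app.jar

# Exec form: java runs as PID 1 and receives SIGTERM itself.
ENTRYPOINT ["java", "-jar", "app.jar"]
```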
Locally it works fine, but when the server is inside a Docker container graceful shutdown doesn't work, and I can't understand why localServer.awaitTermination();
immediately kills the server. I don't see any InterruptedException
when I connect a debugger via port 5005.
I'm running out of ideas. Do you have any ideas on further investigation or narrowing down the scope? Thanks!
Sorry, unfortunately not.
I think localServer.awaitTermination();
doesn't work as I expect :( . I'm working on implementing a custom shutdown hook that checks the number of active streams before terminating.
This article describes a similar problem: https://fedor.medium.com/shutting-down-grpc-services-gracefully-961a95b08f8
Just FYI. Thanks for the help
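A minimal pure-JDK sketch of such a draining hook. The stream counter is an assumption (in a real server it could be maintained by a gRPC `ServerInterceptor`, which grpc-spring does not provide for this out of the box), and in practice the wait would live in a shutdown hook that first stops accepting new calls:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class DrainingShutdown {
    // Stand-in for "number of active gRPC streams".
    static final AtomicInteger activeStreams = new AtomicInteger(0);

    // Poll until every stream has completed or the deadline passes.
    static boolean awaitDrain(long timeoutSeconds) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(timeoutSeconds);
        while (activeStreams.get() > 0) {
            if (System.nanoTime() >= deadline) {
                return false; // timed out with streams still open
            }
            Thread.sleep(50);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        activeStreams.incrementAndGet(); // one stream in flight

        // Simulate the last stream completing shortly after shutdown starts.
        Thread client = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            activeStreams.decrementAndGet();
        });
        client.start();

        // In a real app this wait would be registered via
        // Runtime.getRuntime().addShutdownHook(...), before stopping the server.
        boolean drained = awaitDrain(5);
        System.out.println("drained=" + drained);
    }
}
```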
Maybe also create an issue upstream in grpc-java and link it here. Maybe they can add a built-in variant as well, because I can't imagine that you are the only one having this problem.
Yes, I expected this to already be handled there. Creating an issue on grpc-java is a good idea.
I've created an issue: https://github.com/grpc/grpc-java/issues/11229
**The context:** We deploy our service in K8s and provide a gRPC streaming API, so the server can hold open connections for a period of time. We need CD to redeploy new versions of the service, but we want to prevent K8s from killing the service while there is an open gRPC stream.
**The question:** Is there support for gracefully shutting down the service only when there are no open connections? I see this: https://github.com/grpc-ecosystem/grpc-spring/blob/master/grpc-server-spring-boot-starter/src/main/java/net/devh/boot/grpc/server/serverfactory/GrpcServerLifecycle.java#L58 but I don't see where we check the state of the server itself.