Open o-shevchenko opened 6 months ago
Could you please try to reproduce this with v1.63.1 or v1.64.0? v1.63.0 contained a few bugs that were fixed in v1.64.0 and backported to v1.63.1: https://github.com/grpc/grpc-java/releases/tag/v1.63.1.
Thanks for the reply @sergiitk! Yes, I can reproduce it with version 1.63.1 as well.
Adding a shutdown hook that calls shutdown() and awaitTermination() on the gRPC server is the correct way to produce a graceful shutdown, as you have already elucidated. We have had discussions in the past on whether to provide this ability in the gRPC server but decided against it since we are a library, not a framework, and we don't control main.
Thanks, @kannanjgithub. But I'm not sure you understood the issue from the description. We already invoke awaitTermination(), but it doesn't work as expected. It ignores open streams and just kills the server even if the client is still reading data. We are forced to add additional logic to our shutdown hooks to check for open streams on the server before invoking awaitTermination().
Could you comment on whether this is expected behaviour?
Thanks!
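The "additional logic to check open streams" mentioned above is not shown in this thread. A minimal sketch of one possible shape follows, using only the JDK. `OpenStreamTracker` and its methods are hypothetical names, not part of gRPC or grpc-spring. It assumes the application itself calls `streamOpened()` when a streaming RPC starts and `streamClosed()` when it completes, fails, or is cancelled.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: counts streams that the application considers open. */
final class OpenStreamTracker {
    private int open; // guarded by "this"

    /** Call when a streaming RPC starts. */
    synchronized void streamOpened() {
        open++;
    }

    /** Call when a stream completes, fails, or is cancelled. */
    synchronized void streamClosed() {
        if (open > 0) {
            open--;
        }
        if (open == 0) {
            notifyAll(); // wake any thread blocked in awaitDrained()
        }
    }

    synchronized int openStreams() {
        return open;
    }

    /**
     * Blocks until no streams are open or the timeout elapses.
     * Returns true if drained, false on timeout or interrupt.
     */
    synchronized boolean awaitDrained(long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        try {
            while (open > 0) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }
}
```

In a shutdown hook, one option would be to call server.shutdown() first so no new RPCs are accepted, then `tracker.awaitDrained(...)`, and only then server.awaitTermination(...). The timeout would need to fit inside the pod's termination grace period.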
Added more details @kannanjgithub:
awaitTermination(), BUT the gRPC server is terminated immediately even if we still read data via stream.
We find it surprising that awaitTermination could have stopped working since it works in the examples code. Can you provide a test setup and share the GCP project with us to help debug the issue?
I think I may know what's going on here. I think the last RPCs were cancelled (or deadline exceeded). gRPC then enqueued a callback to an executor and terminated because there were no more RPCs. But your application hasn't necessarily finished its processing in those callbacks.
The easiest way to solve this also follows a best practice of providing a serverBuilder.executor() to gRPC to run callbacks so that you can limit the maximum number of threads. If you pass your own ExecutorService, then after gRPC's awaitTermination() returns true, you can wait for callbacks to complete.
// Just an example executor. gRPC uses Executors.newCachedThreadPool()
ExecutorService myExecutor = Executors.newFixedThreadPool(10);
Server server = ServerBuilder.forPort(blah)
    ...
    .executor(myExecutor)
    .build();
// In shutdown hook
server.shutdown();
server.awaitTermination(10, TimeUnit.SECONDS);
server.shutdownNow();
server.awaitTermination(10, TimeUnit.SECONDS);
// Now wait for all callbacks to complete. If you have server.awaitTermination()
// in your main(), you could do this there instead. It just needs to happen on a
// non-daemon thread.
myExecutor.shutdown();
myExecutor.awaitTermination(10, TimeUnit.SECONDS);
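The last two lines of the snippet above can be isolated and exercised without gRPC. The sketch below uses only `java.util.concurrent`. `ExecutorDrain.drain` is a hypothetical helper name, and the fallback to `shutdownNow()` on timeout is an assumption about what an application might want, not something prescribed in this thread. It also handles the `InterruptedException` that `ExecutorService.awaitTermination` declares, which the snippet above leaves out for brevity.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for waiting on callbacks that run on your own executor. */
final class ExecutorDrain {
    private ExecutorDrain() {}

    /**
     * Stops the executor from accepting new tasks, then waits for queued and
     * running tasks to finish. Returns true if everything finished in time.
     * On timeout or interrupt, running tasks are interrupted via shutdownNow().
     */
    static boolean drain(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown(); // already-submitted callbacks still run
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
            executor.shutdownNow(); // give up: interrupt whatever is left
            return false;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
    }
}
```

With the names from the snippet above, the call would be `ExecutorDrain.drain(myExecutor, 10, TimeUnit.SECONDS)`, placed after the server's own awaitTermination() has returned.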
What version of gRPC-Java are you using?
1.63.0
What is your environment?
RHEL Docker image, JDK 17. We use https://github.com/grpc-ecosystem/grpc-spring, which uses awaitTermination to shut down the server gracefully.
What did you expect to see?
The gRPC server supports graceful shutdown when we have open streams. We use gRPC streaming to read and write data via our microservice. We expect to be able to use K8s graceful shutdown to postpone the pod kill, so that reads/writes finish and all streams close first instead of the connection being dropped.
What did you see instead?
Even though we configured graceful shutdown for both the gRPC server and the K8s pod, we still see the gRPC server terminating immediately after SIGTERM, even if we invoke awaitTermination().
Steps to reproduce the bug
kubectl delete pod, or execute kill -TERM PID for the Java process inside your pod (it should have PID 1 if you started your Java app as the main process)
See issue: https://github.com/grpc-ecosystem/grpc-spring/issues/1110
See a similar problem described here: https://fedor.medium.com/shutting-down-grpc-services-gracefully-961a95b08f8