Closed chibenwa closed 3 weeks ago
It is not impossible that ignoring such exceptions might be OK
This java.util.concurrent.RejectedExecutionException: event executor terminated
is acceptable IMO as it was likely a race condition during the graceful shutdown process.
From ChatGPT (of course, we should not believe it 100% but I think it makes sense for our case):
The `RejectedExecutionException` during the shutdown process is a common issue when tasks are submitted after the event executors have begun shutting down.
If there are multiple threads involved in your application and some tasks are submitted concurrently with the shutdown process, it's possible for a task to be submitted after shutdownGracefully() has been called but before the event executor has completed its shutdown sequence.
Also the EventExecutorGroup.shutdownGracefully()
has a timeout to shutdown (https://netty.io/4.1/api/io/netty/util/concurrent/EventExecutorGroup.html#shutdownGracefully-long-long-java.util.concurrent.TimeUnit-):
* @param timeout the maximum amount of time to wait until the executor is {@linkplain #shutdown()}
* regardless if a task was submitted during the quiet period
IMO the EventExecutorGroup
received a few more tasks during the quiet period, it tried to process them but when the timeout is reached, the EventExecutorGroup
is forced to be shutdown therefore the remaining requests are rejected with the above error because the EventExecutorGroup
is not running anymore.
For short: likely the Netty executor can not be more graceful (after a timeout) and decided to reject the pending request. No big deal.
To be fair I am unsure what to do: if the eventLoop is shut down first then the executor cannot send replies back to the client. if the executor is shut down first then the event loop might receive network requests it cannot handle.
I can not search for the best practices for this case (not many materials available).
IMO shutting the executor first would be better: the event loop might receive network requests it cannot handle
-> the client is aware of the failure (connection close?) and can retry later. This sounds fine to me.
If we shut down the event loop first: the executor cannot send replies back to the client
-> the client is not aware of the actual result and may attempt a retry (even if the executor finished the request), then the client may send 2 duplicated mails for example. This sounds misleading to me.
WDYT?
IMO shutting the executor first would be better
This is what is done today BTW
Thanks for getting a look.
Maybe we could...
This is what is done today BTW
Actually, I found that we are shutting down the event loop first. https://github.com/apache/james-project/pull/2313
or even upon shutdown have BasicChannelInboundHandler silent the exception (in which case BasicChannelInboundHandler needs to be aware of the shutdown).
Please if this is what you want ^^. Actually that throw exception behavior is configurable.
Caused by https://github.com/apache/james-project/pull/2167
To be fair I am unsure what to do:
if the executor is shut down first then the event loop might receive network requests it cannot handle.
Experiment a bit and see what the Netty best practices are. It is not impossible that ignoring such exceptions might be OK, as the executor tasks itself completes, preserving the overall consistency.
DOD determine the best course of action after experimenting a bit / reviewing the litterature on the topic.