Closed aberres closed 2 years ago
Oh, maybe just moving `running = True` above the `try_unblock_signals` call will fix this:
https://github.com/Bogdanp/dramatiq/blob/75579439172ea126efb4f83c753b0b513afb03ad/dramatiq/cli.py#L408-L412
Or probably moving `running = True` to above the definition of `term_handler` makes more sense (e.g., to line 390).
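To illustrate why the ordering matters, here is a minimal sketch. It is not Dramatiq's actual code: `buggy_worker` and `fixed_worker` are hypothetical names, and the early SIGTERM is simulated by calling the handler directly instead of delivering a real signal. It assumes the handler reads `running` through `nonlocal`, so a call that happens before the assignment finds the variable unbound.

```python
def buggy_worker():
    """Handler is defined before `running` is bound in the enclosing scope."""

    def term_handler():
        nonlocal running
        if running:  # raises NameError while `running` is still unbound
            running = False

    # Simulated SIGTERM arriving right after the handler is installed,
    # before the worker reaches the assignment below.
    term_handler()
    running = True
    return running


def fixed_worker():
    """Same flow, but `running` is bound before the handler can ever run."""
    running = True

    def term_handler():
        nonlocal running
        if running:
            running = False

    # The same early "signal" now just requests a clean stop.
    term_handler()
    return running


if __name__ == "__main__":
    try:
        buggy_worker()
    except NameError as exc:
        print("buggy ordering failed:", exc)
    print("fixed ordering, still running?", fixed_worker())
```

With the assignment moved above the handler definition, an early signal only flips the flag, so the worker would skip its main loop rather than crash.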
While @FinnLidbetter's change seems appropriate, I wonder if you could give us more details about your setup, @aberres. What OS are you on, and how are you stopping the process? Does it always happen? The window in which this can happen at the moment (given the current implementation) seems extremely narrow, and I'm unable to reproduce the issue.
Yes, sure.
This happens in Ubuntu Docker containers running on GKE (Google Kubernetes Engine). Since we shut down and restart worker processes depending on the available workload, and additionally spin up a complete environment for every branch in CI, we get quite a lot of worker shutdowns and restarts. My gut feeling is that we are not seeing this issue most of the time.
I had another look at the logs and I'm a bit puzzled. It seems as if this message is not thrown during shutdown, but during startup.
If I read the logs correctly, sometimes SIGTERM is triggered pretty much instantly.
Sorry for the screenshots, but getting human-readable text logs out of the Google Logs Explorer does not seem to be straightforward.
Note the second line, right after starting:
Some time later (we are waiting for a database instance to come up) the startup process continues as expected:
And then boom
> If I read the logs correctly, sometimes SIGTERM is triggered pretty much instantly.
OK, that's what I suspected. In that case, Finn's change should fix this problem. Thanks for the details!
What version of Dramatiq are you using?
1.12.1
What did you do?
Since the update from 1.12.0 to 1.12.1 we are seeing the following errors:
I have not been able to investigate further yet, but maybe it is related to 8da51575ef402c91069f48cf2a524075cd3ffb8c?