Closed jgaskins closed 11 months ago
Yes, you're right: Runner#stop will do what you want.
The implementation of this isn't quite universal -- it depends on how you're booting up your worker. The demo script has this code in it:
Signal::INT.trap do
Mosquito::Runner.stop
end
Mosquito::Runner.start(spin: false)
You could certainly trap Signal::QUIT and Signal::TERM instead or in addition. However, Mosquito::Runner.stop is not captive: it returns immediately rather than blocking until shutdown completes. I don't think that should be a problem -- though it may mean you need to implement some sort of spin lock on your own to wait for shutdown.
> it may mean you need to implement some sort of spin lock on your own to wait for shutdown.
This is one of the reasons I posted this issue, actually. If I start the runner with runner.start(spin: false) and stop it with runner.stop, how do I know when that runner is done? I don't see a way to inspect its running state.
Interesting, I see what you mean.
The start(spin: false) interface doesn't really please me, but I don't know what I should replace it with... and I think the lack of a pattern to mimic leaves me without a good vision for what stop should look like. What would you want to do here? The demo script I linked above spins around a check on keep_running, but that is also not well named anymore because the runner has more granularity than simply running and not-running.
Do you have a suggestion for an interface that would accomplish what you're thinking? Can you share what you are currently doing to work around the lack of functionality?
I've been trying to come up with something for this for the past couple of days. I feel like the default functionality with start is a solid interface (start blocks until it's finished), and maybe both start and stop could block until the runner exits so that regardless of how you structure things it'll just work, but I don't know if the juice is worth the squeeze on that.
I agree that spin: false isn't ideal, though. Since blocking operations in Crystal can be moved to the background with spawn, it may not actually be necessary to solve this in Mosquito.
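The spawn approach mentioned above could look roughly like the following. This is a minimal sketch and does not use Mosquito: BlockingRunner is a hypothetical stand-in for any runner whose start blocks until stop is called, and the finished channel is an illustrative way to learn when start has returned.

```crystal
# Hypothetical stand-in for a runner whose `start` blocks until `stop` is called.
# This is NOT Mosquito's API; it only illustrates the spawn pattern.
class BlockingRunner
  @stop_requested = false

  # Blocks the calling fiber until a stop has been requested.
  def start : Nil
    until @stop_requested
      sleep 10.milliseconds # pretend to poll for and run a job
    end
  end

  def stop : Nil
    @stop_requested = true
  end
end

runner = BlockingRunner.new
finished = Channel(Symbol).new

# Move the blocking call into its own fiber and report when it returns.
spawn do
  runner.start
  finished.send(:done)
end

sleep 50.milliseconds # the main fiber is free to do other work here
runner.stop
result = finished.receive # blocks until `start` has returned
puts "runner finished"
```

The channel gives the caller a place to wait without polling a flag, which is the piece start(spin: false) leaves to the user.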
I have a strong preference for a batteries-included type interface with mosquito, even though it's not all there yet.
I like the idea of a blocking stop command, but I'd probably make it optional as with start. stop(wait: true) would wait for the shutdown before returning and stop() would return immediately.
Regardless, the runner's notion of state needs to be more robust so it can handle at least: running, shutting down, stopped.
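To make the idea concrete, here is a rough sketch of a runner with explicit states and an optional blocking stop. It is an assumption-laden illustration, not Mosquito's implementation: SketchRunner, its State enum, and the internal channel are all hypothetical names.

```crystal
# Hypothetical sketch of a runner with explicit lifecycle states.
# Not Mosquito's actual implementation.
class SketchRunner
  enum State
    Idle
    Running
    Stopping
    Stopped
  end

  getter state : State = State::Idle

  def initialize
    # Buffered so the work loop never blocks when nobody is waiting.
    @finished = Channel(Nil).new(1)
  end

  # Starts the work loop in a background fiber and returns immediately.
  def start : Nil
    @state = State::Running
    spawn do
      while @state.running?
        sleep 10.milliseconds # stand-in for executing one job to completion
      end
      @state = State::Stopped
      @finished.send(nil)
    end
  end

  # Requests shutdown. With `wait: true`, blocks until the loop has exited.
  def stop(wait : Bool = false) : Nil
    return unless @state.running?
    @state = State::Stopping
    @finished.receive if wait
  end
end

runner = SketchRunner.new
runner.start
sleep 30.milliseconds
runner.stop(wait: true)
puts runner.state # => Stopped
```

Because the loop only checks the state between jobs, a job that is already executing finishes before the runner reports Stopped.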
> I have a strong preference for a batteries-included type interface with mosquito
:100: For folks who came to Crystal from languages where the convention is to load code at runtime using a CLI provided by a framework, having to write our own entrypoint into the background-job runner which loads all of our jobs is new, so reducing that cognitive load as much as is feasible goes a long way.
@jgaskins 51904a05674757410268d183dd78fb2259ddbad7 is merged and includes improvements to the Runner interface. You can now call Mosquito::Runner.stop(wait: true) and it will not return until it's finished working.
You can then modify your worker.cr with a signal handler which will respond to SIGINT or whatever is right for your deployment.
See the Runner docs for details.
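A worker.cr along those lines might look like the sketch below. It assumes only the two calls mentioned in this thread, Mosquito::Runner.start(spin: false) and Mosquito::Runner.stop(wait: true). The done channel and the choice of TERM are illustrative, and connection settings and job requires are left out, so check the Runner docs before relying on it.

```crystal
require "mosquito"
# Require your job definitions and configure Mosquito here (omitted).

# Illustrative: lets the main fiber wait until shutdown has completed.
done = Channel(Nil).new

# TERM is assumed here because it is what Kubernetes sends first;
# trap whichever signal fits your deployment.
Signal::TERM.trap do
  # Per the comment above, this does not return until the runner
  # has finished working.
  Mosquito::Runner.stop(wait: true)
  done.send(nil)
end

# Start without spinning, then block on the channel instead of polling.
Mosquito::Runner.start(spin: false)
done.receive
```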
Kubernetes terminates processes in running containers using a TERM signal, and then after terminationGracePeriodSeconds it sends a KILL signal to forcefully shut it down.
My Mosquito deployment is staying active and processing jobs until it receives that KILL, which is common and I also have to work around this in my web apps by closing the HTTP::Server instance. It looks like we would use Mosquito::Runner.stop here, but we don't want to exit immediately because that can leave jobs in a partially processed state.
Instead, if Mosquito::Runner.stop blocks (optionally?) until all currently executing jobs are complete, that should let them shut down gracefully but quickly.