rails / solid_queue

Database-backed Active Job backend
MIT License

Readiness status #179

Closed andyjeffries closed 4 weeks ago

andyjeffries commented 6 months ago

Is there any way of detecting (from outside of the process) when the worker process is up and running and able to start processing jobs?

During deployment on Kubernetes a ReadinessProbe will not shut down the old worker until the new one is ready, but how can we determine that from outside?

Can we make it write to a file when it's up and ready to process jobs?

rosa commented 6 months ago

Hey @andyjeffries, sorry for the delay replying to this one! I wonder if you could use the pidfile that the supervisor sets as your readiness probe. See the supervisor_pidfile configuration option mentioned here. Initially, that was how we were going to run Solid Queue in k8s in the cloud, but we never got to do that because we moved to on-premises using Kamal, so our setup is much simpler. The pidfile is set up right before the supervisor starts the workers, so strictly speaking they aren't ready to process jobs yet. However, the previous supervisor also waits for a shutdown timeout before actually terminating all of its workers, so I think it should be OK.
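
For illustration, a readiness probe built on the pidfile might look roughly like the sketch below. It is not part of Solid Queue: the `supervisor_ready?` helper and the example path are hypothetical, and it assumes the pidfile contains the supervisor's PID as plain text. It treats the supervisor as ready when the file exists and the PID in it refers to a live process.

```ruby
# Hypothetical helper, not a Solid Queue API.
# Assumes the pidfile holds the supervisor's PID as plain text.
def supervisor_ready?(pidfile)
  return false unless File.exist?(pidfile)

  pid = Integer(File.read(pidfile).strip, exception: false)
  return false if pid.nil? || pid <= 0

  # Signal 0 delivers nothing; it only checks that the process exists.
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  # No process with that PID: the pidfile is stale.
  false
rescue Errno::EPERM
  # The process exists but belongs to another user.
  true
end

# Example use in a probe script (the path is an assumption):
#   exit(supervisor_ready?("tmp/pids/solid_queue_supervisor.pid") ? 0 : 1)
```

As noted above, this only shows that the supervisor has booted; the workers may still be starting when the check first passes.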

andyjeffries commented 6 months ago

Thanks @rosa, that should work OK for now. It would be great if there were a hook that could be executed once it's up and running and ready to process jobs...

morgoth commented 4 months ago

@rosa How do you define a healthcheck for Kamal? I'm wondering how to do it in the most reliable way, especially with the recent change in https://github.com/basecamp/kamal/pull/740

rosa commented 4 months ago

@morgoth you could use Kamal's healthcheck option together with the supervisor_pidfile option, and just check for the file's presence. This only works for deploys, though: the healthcheck does nothing once the deploy has finished. If you meant a health check more in the sense of continuous monitoring, we don't have anything like that. So far we've been fine just monitoring other things, like the number of pending jobs in specific queues.

In any case, for version 1.0 I'm planning to improve the existing supervisor_pidfile approach so that we can detect when all workers are ready to process jobs and all dispatchers are ready to dispatch.

zerobearing2 commented 3 months ago

Maybe provide hooks into the SolidQueue boot/shutdown callbacks? This would allow custom logic for readiness detection (vs using the supervisor pid). We used this exact approach with Sidekiq on k8s, and it worked well enough.

andyjeffries commented 3 months ago

We used the approach @zerobearing2 describes too and it worked perfectly for us.

rosa commented 1 month ago

Hey everyone, thanks a lot for your patience here! I've got a simple implementation of lifecycle hooks, just for start and stop, over in #317. Do you think this could work for you? You'd need to register a block with SolidQueue.on_start, which would be called right before forking the other processes, and use it like you use Sidekiq's on(:startup).

Let me know if I missed anything!

andyjeffries commented 1 month ago

Absolutely perfect for me. We'd just create a file in the start hook and remove it in the stop hook, then check for the presence of that file to determine readiness (it's simplistic, but it's enough to tell us when a slow-booting process is up so that we can shut down the old ones). Many thanks @rosa!
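
A minimal sketch of that idea, assuming it lives in something like a Rails initializer. The marker path and the `mark_ready`/`clear_ready` helpers are hypothetical names, and `SolidQueue.on_stop` is assumed to be the stop-side counterpart of the `SolidQueue.on_start` hook mentioned above.

```ruby
require "fileutils"

# Hypothetical marker path; the thread does not name one.
READY_FILE = ENV.fetch("SOLID_QUEUE_READY_FILE", "tmp/solid_queue_ready")

# Create the marker file (and its directory) to signal readiness.
def mark_ready(path = READY_FILE)
  FileUtils.mkdir_p(File.dirname(path))
  FileUtils.touch(path)
end

# Remove the marker file; does nothing if it is already gone.
def clear_ready(path = READY_FILE)
  FileUtils.rm_f(path)
end

# Register the hooks only when Solid Queue is loaded and provides them.
# on_stop is an assumption based on the "start and stop" hooks described above.
if defined?(SolidQueue) && SolidQueue.respond_to?(:on_start)
  SolidQueue.on_start { mark_ready }
  SolidQueue.on_stop  { clear_ready }
end
```

A Kubernetes readiness probe could then test for the marker file's presence. Since the start hook is described as running right before the other processes are forked, the file signals that the supervisor has booted rather than that every worker is already polling.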

rosa commented 4 weeks ago

I am going to close this one, as I have already shipped these hooks. Thanks everyone for the input 🙏