collectiveidea / delayed_job

Database based asynchronous priority queue system -- Extracted from Shopify
http://groups.google.com/group/delayed_job
MIT License
4.81k stars, 955 forks

delayed_job script and latest daemons gem #1172

Open stuzart opened 2 years ago

stuzart commented 2 years ago

Following an upgrade to Rails 6, I was no longer able to stop running workers with the stop command: the workers simply kept running. ./script/delayed_job status also reported that nothing was running.

I think I've tracked this down to a change in the daemons gem, which had also been upgraded (to 1.4.1). Its interface has changed slightly, and it no longer recognises the pid files unless an additional pid delimiter is provided (https://github.com/thuehlinger/daemons/blob/master/lib/daemons/pidfile.rb#L34). This may also be why the stop and status commands don't find the pids.

Rolling back the version (to 1.1.9 which I'd been using previously) fixed things for me.
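For anyone wanting to try the same rollback, a minimal Gemfile sketch follows. The version number is the one reported above; the exact layout of your Gemfile is an assumption, and a later comment in this thread reports that the rollback only fixes stop and status, not restart.

```ruby
# Gemfile (fragment; sketch only)
gem "delayed_job"

# Pin daemons to the version reported as working for stop/status in this issue.
# This is a workaround, not an upstream recommendation.
gem "daemons", "1.1.9"
```

After editing the Gemfile, `bundle update daemons` should resolve to the pinned version.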

The docs recommend installing the daemons gem, but don't specify a version.

stuzart commented 2 years ago

Just to add: I start the workers with individual start commands, passing the queue name and an index with -i. I am able to stop and query the individual workers by including -i with the stop and status commands.

I also noticed that when I run delayed_job with no arguments, I am shown an option (not listed in the help from -h) that allows pid_delimiter to be passed, but I cannot get it to be recognised as part of the full command.

cat5inthecradle commented 3 months ago

I can reproduce this with delayed_job 4.1.11 and daemons 1.4.1.

EDIT: I am still experiencing the restart problem described below; it is not resolved by reverting to daemons 1.1.9. Only stop and status are fixed by rolling back.


Additionally, and even more painfully, there is a problem with restart:

delayed_job -n 2 restart    # Starts 2 workers
delayed_job -n 2 restart    # Stops both workers correctly, and starts two new ones
delayed_job -n 4 restart    # Stops both workers correctly, and starts four new ones

delayed_job -n 2 restart    # Stops ONLY TWO workers, and starts two new ones, leaving two old ones running

We have been using restart when we deploy new code, and this is leaving workers with old code running.
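As a way to check for this after a deploy, here is a rough Ruby sketch that lists pid files belonging to workers beyond the requested count. It assumes the workers started with -n write pid files named delayed_job.<index>.pid into a single directory such as tmp/pids; that naming may differ with other daemons versions or a custom pid delimiter. The helper name leftover_pid_files is hypothetical and not part of delayed_job.

```ruby
# Hypothetical helper, not part of delayed_job.
# Assumes pid files are named "delayed_job.<index>.pid" inside pid_dir.
# Returns the paths of pid files whose worker index is >= worker_count,
# i.e. workers that a restart with "-n worker_count" would not address.
def leftover_pid_files(pid_dir, worker_count)
  Dir.glob(File.join(pid_dir, "delayed_job.*.pid")).select do |path|
    # Extract the numeric index; skip files that do not match the assumed pattern.
    index = File.basename(path)[/\Adelayed_job\.(\d+)\.pid\z/, 1]
    index && index.to_i >= worker_count
  end.sort
end
```

For the sequence above, calling `leftover_pid_files("tmp/pids", 2)` after the final restart would, under these assumptions, return the pid files for indices 2 and 3, which could then be inspected or stopped by hand.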