Wait for other pool workers to shut down before forking up new workers

resque / resque-pool

Quickly fork a pool of resque workers, saving memory (w/REE) and monitoring their uptime.

http://rubygems.org/gems/resque-pool

MIT License


Wait for other pool workers to shut down before forking up new workers #139

Open nevans opened 9 years ago

nevans commented 9 years ago

Re: #132 and #137, the zero-downtime approach starts a new pool while the old pool is still running. Although this is usually fine, it can lead to issues in memory-constrained environments. Our default config loader should take workers from other pools (or orphaned workers with no pool) into account.

See https://github.com/backupify/resque-pool/compare/nevans:master and the discussion on #132. I prefer reading Resque.redis.smembers("workers") to looking at ps, but ps might be more fool-proof.
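As a rough illustration of the `smembers("workers")` idea, here is a minimal sketch of how a config loader might subtract workers it does not own from its budget. It assumes Resque's worker id format of `hostname:pid:queues`; the helper names `foreign_worker_count` and `workers_to_fork` are hypothetical and are not part of resque-pool.

```ruby
# Hypothetical helpers, not resque-pool API.
# Resque registers each worker under an id like "hostname:pid:queue1,queue2".

# Count workers on this host that are not children of the current pool.
def foreign_worker_count(worker_ids, hostname:, own_pids:)
  worker_ids.count do |id|
    host, pid, _queues = id.split(":", 3)
    host == hostname && !own_pids.include?(pid.to_i)
  end
end

# How many more workers this pool may fork without exceeding `configured`
# once its own children and other pools' (or orphaned) workers are counted.
def workers_to_fork(configured, worker_ids, hostname:, own_pids:)
  others = foreign_worker_count(worker_ids, hostname: hostname, own_pids: own_pids)
  [configured - own_pids.size - others, 0].max
end

# In a real pool the ids would come from Redis, for example:
#   worker_ids = Resque.redis.smembers("workers")
#   hostname   = Socket.gethostname   # requires "socket"
```

This only sees workers that have registered themselves in Redis, so stale or unregistered processes would be missed; that is the trade-off against inspecting `ps`.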

jchatel commented 8 years ago

Using Heroku & resque-pool, how do I signal USR2 so jobs complete but no new resque jobs are executed?

I'm happy (and would prefer) not to process enqueued jobs with old code when I do a release, but I don't see any help on that.

I would like to send USR2 signal to all my resque workers, wait until working jobs reach 0, then do my release and restart workers (resque-pool) to resume work. Just can't figure out how to do it or if I'm missing something obvious :/
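For reference, the "wait until working jobs reach 0" step could be sketched as a simple polling loop. This is only a sketch under assumptions: `wait_until_idle` is a hypothetical helper, the count is supplied by the caller, and it assumes the workers have already been sent USR2 from a process on the same host (which may not be possible on Heroku).

```ruby
# Hypothetical helper: poll until no worker reports an in-progress job,
# or give up after `timeout` seconds. Returns true if idle, false on timeout.
def wait_until_idle(working_count, timeout: 300, interval: 1, sleeper: ->(s) { sleep(s) })
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if working_count.call.zero?
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleeper.call(interval)
  end
end

# Possible usage, assuming the worker pids are known:
#   worker_pids.each { |pid| Process.kill("USR2", pid) }  # stop taking new jobs
#   wait_until_idle(-> { Resque.working.size })           # jobs still running
```

Resque workers resume processing on CONT, so a release that gets aborted could undo the pause without restarting the pool.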

mikz commented 8 years ago

Any plans to do a stable release of this? I'd say it is better to release with double memory usage than not to release at all. I see it as 0.7.dev, but it would be nice to polish things up and release a stable version. I'll be trying it over the next few days.

jchatel commented 8 years ago

This is the workaround I used:

I added a Resque "before_enqueue" hook. Inside, I check for a flag in redis to know whether I have paused processing.

If the flag is set, I simply push the job into the future using resque-scheduler (with another flag for how long to defer, though that's not really needed).

When I do a release, I stop the scheduler, set the flag to defer jobs, and stop the workers once no jobs are being worked on anymore.

At which point, I can happily release.

This is the code I have in my hook.

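    # Check whether processing is paused (flag set in redis at release time).
    defer = $redis.get("resque_defer")
    if defer.present?
        # Delay in seconds; fall back to 60 when unset or not a positive integer.
        defer_time = $redis.get("resque_defer_time").to_i
        defer_time = 60 if defer_time <= 0

        puts "[DEFERRED FROM before_enqueue]: #{options}"
        classtype = Object.const_get(options['class_type'])
        # options and push_in come from the surrounding hook (not shown here).
        push_in(defer_time, classtype, options)
        return false  # a false return from before_enqueue cancels the enqueue
    end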

mikz commented 8 years ago

@jchatel I suppose the hot-swap does that? It signals the old cluster to stop working and starts a new cluster, which will process new jobs.

mikz commented 8 years ago

I made some changes to my fork to make it easier to debug:

- daemonize later, so it quits in the console if redis does not work, for example
- print errors to the console if there are problems during boot
- ping redis before starting workers (so there is a single error rather than several)
- use Process.daemon
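
The ordering in the list above could look roughly like the following. This is a minimal sketch, not the code in the fork: `boot_pool` and its parameters are hypothetical, and `redis` is assumed to be any client object that responds to `ping`.

```ruby
# Hypothetical boot sequence: check redis first, daemonize only afterwards,
# so a connection failure is reported once on the console.
def boot_pool(redis, start_workers:, daemonize: -> { Process.daemon(true) }, err: $stderr)
  begin
    redis.ping
  rescue StandardError => e
    err.puts "resque-pool: cannot reach redis: #{e.class}: #{e.message}"
    return false
  end
  daemonize.call      # detach from the terminal only after the check passed
  start_workers.call  # fork workers in the daemonized process
  true
end
```

`Process.daemon(true)` keeps the current working directory; passing the daemonize step in as a callable is only there so the sequence can be exercised without actually detaching.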

https://github.com/nevans/resque-pool/compare/master...3scale:master