resque / resque-pool

quickly fork a pool of resque workers, saving memory (w/REE) and monitoring their uptime.
MIT License
455 stars, 152 forks

Wait for other pool workers to shut down before forking up new workers #139

Open nevans opened 9 years ago

nevans commented 9 years ago

Re: #132 and #137, the zero-downtime approach starts a new pool while the old pool is still running. Although this is usually fine, it can lead to issues in memory-constrained environments. Our default config loader should take workers from other pools (or orphaned workers with no pool) into account.

See the discussion on #132. I prefer reading Resque.redis.smembers("workers") to parsing ps output, but ps might be more foolproof.
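For illustration only, here is a minimal sketch of how the Redis-based check might look. It relies on Resque registering worker IDs in the form "hostname:pid:queues"; the helper name `other_workers_on_host` and the sample IDs are hypothetical, and nothing here is resque-pool's actual config loader. The list is passed in as plain strings so the sketch runs without a Redis connection.

```ruby
# Hypothetical helper (not part of resque-pool): given the worker IDs that
# Resque has registered, return those on `hostname` whose PID does not
# belong to the current pool.
# Worker IDs are assumed to look like "hostname:pid:queue1,queue2".
def other_workers_on_host(worker_ids, hostname, own_pids)
  worker_ids.select do |id|
    host, pid, _queues = id.split(":", 3)
    host == hostname && !own_pids.include?(pid.to_i)
  end
end

# With a live Resque connection the IDs would come from Redis, e.g.:
#   ids = Resque.redis.smembers("workers")
# Sample data (made up for this example):
ids = ["app1:101:high,low", "app1:202:low", "app2:303:low"]

puts other_workers_on_host(ids, "app1", [101]).inspect
# prints ["app1:202:low"]
```

One caveat with this approach: the set in Redis can contain stale entries for workers that died without unregistering, which is presumably why ps might be the more foolproof source.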

jchatel commented 8 years ago

Using Heroku & resque-pool, how do I signal USR2 so jobs complete but no new resque jobs are executed?

I'm happy not to (and would prefer not to) process enqueued jobs with old code when I do a release, but I don't see any help on that.

I would like to send USR2 signal to all my resque workers, wait until working jobs reach 0, then do my release and restart workers (resque-pool) to resume work. Just can't figure out how to do it or if I'm missing something obvious :/
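A rough sketch of the sequence described above, under some loud assumptions: the script runs on the same host as the workers and already knows their PIDs (which may well not hold on Heroku), and the helper name `pause_and_drain` and its parameters are made up for this example. Resque workers stop picking up new jobs on USR2 and resume on CONT; how busy workers are counted is left to the caller.

```ruby
# Hypothetical helper: send USR2 to each worker, then poll until no worker
# is busy. Returns true once idle, false if `timeout` seconds pass first.
#
# worker_pids - PIDs of the Resque worker processes (assumed known)
# busy_count  - callable returning how many workers are processing a job
# killer      - callable used to deliver the signal (defaults to Process.kill)
def pause_and_drain(worker_pids, busy_count,
                    killer: Process.method(:kill),
                    poll_interval: 5, timeout: 600)
  # USR2 tells a Resque worker not to start any new jobs.
  worker_pids.each { |pid| killer.call("USR2", pid) }

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until busy_count.call.zero?
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
    sleep poll_interval
  end
  true
end

# With Resque loaded, the call might look like this (pids is an assumption):
#   pause_and_drain(pids, -> { Resque.working.size })
```

After the release, sending CONT to the same PIDs (or simply restarting the pool) would resume processing.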

mikz commented 8 years ago

Any plans to do a stable release of this? I'd say it is better to release with double memory usage than not to release at all. I see it as, but it would be nice to polish things and release a stable one. I'll be trying it over the next few days.

jchatel commented 8 years ago

This is the workaround I used:

I added a Resque "before_enqueue" hook. Inside, I check for a flag in redis to know if I paused the processing or not.

If the flag is set, I simply push the job into the future using resque-scheduler (a second flag holds the amount of time to defer, but it's not really needed).

When I do a release, I stop the scheduler, set the flag to defer jobs, and stop the workers once no jobs are being processed anymore.

At which point, I can happily release.
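As a sketch only, the release side of this could be wrapped up roughly as follows. The key names "resque_defer" and "resque_defer_time" are the ones the hook reads; everything else is an assumption for illustration: the helper name `with_deferred_enqueue`, the `store` object (anything responding to `set` and `del`, such as a Redis client), and clearing the flags once the release block finishes.

```ruby
# Hypothetical helper: set the defer flags, wait until no worker is busy,
# run the release steps given as a block, then clear the flags again.
#
# store      - object responding to set(key, value) and del(key)
# busy_count - callable returning how many workers are processing a job
def with_deferred_enqueue(store, busy_count, defer_seconds: 60, poll_interval: 5)
  store.set("resque_defer", "1")
  store.set("resque_defer_time", defer_seconds.to_s)

  # Wait for in-flight jobs to finish before releasing.
  sleep poll_interval until busy_count.call.zero?

  yield # stop the workers, release, restart the pool
ensure
  # Assumption: the flags are removed afterwards so enqueueing resumes.
  store.del("resque_defer")
  store.del("resque_defer_time")
end

# Possible usage, with $redis and Resque loaded:
#   with_deferred_enqueue($redis, -> { Resque.working.size }) { run_release }
# where run_release stands in for whatever the deploy actually does.
```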

This is the code I have in my hook.

    # Check whether enqueueing is currently deferred (flag set in Redis).
    defer = $redis.get("resque_defer")
    if defer.present?
        # Seconds to defer; fall back to 60 if the value is missing or invalid.
        defer_time = $redis.get("resque_defer_time").to_i
        defer_time = 60 if defer_time <= 0

        puts "[DEFERRED FROM before_enqueue]: #{options}"
        classtype = Object.const_get(options['class_type'])
        push_in(defer_time, classtype, options)
        # Returning false from the hook keeps the job out of the queue for now.
        return false
    end

mikz commented 8 years ago

@jchatel I suppose the hot-swap does that? It signals the old cluster to stop working and starts a new cluster that will process new jobs.

mikz commented 8 years ago

I made some changes to my fork, to make it easier to debug.