Refactor internals to use concurrent-ruby for v2

brandonhilkert / sucker_punch

Sucker Punch is a Ruby asynchronous processing library using concurrent-ruby, heavily influenced by Sidekiq and girl_friday.

MIT License

2.64k stars 114 forks source link

Refactor internals to use concurrent-ruby for v2 #145

Closed brandonhilkert closed 8 years ago

brandonhilkert commented 8 years ago

The bulk of the work has been done. In order to release v2.0, the following needs to be addressed:

[x] Submit PR to Rails to restrict sucker_punch versioning to pre-v2 (merged)
[x] Publish blog post about upcoming release and API changes (Week of Jan. 4th)
[x] Potentially address shutdown modes (in progress)
[x] Add post install hooks to warn about API changes, specifically perform_async and perform_in
[x] Review documentation
[x] Document async_syntax include and method in post install message
[ ] Add additional testing to queue.rb
[x] Remove configuration of shutdown_handler and remove documentation
[x] Add configuration of shutdown_timeout and document
[x] Sort out correct shutdown behavior
[x] Release beta

jdantonio commented 8 years ago

So it turns out that I unintentionally lied to you over here. My explanation of the#shutdown method was incorrect. I was basing my explanation on my original Ruby implementation of thread pool. Later another team member looked at the internals of Java's ThreadPoolExecutor and learned that my implementation was not in line with that so he then refactored our Ruby thread pool. So, let me try this again... The #shutdown method does allow all jobs in the queue to be finished before the pool is shutdown. The #kill method, as I stated earlier, brutally kills all the threads in the thread pool. The scenario that is not covered is to let all running tasks complete but do not process any jobs in the queue. That one is easy to handle, however. We just have the job decorator Job.__run_perform check to system state before it actually runs the job. If the system is in a "soft" shutdown state the decorator doesn't post the job. If it's in the "none" shutdown state it allows the job to post. In both cases we use the thread pool #shutdown method then wait for termination.

This is what I'll spike. With this approach I can handle both the race condition I mentioned earlier and handle aggregate wait time better. I should have a PR in later this evening.

brandonhilkert commented 8 years ago

@jdantonio Got it! With regards to the different states, it's be cool to be able to document those, but I want to ship with one default shutdown handler, which is the "soft" state mentioned earlier. I can throw the others up in the README if others want different modes, but I don't think I want to ship it with them built-in.

brandonhilkert commented 8 years ago

@jdantonio I'm thinking about the shutdown modes again and the current state waits for 8 seconds. If I think about the use cases, I can't imagine a scenario where someone would want to wait infinitely for jobs to completely. Likewise, the hard mode we initially discussed ran kill against all of them.

I'm thinking I build in a configuration for shutdown_timeout, default to 8 (for heroku) and then allow people to set that themselves. With a really small timeout (< 1), it almost gives you the hard model behavior that runs kill instantly. Setting to a very large timeout, would let the queues run through the remaining jobs.

I'd still allow users to configure the shutdown_handler if they chose to, but figure this would cover 99% of the use cases.

In your experience, you think allowing configuration of the shutdown_handler and shutdown_timeout would cover it for most?

jdantonio commented 8 years ago

My guess is that shutdown_timeout is all you need. I'm a little wary of letting your users configure a custom shutdown_handler at first. That's pretty advanced stuff and it requires knowing the internals of both sucker_punch and concurrent-ruby. As I've learned the hard way, it's easier to add features than take them away. Perhaps start with just shutdown_timeout and add more options later once people start using it and discover more use cases?

brandonhilkert commented 8 years ago

Good idea. I think in the beginning, i was thinking the "other" modes would be more complex, but your implementation seems like it'd be the same in all cases, just depends on how much of a grace period you allow for the queues to empty. I'll remove it the configuration option and add the setting of the timeout.

jdantonio commented 8 years ago

If you decide to implement the mode for finishing all jobs in the queue, I think it should pretty easy. Just change the RUNNING constant from a boolean to an AtomicReference and set the value to a symbol (:running, :hardstop, :softstop, :gentlestop, whatever), use compare_and_set to change it, then update the job processing function to run/skip the job based on the symbol. I haven't coded that but I think it should work.