brandonhilkert / sucker_punch

Sucker Punch is a Ruby asynchronous processing library using concurrent-ruby, heavily influenced by Sidekiq and girl_friday.
MIT License

can a job instance be used for multiple jobs? #46

Closed bughit closed 10 years ago

bughit commented 10 years ago

something like this:

class AsyncJobs
  include SuckerPunch::Job

  def job_a
  end

  def job_b
  end
end

ASYNC_JOBS = AsyncJobs.new.async

ASYNC_JOBS.job_a

ASYNC_JOBS.job_b

or is there something special about the perform method?

can job instances be long-lived (the ASYNC_JOBS global constant in the above example), and would there be any forking issues?

what are the implications of the MRI GIL? I'm guessing all the threads will effectively run serialized, in context-switched time slices?

brandonhilkert commented 10 years ago

There is nothing special about perform. I wouldn't instantiate the job before you actually use it. Forking's not an issue, but passing a global around isn't recommended.

Your thought about the GIL is accurate. However, you should see some benefit just from the fact that certain jobs will be waiting for I/O.
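For reference, the documented pattern instantiates a small job class at the point of use. A minimal sketch (LogJob and the "login" argument are illustrative names, not from this thread):

require "sucker_punch"

class LogJob
  include SuckerPunch::Job

  def perform(event)
    # the actual work goes here: a log write, an API call, etc.
    puts "tracked #{event}"
  end
end

# instantiate when you enqueue, rather than holding a global instance
LogJob.new.async.perform("login")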

bughit commented 10 years ago

I wouldn't instantiate the job before you actually use it. Passing a global around isn't recommended.

are these stylistic recommendations or will there be problems?

In my example the job is being used like a singleton serving multiple jobs; it can be instantiated lazily and left alive. There doesn't seem to be any reason a new instance is required for each async method invocation, is there?

brandonhilkert commented 10 years ago

Both. Sucker Punch hooks into the .new method on the job class so that it can set up the queue pool. If the job class has previously been instantiated, it adds to the existing queue; otherwise, it creates a Celluloid repository to keep track of the queue and adds the job.

The benefit of this is that, for forking servers, it can fork, and only when the job is actually run do the queues get set up. In a previous version, the queues were set up in initializers, which required additional connection code in the after_fork blocks of those forking servers.

So to answer your question specifically: if you set up the global in an initializer and you're using a forking server, you'll have problems after the fork. The changes in version 1 were made to accommodate those servers, so my recommendation is to instantiate the class when you need it and not assign an instance to a global. Without seeing more code, I can't comment on how effective your global tactic will be.
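A sketch of the failure mode, assuming the Celluloid-backed 1.x behavior and a Unix fork (the names here are illustrative):

require "sucker_punch"

class AsyncJobs
  include SuckerPunch::Job

  def job_a
    puts "job_a ran in pid #{Process.pid}"
  end
end

# created pre-fork, as an initializer would be in a forking server's master
ASYNC_JOBS = AsyncJobs.new.async

fork do
  # the child inherits the constant, but the thread servicing its queue
  # died with the fork, so enqueuing through it here won't work:
  # ASYNC_JOBS.job_a

  # instantiating after the fork sets the queue up in this process instead
  AsyncJobs.new.async.job_a
  sleep 1 # give the async job a moment to run before the child exits
end
Process.wait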

bughit commented 10 years ago

I haven't tried this yet, but I'd like to get your thoughts on the following generic approach:

class DefaultWorkerPool

  include SuckerPunch::Job

  execute_block_on_receiver :perform

  def perform(*args)
    yield *args
  end

end

with the understanding that you don't rely on any closed-over objects, only the simple args passed to the block

bughit commented 10 years ago

Celluloid does not support blocks with async.

brandonhilkert commented 10 years ago

Is there a reason you don't want to use it as it's documented?

bughit commented 10 years ago

does one create a separate class every time one wants to execute a block of code synchronously?

then why would one necessarily want to be so ceremonious for async?

class WorkerPool

  include SuckerPunch::Job

  execute_block_on_receiver :perform

  def perform(*args)
    yield *args
  end

  def self.enqueue(*args, &block)
    new.async(:perform, *args, &block)
  end

end

WorkerPool.enqueue user_id do |user_id|
  u = User.find(user_id)
  whatever u
end

I'd do it this way, if it worked.

bughit commented 10 years ago

will probably work with a lambda instead of a block; I'll try it

bughit commented 10 years ago

it does

class WorkerPool

  include SuckerPunch::Job

  def perform(proc, *args)
    proc.(*args)
  end

  def self.enq(*args, &block)
    new.async(:perform, block, *args)
  end

end

WorkerPool.enq user_id do |user_id|
  u = User.find(user_id)
  whatever u
end

brandonhilkert commented 10 years ago

I'm still not sure I see the value in all that indirection just to run asynchronous code dynamically. It also seems like the code you show above would be difficult to test.

To answer your previous question: I create a separate job class for each method I run asynchronously. My code looks exactly like the README. This is in line with other background job libraries (Resque, Sidekiq, etc.) and promotes SRP.
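For example, when the work lives in an ordinary perform method, it can be exercised synchronously in a plain unit test. A sketch (GreetingJob and the choice of Minitest are mine, not from this thread):

require "sucker_punch"
require "minitest/autorun"

class GreetingJob
  include SuckerPunch::Job

  def perform(name)
    "hello, #{name}"
  end
end

class GreetingJobTest < Minitest::Test
  def test_perform_builds_greeting
    # no queue or threads involved: call #perform directly
    assert_equal "hello, bob", GreetingJob.new.perform("bob")
  end
end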

bughit commented 10 years ago

what is "all that indirection" referring to? I posted the minimal code required to enqueue a block of code for async invocation by a worker pool, it's pretty simple.

your best-practice, SRP, and testability argument can also be applied to synchronous blocks

perhaps it's time to start replacing blocks with separate functor classes

brandonhilkert commented 10 years ago

your best-practice, SRP, and testability argument can also be applied to synchronous blocks

Yup, and I'd like to believe my synchronous code is testable and follows SRP as well. I'd love to see your test for the code you wrote above.

bughit commented 10 years ago

it's not my goal to change your mind about the primary usage pattern of your lib. I found an alternative one that I like; you can ignore it if it's offensive to you, or mention it as a possibility. I think some will find utility in it.

brandonhilkert commented 10 years ago

Not offensive at all, just interested in the testability. What library are you using?