fiber! users 100% of the cpu

jtoy commented 12 years ago

if I run this code with the bean.fiber! line, it uses 100% of the cpu, if I comment out the bean.fiber! , it barely uses any cpu.


require 'fiber'
require 'em-synchrony'
require 'em-jack'
    EM.synchrony do
    #EM.run do
        f = Fiber.new do
          bean = EMJack::Connection.new
          bean.fiber!
          #      bean.put("hello!")
          bean.each_job do |job|
            puts job
            job_body = job.body
            job.delete
          end

        end

      f.resume
    end

I'm on beanstalkd 1.6 ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0] em-jack (0.1.5) em-synchrony (1.0.0, 0.3.0.beta.1)

dj2 commented 12 years ago

That's an interesting one. @igrigorik any ideas why using the fiber version would spin the cpu at 100%?

igrigorik commented 12 years ago

Nope. What's the output? Are there any jobs, is it spinning in a loop, etc?

jtoy commented 12 years ago

There are no jobs, I tested this with an empty beanstalkd server and just the code above. I believe the issue is from constantly calling next_tick, I'm still investigating.

jtoy commented 12 years ago

The problem seems to be in each_job and next_tick. I wrote a modified each_job and this seems to work for me, if this is a valid fix I can submit a patch:

module EMJack
  class Connection
    def each_job2(timeout = nil, &blk)
        work = Proc.new do
          has_job = false
          Fiber.new do
            job = reserve(timeout)
            has_job = true
            blk.call(job)
          end.resume
          if has_job
            EM.next_tick { work.call }          
          else
            EM.add_timer 1, work
          end
        end

      work.call
    end
  end
end

dj2 commented 12 years ago

It looks close, when you set has_job to true, verify the job isn't null (the timeout may have triggered, in which case there was in fact no job.)

igrigorik commented 12 years ago

1) Why do we need to create a new fiber per call? If you're inside of a EM::Synchrony.run you should just use Fiber.current.

2) What's the point of the extra add_timer(1)? If there is no job, block on the socket and wait until a job is available, no need to inject extra random delays.

3) Given 1+2, I think has_job is redundant.

jtoy commented 12 years ago

So calling bean.fiber! already modifies the methods to use Fiber.current. I tested this code and it works, I needed to wrap the next_tick in a fiber, I can put this code in a patch:

module EMJack
  class Connection
    def each_job2(timeout = nil, &blk)
      if (@fiberized)
        work = Proc.new do
          job = reserve(timeout)
          blk.call(job)
          EM.next_tick { Fiber.new { work.call }.resume }
        end
      else
        work = Proc.new do
          r = reserve(timeout)
          r.callback do |job|
            blk.call(job)
            EM.next_tick { work.call }
          end
          r.errback do
            EM.next_tick { work.call }
          end
        end
      end      
      work.call
    end
  end
end

igrigorik commented 12 years ago

This API will give you an "async" each_job, which may or may not be what you want. Ex: you call each_job, it schedules a timer and you keep going to your next code block. An alternative API (arguably the expected API), would be to iterate on the jobs until there are no jobs left and then proceed. In which case, EM.next_tick is redundant (yes, you would be monopolizing the reactor).

What is the regular beanstalk API around this? I'm guessing you want the same behavior.

jtoy commented 12 years ago

I can remove the next_tick, although I personally would use the next_tick version so as not to monopolize the reactor.

igrigorik / em-jack

fiber! users 100% of the cpu #14