mysql connection lost from sidekiq job running on puma on heroku

ghost commented 10 years ago

I've been running a Sidekiq process embedded in a Unicorn dyno on Heroku with some success for a while now, but I'm having trouble getting this approach to work after switching to Puma (version 2.8). Puma has a clustered mode in which it can fork workers, and in this fork I spawn a Sidekiq process (same as I had done for Unicorn). However, my MySQL connection now gets dropped frequently, showing "connection dropped" and "bad file descriptor" errors. It does not appear to have anything to do with Heroku putting the dyno to sleep. What I don't understand is why spawning a Sidekiq worker from a Unicorn process would be any different from spawning one from a Puma process. Would it matter whether I did it before or after Rails was loaded? Any tips would be very welcome. Thanks, Andrew

mperham commented 10 years ago

This isn't anything I have experience with but it sounds like you need to reset your AR connection after forking the Sidekiq process, like below. Maybe someone else can chime in with more knowledge.

Sidekiq.configure_server do
  ActiveRecord::Base.establish_connection
end

ghost commented 10 years ago

Thanks. I had played around with trying to make sure the connections all got reset. Now it looks like I may not need clustered mode at all, so things might be simpler if Sidekiq is just spawned from the Puma master process. I just thought that putting it in the on_worker_boot hook would be cleaner, even though I only have one worker.

Here was my original puma.rb file that was having problems:

# config/puma.rb
# ENV values are strings, so coerce them before Puma compares min and max
threads Integer(ENV['PUMA_MIN_THREADS'] || 2), Integer(ENV['PUMA_MAX_THREADS'] || 4)
workers 1
port ENV['PORT']

on_worker_boot do
  if defined?(ActiveRecord)
    ActiveRecord::Base.connection_pool.disconnect!
  end

  if ENV['NUM_SIDEKIQ_WORKERS'].to_i > 0
    @sidekiq_pid ||= spawn("bundle exec sidekiq -c #{ENV['NUM_SIDEKIQ_WORKERS']}")
    jobs = {}
    jobs["Worker"] = @sidekiq_pid

    unless ENV['RACK_ENV'] == 'development' || !ENV['ENABLE_SIDEKIQ_NEWRELIC_PLUGIN']
      @sidekiq_monitor_pid ||= spawn("bundle exec ./script/newrelic_sidekiq_with_memory_agent #{@sidekiq_pid}")
      jobs["Sidekiq NewRelic Plugin"] = @sidekiq_monitor_pid
    end

    jobs.each do |name, pid|
      t = Thread.new {
        Process.wait(pid)
        puts "#{name} died. Bouncing puma."
        Process.kill 'QUIT', Process.pid
      }
      # Just in case
      t.abort_on_exception = true
    end
  end

  if defined?(ActiveRecord)
    if Rails.application.config.database_configuration
      config = Rails.application.config.database_configuration[Rails.env]
      config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
      config['pool']              = ENV['DB_POOL']      || 5
      ActiveRecord::Base.establish_connection(config)
    end
  end
end

Here's a simpler version without clustered mode (i.e. no workers field) that looks like it might be working so far (although it has to run for a while to be sure):

if ENV['NUM_SIDEKIQ_WORKERS'].to_i > 0
  @sidekiq_pid ||= spawn("bundle exec sidekiq -c #{ENV['NUM_SIDEKIQ_WORKERS']}")
  jobs = {}
  jobs["Worker"] = @sidekiq_pid

  unless ENV['RACK_ENV'] == 'development' || !ENV['ENABLE_SIDEKIQ_NEWRELIC_PLUGIN']
    @sidekiq_monitor_pid ||= spawn("bundle exec ./script/newrelic_sidekiq_with_memory_agent #{@sidekiq_pid}")
    jobs["Sidekiq NewRelic Plugin"] = @sidekiq_monitor_pid
  end

  jobs.each do |name, pid|
    t = Thread.new {
      Process.wait(pid)
      puts "#{name} died. Bouncing puma."
      Process.kill 'QUIT', Process.pid
    }
    # Just in case
    t.abort_on_exception = true
  end
end

# ENV values are strings, so coerce them before Puma compares min and max
threads Integer(ENV['PUMA_MIN_THREADS'] || 2), Integer(ENV['PUMA_MAX_THREADS'] || 4)
port ENV['PORT']
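The watchdog part of the config above (spawn a child, wait on it in a thread, react when it exits) can be tried on its own outside Puma. This is only a minimal sketch: `watch_child` is a hypothetical helper name, the child here is a throwaway Ruby one-liner rather than Sidekiq, and the callback just records a message instead of sending QUIT to the server.

```ruby
require "rbconfig"

# Hypothetical helper: spawns `command`, then waits for it in a background
# thread and calls `on_exit` with the name and Process::Status when it exits.
# Returns [pid, watcher_thread].
def watch_child(name, *command, &on_exit)
  pid = spawn(*command)
  watcher = Thread.new do
    _, status = Process.wait2(pid) # blocks until the child exits
    on_exit.call(name, status)
  end
  watcher.abort_on_exception = true # surface errors raised in the watcher
  [pid, watcher]
end

events = []
_pid, watcher = watch_child("Worker", RbConfig.ruby, "-e", "exit 3") do |name, status|
  # In puma.rb this is where Process.kill('QUIT', Process.pid) would go.
  events << "#{name} died with status #{status.exitstatus}"
end
watcher.join
puts events.first # prints "Worker died with status 3"
```

Using `Process.wait2` instead of `Process.wait` also hands back the exit status, which may help when working out why the child went away.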

To give credit where it's due, this work all derives from "Free Background Jobs on Heroku" by dommel.

ghost commented 10 years ago

BTW, that sidekiq monitor was something I was working on to report sidekiq memory usage to newrelic and is purely incidental (it's not even enabled on my heroku app).

ghost commented 10 years ago

Forgot to include config/initializers/database_connection.rb, which I assumed was being used to reinitialize the activerecord connections after the disconnect.

Sidekiq.configure_client do |config|
  config.redis = { size: 2 }
  Rails.application.config.after_initialize do
    ActiveRecord::Base.connection_pool.disconnect!

    ActiveSupport.on_load(:active_record) do
      if Rails.application.config.database_configuration
        config = Rails.application.config.database_configuration[Rails.env]
        config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
        config['pool']              = ENV['DB_POOL']      || 5
        ActiveRecord::Base.establish_connection(config)
      end
    end
  end
end

Sidekiq.configure_server do |config|
  config.redis = { size: 2 }
  Rails.application.config.after_initialize do
    ActiveRecord::Base.connection_pool.disconnect!

    ActiveSupport.on_load(:active_record) do
      if Rails.application.config.database_configuration
        config = Rails.application.config.database_configuration[Rails.env]
        config['reaping_frequency'] = ENV['DB_REAP_FREQ'] || 10 # seconds
        config['pool']              = 2
        ActiveRecord::Base.establish_connection(config)
      end
    end
  end
end
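The two blocks above differ only in the pool size, so the shared part could be pulled into one function. A rough sketch, assuming a hypothetical helper named `pooled_db_config` (not part of Rails or Sidekiq) that also coerces the ENV strings to integers:

```ruby
# Hypothetical helper: returns a copy of `base` (one environment's entry from
# database.yml) with reaping_frequency and pool filled in.
# `env` is anything that responds to fetch(key, default), such as ENV or a Hash.
def pooled_db_config(base, env: ENV, pool: nil)
  base.merge(
    "reaping_frequency" => Integer(env.fetch("DB_REAP_FREQ", 10)), # seconds
    "pool"              => Integer(pool || env.fetch("DB_POOL", 5))
  )
end

base = { "adapter" => "mysql2", "database" => "app_production" }
p pooled_db_config(base, env: { "DB_POOL" => "8" })
p pooled_db_config(base, env: {}, pool: 2)

# Inside the initializer this might be used as (assumption, untested here):
#   base = Rails.application.config.database_configuration[Rails.env]
#   ActiveRecord::Base.establish_connection(pooled_db_config(base, pool: 2))
```

Passing `env` in explicitly keeps the function easy to check without touching the real environment.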

mperham commented 10 years ago

I assume clustered mode tries to share memory by forking child workers, whereas non-clustered mode doesn't fork. With no fork, the file descriptors aren't shared, which would fix your problem.
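The descriptor sharing is easy to see with a plain file. The sketch below is an illustration only, not a reproduction of the MySQL failure: the parent opens a file, a forked child reads from the inherited descriptor, and the parent then finds the read position already moved. `read_after_child` is a hypothetical name, and the code needs a platform that supports `fork`.

```ruby
require "tempfile"

# Returns what the parent reads from `path` after a forked child has already
# read 5 bytes through the same inherited file descriptor.
def read_after_child(path)
  File.open(path) do |f|
    pid = fork do
      f.sysread(5) # child consumes "hello" and advances the shared offset
      exit!(0)     # leave the child without running at_exit handlers
    end
    Process.wait(pid)
    f.sysread(100) # parent continues from where the child stopped
  end
end

Tempfile.create("fd-demo") do |tmp|
  tmp.write("helloworld")
  tmp.flush
  puts read_after_child(tmp.path).inspect # prints "world"
end
```

A database socket inherited across a fork is shared in the same way, so two processes reading and writing one connection can corrupt each other's protocol stream. That is why each process is normally given its own connection after forking.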

sidekiq/sidekiq issue #1535