brandonhilkert / sucker_punch

Sucker Punch is a Ruby asynchronous processing library using concurrent-ruby, heavily influenced by Sidekiq and girl_friday.
MIT License

Thread getting stuck in rails #209

Closed Farjad closed 6 years ago

Farjad commented 6 years ago

Hi,

I am not sure if this is an issue or that I am just doing something wrong.

I am querying 3 databases in my SuckerPunch worker for a rails app.

All it really does is check whether it can connect to those databases and update a hash, which is then returned from a '/health' endpoint in Rails. However, if it cannot connect to a database, the whole thread gets stuck until the connection times out. And since this happens every 10s, the Rails side of the process is stuck.

Is there something I'm missing or is this supposed to happen?

Thank you! Farjad

brandonhilkert commented 6 years ago

Can you share any code?

Farjad commented 6 years ago

Yup, so this is my worker:


```ruby
module Healthcheck
  class Job
    include SuckerPunch::Job

    def perform(response)
      begin
        check_health(response)
      rescue Exception => e
        Rails.logger.error(e)
      end

      Healthcheck::Job.perform_in(HEALTH_CHECKER_INTERVAL, response)

      ''
    end

    def check_health(response)
      response.tap do |r|
        r.last_time = Time.now
        r.racc = first?(Company)
        r.vcq = first?(QueueConfiguration)
        r.recording = first?(MediaFile)
      end
    end

    protected
    include RenderHelper

    private
    def first?(klass)
      #ActiveRecord::Base.connection_pool.with_connection do
        klass.first
      #end
      true
    rescue Exception => e
      log_exception(e, "Status 500 ERROR: Could not read first #{klass} from database.")
      false
    end
  end
end
```

brandonhilkert commented 6 years ago

If there is an error, can you provide the stack trace?

Farjad commented 6 years ago

So there isn't an error per se, it just blocks the whole process (thin, unicorn) if the database is down.

Status 500 ERROR: Could not read first MediaFile from database. {:error=>"Unable to connect: Adaptive Server is unavailable or does not exist (sdwsql01:1433)", :backtrace=>["/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/tiny_tds-1.3.0/lib/tiny_tds/client.rb:53:in `connect'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/tiny_tds-1.3.0/lib/tiny_tds/client.rb:53:in `initialize'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-sqlserver-adapter-4.1.2/lib/active_record/connection_adapters/sqlserver_adapter.rb:291:in `new'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-sqlserver-adapter-4.1.2/lib/active_record/connection_adapters/sqlserver_adapter.rb:291:in `dblib_connect'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-sqlserver-adapter-4.1.2/lib/active_record/connection_adapters/sqlserver_adapter.rb:280:in `connect'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-sqlserver-adapter-4.1.2/lib/active_record/connection_adapters/sqlserver_adapter.rb:62:in `initialize'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-sqlserver-adapter-4.1.2/lib/active_record/sqlserver_base.rb:17:in `new'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-sqlserver-adapter-4.1.2/lib/active_record/sqlserver_base.rb:17:in `sqlserver_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:435:in `new_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:445:in `checkout_new_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:416:in `acquire_connection'"]}

I actually am testing this - we have 3 databases that our application uses. Two of them do not impact the majority of the application and thus can be down.

So I added a blackhole route, and we have a 30s timeout on the connection. Until the connection times out, the whole thread is blocked: I cannot access the application, and the worker is waiting for the connection to time out.
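
For reference, one way a health check can avoid waiting out a long driver-level timeout is to probe the database port first with a short connect timeout. The following is only a minimal sketch using Ruby's standard `Socket.tcp`; the method name `port_reachable?` and the 1-second default are hypothetical, and an open TCP port does not prove the database itself is healthy.

```ruby
require "socket"

# Hypothetical helper (not part of SuckerPunch or ActiveRecord).
# Returns true if a TCP connection to host:port can be opened within
# `timeout` seconds, false if it is refused, unreachable, or times out.
def port_reachable?(host, port, timeout: 1)
  # Socket.tcp closes the socket when the block returns.
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Covers refused connections, connect timeouts, and name resolution failures.
  false
end
```

In the worker above, a call such as `port_reachable?("sdwsql01", 1433)` could run before `klass.first`, so that a blackholed host is reported as down after about a second rather than after the 30s connection timeout. Whether that is acceptable depends on how strict the health check needs to be.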

brandonhilkert commented 6 years ago

I see you're using tiny_tds. I don't use SQL Server and don't have access to one personally. It feels like it's related to connection pool access at the adapter level. You could remove all the connection-related code to confirm. Maybe the pooled connection doesn't work like it does with other Active Record adapters when re-establishing a connection.

Farjad commented 6 years ago

So I swapped the adapter to mysql2 and the same issue persists:

Status 500 ERROR: Could not read first MediaFile from database. {:error=>"Can't connect to MySQL server on 'sdwsql01' (60)", :backtrace=>["/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/mysql2-0.3.18/lib/mysql2/client.rb:70:in `connect'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/mysql2-0.3.18/lib/mysql2/client.rb:70:in `initialize'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/mysql2_adapter.rb:18:in `new'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/mysql2_adapter.rb:18:in `mysql2_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:435:in `new_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:445:in `checkout_new_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:416:in `acquire_connection'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:351:in `block in checkout'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/2.2.0/monitor.rb:211:in `mon_synchronize'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:350:in `checkout'", "/Users/fadamjee/.rbenv/versions/2.2.8/lib/ruby/gems/2.2.0/gems/activerecord-4.1.14.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:265:in `block in connection'"]}

In my testing - the database doesn't matter, I'm just trying to make sure that my rails app functions even though the suckerpunch worker may be blocked.

brandonhilkert commented 6 years ago

Gotcha. A job worker thread being blocked by a dead connection shouldn't stop the rest of the app from working, unless the rest of the app also relies on that same connection being available. I've used the library for a long time with APIs that became inaccessible, and the website continued to work well. It does sound like the rest of the app is dependent on that connection for some reason. Could it be some kind of instantiation you're not doing intentionally, something that tries to make the connection whenever a new request comes in?
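
To illustrate the distinction drawn here, the following plain-Ruby sketch (no Rails or SuckerPunch involved; the `demo` method and all variable names are hypothetical) shows that a thread blocked while holding a lock only stalls other threads that need the same lock. It is a simplified model, not a reproduction of the app in this thread.

```ruby
require "monitor"

# Hypothetical demonstration: `shared` stands in for a resource guarded by
# one lock (for example, a single connection pool); `other` is unrelated.
def demo
  shared  = Monitor.new
  other   = Monitor.new
  entered = Queue.new
  release = Queue.new

  # "Worker" thread: takes the shared lock, then blocks, much like a
  # connect call waiting for a timeout.
  worker = Thread.new do
    shared.synchronize do
      entered << :locked
      release.pop # blocks until the main thread pushes a value
    end
  end
  entered.pop # wait until the worker actually holds the lock

  # A thread using an unrelated lock finishes immediately.
  independent = Thread.new { other.synchronize { :ok } }.value

  # A thread needing the same lock cannot proceed while the worker holds it.
  dependent = Thread.new { shared.synchronize { :ok } }
  blocked = dependent.join(0.2).nil? # join returns nil when it times out

  release << :go # let the worker finish and release the lock
  worker.join

  { independent: independent, blocked_while_held: blocked, dependent: dependent.value }
end
```

The `mon_synchronize` frame in the mysql2 backtrace above suggests the new connection was being opened while a pool lock was held, so any request thread needing that same pool could wait as `dependent` does here. Whether the web requests in this app actually touch that pool is not established by the traces.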

Farjad commented 6 years ago

Yeah, so I was thinking ActiveRecord might be using the same thread across the application.

I know the database (that I am blocking) is not used on the pages I have tried.

But anyway, I'll try that and update the thread when I have something.

thanks!