thiagopradi / octopus

Database Sharding for ActiveRecord
2.53k stars 505 forks source link

Rails 5.2 clear_active_connections!: undefined method `delete' for nil:NilClass #519

Open januszm opened 5 years ago

januszm commented 5 years ago

Also: undefined method 'any?', 'clear' or 'select' for nil:NilClass UPDATE: It looks like Octopus is trying to call clear_all_connections and clear_active_connections when connections are already discarded. When with_each_healthy_shard(&:disconnect!) is executedd, the @connections and @thread_cached_conns are already nil (set by discard! method on ConnectionPool class).

Background of the story: I noticed that background jobs (custom made solution using AMQP and Beetle gem, not ActiveJob) occasionally log error message Error on shard master: undefined method 'delete' for nil:NilClass , even though they run on shard slave.

Backtrace:

activerecord-5.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:402:in `release_connection'
ar-octopus-0.10.2/lib/octopus/proxy.rb:239:in `block in with_each_healthy_shard'
ar-octopus-0.10.2/lib/octopus/proxy.rb:237:in `each'
ar-octopus-0.10.2/lib/octopus/proxy.rb:237:in `with_each_healthy_shard'
ar-octopus-0.10.2/lib/octopus/proxy.rb:158:in `clear_active_connections!'
ar-octopus-0.10.2/lib/octopus/model.rb:178:in `clear_active_connections_with_octopus!'

This error message comes from lines (~ 242):

    def with_each_healthy_shard
      shards.each do |shard_name, v|
        begin
          yield(v)
        rescue => e
          if Octopus.robust_environment?
            Octopus.logger.error "Error on shard #{shard_name}: #{e.message}"
          else
            raise
          end
        end

in Octopus::Proxy, and the code above seems to be used in methods that clear connections or query_cache.

There are 2 issues with this code:

  1. it rescues exceptions but does not provide any "hook" which could be used to report those errors with tools like Sentry, Bugsnag, Airbrake etc.
  2. the error log says almost nothing, I mean, I can suspect that it's about deleting something, like a connection ;) , but having a backtrace here would help a lot since we're rescuing the error.

To address 1., perhaps we could add something like (not a production code):

# Octopus.setup do |config|
#   config.exception_reporter = Raven.capture_exception
# def report_exception?; exception_reporter && exception_reporter.respond_to?(:call); end
rescue => e
  # report_exception can be false or something that responds to .call
   Octopus.config.exception_reporter.call(e) if Octopus.config.report_exception?
  if Octopus.robust_environment?
  ...

To address 2.:

...
if Octopus.robust_environment?
  Octopus.logger.error "Error on shard #{shard_name}: #{[e.message, *e.backtrace].join($INPUT_RECORD_SEPARATOR)}"
januszm commented 5 years ago

Similar issue:

Error on shard master: undefined method `clear' for nil:NilClass
NoMethodError(undefined method `any?' for nil:NilClass):
activerecord-5.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:423:in `block in connected?'
ruby/2.6.0/monitor.rb:230:in `mon_synchronize'
activerecord-5.2.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:423:in `connected?'
ar-octopus-0.10.2/lib/octopus/proxy.rb:73:in `safe_connection'
ar-octopus-0.10.2/lib/octopus/proxy.rb:80:in `select_connection'
ar-octopus-0.10.2/lib/octopus/proxy.rb:25:in `schema_cache'
activerecord-5.2.3/lib/active_record/model_schema.rb:466:in `load_schema!'
activerecord-5.2.3/lib/active_record/attributes.rb:234:in `load_schema!'
activerecord-5.2.3/lib/active_record/attribute_decorators.rb:51:in `load_schema!'
activerecord-5.2.3/lib/active_record/model_schema.rb:459:in `block in load_schema'
ruby-2.6.2/lib/ruby/2.6.0/monitor.rb:230:in `mon_synchronize'
januszm commented 5 years ago

When it happens, the following variables are set to nil on ActiveRecord::ConnectionAdapters::ConnectionPool (variable v in with_each_healthy_shard method):

@connections = nil
@thread_cached_conns = nil

instead of:

@connections=[...]
@thread_cached_conns=#<Concurrent::Map:... entries=0 default_proc=nil>,

It means that when Octopus tries to clear_all_connections or clear_active_connections these connections were already discarded. See: https://github.com/rails/rails/blob/v5.2.3/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L469 (sets these variables to nil)

januszm commented 5 years ago

How about:

def with_each_healthy_shard
  shards.each do |shard_name, v|
    next if v.instance_variable_get(:@thread_cached_conns).nil?
    begin
      yield(v)

?

redox commented 5 years ago

Running into similar issues, did someone figured out a patch/fork?

januszm commented 4 years ago

I'm curious if this: https://github.com/rails/rails/pull/36473 improves the situation. I didn't have a chance to test it though.

wjessop commented 4 years ago

I've just hit this issue after upgrading someone to Rails 5.2:

Message from application: undefined method `clear' for nil:NilClass (NoMethodError)
/.bundle/gems/activerecord-5.2.4.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:780:in `block in with_new_connections_blocked'
/usr/local/lib/ruby/2.4.0/monitor.rb:214:in `mon_synchronize'
/.bundle/gems/activerecord-5.2.4.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:776:in `ensure in with_new_connections_blocked'
/.bundle/gems/activerecord-5.2.4.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:793:in `with_new_connections_blocked'
/.bundle/gems/activerecord-5.2.4.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:695:in `with_exclusively_acquired_all_connections'
/.bundle/gems/activerecord-5.2.4.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:448:in `disconnect'
/.bundle/gems/activerecord-5.2.4.3/lib/active_record/connection_adapters/abstract/connection_pool.rb:470:in `disconnect!'
/.bundle/gems/ar-octopus-0.10.2/lib/octopus/proxy.rb:239:in `block in with_each_healthy_shard'
/.bundle/gems/ar-octopus-0.10.2/lib/octopus/proxy.rb:237:in `each'
/.bundle/gems/ar-octopus-0.10.2/lib/octopus/proxy.rb:237:in `with_each_healthy_shard'
/.bundle/gems/ar-octopus-0.10.2/lib/octopus/proxy.rb:162:in `clear_all_connections!'
/.bundle/gems/ar-octopus-0.10.2/lib/octopus/model.rb:186:in `clear_all_connections_with_octopus!'
/usr/local/bundle/gems/passenger-enterprise-server-5.2.3/src/ruby_supportlib/phusion_passenger/loader_shared_helpers.rb:439:in `before_handling_requests'
/usr/local/bundle/gems/passenger-enterprise-server-5.2.3/src/helper-scripts/rack-preloader.rb:118:in `negotiate_spawn_command'
/usr/local/bundle/gems/passenger-enterprise-server-5.2.3/src/helper-scripts/rack-preloader.rb:142:in `<module:App>'
/usr/local/bundle/gems/passenger-enterprise-server-5.2.3/src/helper-scripts/rack-preloader.rb:14:in `<module:PhusionPassenger>'
/usr/local/bundle/gems/passenger-enterprise-server-5.2.3/src/helper-scripts/rack-preloader.rb:13:in `<main>'

Did anyone have any luck with the patch? Did that ever get added backported to 5.2?

edit Unfortunately rails/rails#36473 is included in Rails 5.2 where we are seeing the issue so it doens't look like it helps here.