newrelic / newrelic-ruby-agent

New Relic RPM Ruby Agent
https://docs.newrelic.com/docs/apm/agents/ruby-agent/getting-started/introduction-new-relic-ruby/
Apache License 2.0
1.19k stars 599 forks source link

Delayed job deadlocks in development #711

Closed jdelStrother closed 1 year ago

jdelStrother commented 3 years ago

Description

When running rake jobs:work in development mode with a newrelic-enabled app, the main thread & the newrelic sampling thread eventually deadlock each other. No more jobs get run, and the process becomes unable to exit on ctrl-c.

Expected Behavior

No deadlocks! They're not usually expected.

Troubleshooting or NR Diag results

Provide any other relevant log data.

Here's the main thread following a deadlock:

lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `enter'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `block in mon_enter'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `mon_enter'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:22:in `block in synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:597:in `checkin'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:551:in `block (3 levels) in clear_reloadable_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:548:in `each'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:548:in `block (2 levels) in clear_reloadable_connections'
lib/ruby/2.7.0/monitor.rb:202:in `synchronize'
lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:547:in `block in clear_reloadable_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:749:in `block in with_exclusively_acquired_all_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:824:in `with_new_connections_blocked'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:747:in `with_exclusively_acquired_all_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:546:in `clear_reloadable_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:570:in `clear_reloadable_connections!'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:1093:in `each'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:1093:in `clear_reloadable_connections!'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_handling.rb:351:in `clear_reloadable_connections!'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/railtie.rb:239:in `block (3 levels) in <class:Railtie>'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:427:in `instance_exec'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:427:in `block in make_lambda'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:198:in `block (2 levels) in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:604:in `block (2 levels) in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:603:in `catch'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:603:in `block in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:199:in `block in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:512:in `block in invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:512:in `each'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:512:in `invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:105:in `run_callbacks'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/reloader.rb:120:in `class_unload!'
lib/ruby/gems/2.7.0/gems/railties-6.1.4/lib/rails/application/finisher.rb:244:in `block (2 levels) in <module:Finisher>'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:427:in `instance_exec'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:427:in `block in make_lambda'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:198:in `block (2 levels) in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:604:in `block (2 levels) in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:603:in `catch'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:603:in `block in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:199:in `block in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:512:in `block in invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:512:in `each'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:512:in `invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/callbacks.rb:105:in `run_callbacks'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/execution_wrapper.rb:111:in `run!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/reloader.rb:114:in `run!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/reloader.rb:53:in `block (2 levels) in reload!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/reloader.rb:52:in `tap'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/reloader.rb:52:in `block in reload!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/execution_wrapper.rb:88:in `wrap'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/reloader.rb:51:in `reload!'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:330:in `reload!'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:187:in `block (2 levels) in start'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:172:in `loop'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:172:in `block in start'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/tagged_logging.rb:99:in `block in tagged'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/tagged_logging.rb:37:in `tagged'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/tagged_logging.rb:99:in `tagged'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block (2 levels) in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/plugins/clear_locks.rb:7:in `block (2 levels) in <class:ClearLocks>'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block (2 levels) in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:61:in `block in initialize'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:66:in `execute'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:40:in `run_callbacks'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:171:in `start'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/tasks.rb:9:in `block (2 levels) in <main>'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:281:in `block in execute'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:281:in `each'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:281:in `execute'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:219:in `block in invoke_with_call_chain'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:199:in `synchronize'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:199:in `invoke_with_call_chain'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/task.rb:188:in `invoke'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:160:in `invoke_task'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:116:in `block (2 levels) in top_level'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:116:in `each'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:116:in `block in top_level'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:125:in `run_with_threads'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:110:in `top_level'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:83:in `block in run'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:186:in `standard_exception_handling'
lib/ruby/gems/2.7.0/gems/rake-13.0.3/lib/rake/application.rb:80:in `run'
bin/rake:6:in `<main>'

and NewRelic's agent thread:

lib/ruby/2.7.0/monitor.rb:108:in `sleep'
lib/ruby/2.7.0/monitor.rb:108:in `wait'
lib/ruby/2.7.0/monitor.rb:108:in `wait_for_cond'
lib/ruby/2.7.0/monitor.rb:108:in `wait'
lib/ruby/2.7.0/monitor.rb:116:in `wait_while'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/share_lock.rb:220:in `wait_for'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/share_lock.rb:190:in `block in yield_shares'
lib/ruby/2.7.0/monitor.rb:202:in `synchronize'
lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/share_lock.rb:189:in `yield_shares'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:205:in `block in execute'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract_adapter.rb:696:in `block (2 levels) in log'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:26:in `block (2 levels) in synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract_adapter.rb:695:in `block in log'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract_adapter.rb:687:in `log'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:204:in `execute'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/mysql/database_statements.rb:50:in `execute'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:215:in `execute_and_free'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/mysql/database_statements.rb:55:in `exec_query'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/database_statements.rb:532:in `select'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/database_statements.rb:69:in `select_all'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/abstract/query_cache.rb:103:in `select_all'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/connection_adapters/mysql/database_statements.rb:12:in `select_all'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/relation/calculations.rb:308:in `block in execute_simple_calculation'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/relation.rb:861:in `skip_query_cache_if_necessary'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/relation/calculations.rb:308:in `execute_simple_calculation'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/relation/calculations.rb:269:in `perform_calculation'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/relation/calculations.rb:144:in `calculate'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/instrumentation/active_record_prepend.rb:85:in `block in calculate'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent.rb:681:in `with_database_metric_name'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/instrumentation/active_record_prepend.rb:84:in `calculate'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4/lib/active_record/relation/calculations.rb:51:in `count'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/samplers/delayed_job_sampler.rb:56:in `count'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger_silence.rb:18:in `block in silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger_thread_safe_level.rb:52:in `log_at'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger_silence.rb:18:in `silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger.rb:64:in `block (3 levels) in broadcast'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger_silence.rb:18:in `block in silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger_thread_safe_level.rb:52:in `log_at'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger_silence.rb:18:in `silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4/lib/active_support/logger.rb:62:in `block (2 levels) in broadcast'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/samplers/delayed_job_sampler.rb:45:in `failed_jobs'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/samplers/delayed_job_sampler.rb:65:in `poll'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/sampler_collection.rb:41:in `block in poll_samplers'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/sampler_collection.rb:39:in `delete_if'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/sampler_collection.rb:39:in `poll_samplers'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/sampler_collection.rb:13:in `block in initialize'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_listener.rb:40:in `block in notify'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_listener.rb:38:in `each'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_listener.rb:38:in `notify'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:1143:in `transmit_data'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:619:in `block in create_and_run_event_loop'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_loop.rb:148:in `block in dispatch_event'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_loop.rb:146:in `each'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_loop.rb:146:in `dispatch_event'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_loop.rb:111:in `run_once'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/event_loop.rb:99:in `run'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:641:in `create_and_run_event_loop'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:701:in `block (2 levels) in deferred_work!'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent.rb:501:in `disable_all_tracing'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:698:in `block in deferred_work!'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:678:in `catch_errors'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:697:in `deferred_work!'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/agent.rb:724:in `block in start_worker_thread'
lib/ruby/gems/2.7.0/gems/newrelic_rpm-7.1.0/lib/new_relic/agent/threading/agent_thread.rb:14:in `block in create'

Steps to Reproduce

I found this helpful for narrowing down the problem:

NewRelic::Agent::Samplers::DelayedJobSampler.prepend(
  Module.new {
    def count(query)
      puts "#{Time.now} Fetching count #{query}"
      t = nil
      Rails.logger.silence do
        t = super
      end
      puts "#{Time.now} Count: #{t}"
      t
    rescue Exception => e
      puts e.inspect
      raise
    end
  }
)

NewRelic::Agent::EventLoop.prepend(
  Module.new {
    def fire_every(interval, event)
      interval = interval / 10
      puts "Firing event #{event} every #{interval} seconds."
      super(interval, event)
    end
  }
)

which logs before and after the DJ sampling occurs, and makes the sampling more frequent so you don't have to wait hours for it to occur. When the deadlock occurs you'll see something like

2021-07-01 11:55:30 +0100 Fetching count failed_at is not NULL
2021-07-01 11:55:30 +0100 Count: 4
2021-07-01 11:55:30 +0100 Fetching count locked_by is not NULL

in the log, followed by no other output.

Your Environment

Newrelic agent 7.1.0 Ruby 2.7.3 Rails 6.1.4 macOS 11.4

Additional context

I wonder if the DJ sampling needs wrapping in Rails.application.executor, but not totally sure.

tannalynn commented 3 years ago

hello @jdelStrother Thanks for letting us know about this issue.

Following your instructions and using the code provided, I was able to replicate some weird behavior, so this does appear to be something we'll want to take a closer look at.

When I was running it, it would output

Firing event report_data every 6 seconds.
Firing event report_event_data every 6 seconds.
Firing event reset_log_once_keys every 6.0 seconds.
2021-07-02 12:49:27 -0700 Fetching count failed_at is not NULL
2021-07-02 12:49:27 -0700 Count: 0
2021-07-02 12:49:27 -0700 Fetching count locked_by is not NULL
2021-07-02 12:49:27 -0700 Count: 0
2021-07-02 12:49:33 -0700 Fetching count failed_at is not NULL

and then be deadlocked and unable to ctrl-c out of it.

This appears to be a long standing issue, as I also tried with some older agent versions and saw the same behavior. This was surprising to me since we haven't received any other reports of this issue.

I'll put this in our next bug smash, so that when we get started on the bug smash we'll take a closer look at solving this issue. Thanks for letting us know!

kaylareopelle commented 2 years ago

Hi @jdelStrother, @tannalynn and I took a look at this today.

We're still narrowing down the cause of the issue, however, we noticed in our playground environment that the deadlock went away when updating our configuration to disable the DelayedJob sampler:

# newrelic.yml 

  disable_delayed_job_sampler: true

This is a potential temporary workaround until we identify the root cause. Thanks for your patience!

kaylareopelle commented 2 years ago

Hi @jdelStrother, we may have a fix for the deadlock issue on this branch. Feel free to update your Gemfile with the following if you'd like to see if it helps:

gem 'newrelic_rpm', :git => 'https://github.com/newrelic/newrelic-ruby-agent.git', :branch => 'bugfix/delayed-job-deadlocks'

jdelStrother commented 2 years ago

Sorry for the slow reply, I've been away for a bit. I've been running this for 30 minutes or so which previously has been more than enough to trigger the bug with my hack to increase the sampling rate, and it seems fixed 🎉

Will continue running it tomorrow and let you know if I manage to get a deadlock

jdelStrother commented 2 years ago

Ugh, spoke too soon. I left it running overnight & got a deadlock after 3 hours or so. 😞

jdelStrother commented 2 years ago

... similar backtraces again:

Main thread:

lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `enter'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `block in mon_enter'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `mon_enter'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:22:in `block in synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:597:in `checkin'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:551:in `block (3 levels) in clear_reloadable_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:548:in `each'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:548:in `block (2 levels) in clear_reloadable_connections'
lib/ruby/2.7.0/monitor.rb:202:in `synchronize'
lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:547:in `block in clear_reloadable_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:749:in `block in with_exclusively_acquired_all_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:824:in `with_new_connections_blocked'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:747:in `with_exclusively_acquired_all_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:546:in `clear_reloadable_connections'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:570:in `clear_reloadable_connections!'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:1093:in `each'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:1093:in `clear_reloadable_connections!'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_handling.rb:351:in `clear_reloadable_connections!'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/railtie.rb:239:in `block (3 levels) in <class:Railtie>'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `instance_exec'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `block in make_lambda'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:198:in `block (2 levels) in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:604:in `block (2 levels) in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `catch'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `block in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:199:in `block in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `block in invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `each'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:105:in `run_callbacks'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:120:in `class_unload!'
lib/ruby/gems/2.7.0/gems/railties-6.1.4.1/lib/rails/application/finisher.rb:244:in `block (2 levels) in <module:Finisher>'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `instance_exec'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `block in make_lambda'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:198:in `block (2 levels) in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:604:in `block (2 levels) in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `catch'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `block in default_terminator'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:199:in `block in halting'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `block in invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `each'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `invoke_before'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:105:in `run_callbacks'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/execution_wrapper.rb:111:in `run!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:114:in `run!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:53:in `block (2 levels) in reload!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:52:in `tap'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:52:in `block in reload!'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/execution_wrapper.rb:88:in `wrap'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:51:in `reload!'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:330:in `reload!'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:187:in `block (2 levels) in start'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:172:in `loop'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:172:in `block in start'
RAILS_ROOT/lib/delayed_job/logging.rb:42:in `block (3 levels) in <class:TaggedLogging>'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/tagged_logging.rb:99:in `block in tagged'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/tagged_logging.rb:37:in `tagged'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/tagged_logging.rb:99:in `tagged'
RAILS_ROOT/lib/delayed_job/logging.rb:41:in `block (2 levels) in <class:TaggedLogging>'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block (2 levels) in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/plugins/clear_locks.rb:7:in `block (2 levels) in <class:ClearLocks>'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block (2 levels) in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:61:in `block in initialize'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block in add'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:66:in `execute'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:40:in `run_callbacks'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:171:in `start'
lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/tasks.rb:9:in `block (2 levels) in <main>'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `block in execute'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `each'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `execute'
lib/ruby/gems/2.7.0/gems/sentry-ruby-core-4.7.3/lib/sentry/rake.rb:23:in `block in execute'
lib/ruby/gems/2.7.0/gems/sentry-ruby-core-4.7.3/lib/sentry/hub.rb:158:in `with_background_worker_disabled'
lib/ruby/gems/2.7.0/gems/sentry-ruby-core-4.7.3/lib/sentry/rake.rb:22:in `execute'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:219:in `block in invoke_with_call_chain'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `synchronize'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `invoke_with_call_chain'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:188:in `invoke'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:160:in `invoke_task'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block (2 levels) in top_level'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `each'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block in top_level'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:125:in `run_with_threads'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:110:in `top_level'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:83:in `block in run'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:186:in `standard_exception_handling'
lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:80:in `run'
    bin/rake:6:in `<main>'

Agent thread:

lib/ruby/2.7.0/monitor.rb:108:in `sleep'
lib/ruby/2.7.0/monitor.rb:108:in `wait'
lib/ruby/2.7.0/monitor.rb:108:in `wait_for_cond'
lib/ruby/2.7.0/monitor.rb:108:in `wait'
lib/ruby/2.7.0/monitor.rb:116:in `wait_while'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:220:in `wait_for'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:190:in `block in yield_shares'
lib/ruby/2.7.0/monitor.rb:202:in `synchronize'
lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:189:in `yield_shares'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:205:in `block in execute'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:696:in `block (2 levels) in log'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:26:in `block (2 levels) in synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:695:in `block in log'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/notifications/instrumenter.rb:24:in `instrument'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:687:in `log'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:204:in `execute'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/mysql/database_statements.rb:50:in `execute'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:215:in `execute_and_free'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/mysql/database_statements.rb:55:in `exec_query'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:532:in `select'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:69:in `select_all'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/query_cache.rb:103:in `select_all'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/mysql/database_statements.rb:12:in `select_all'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:308:in `block in execute_simple_calculation'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation.rb:861:in `skip_query_cache_if_necessary'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:308:in `execute_simple_calculation'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:269:in `perform_calculation'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:144:in `calculate'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/instrumentation/active_record_prepend.rb:85:in `block in calculate'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent.rb:682:in `with_database_metric_name'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/instrumentation/active_record_prepend.rb:84:in `calculate'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:51:in `count'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/samplers/delayed_job_sampler.rb:56:in `count'
RAILS_ROOT/lib/delayed_job_hacks.rb:45:in `block in count'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `block in silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_thread_safe_level.rb:52:in `log_at'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger.rb:64:in `block (3 levels) in broadcast'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `block in silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_thread_safe_level.rb:52:in `log_at'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `silence'
lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger.rb:62:in `block (2 levels) in broadcast'
RAILS_ROOT/lib/delayed_job_hacks.rb:44:in `count'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/samplers/delayed_job_sampler.rb:45:in `failed_jobs'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/samplers/delayed_job_sampler.rb:67:in `block in poll'
lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:462:in `with_connection'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/samplers/delayed_job_sampler.rb:66:in `poll'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/sampler_collection.rb:41:in `block in poll_samplers'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/sampler_collection.rb:39:in `delete_if'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/sampler_collection.rb:39:in `poll_samplers'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/sampler_collection.rb:13:in `block in initialize'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_listener.rb:40:in `block in notify'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_listener.rb:38:in `each'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_listener.rb:38:in `notify'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:1143:in `transmit_data'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:619:in `block in create_and_run_event_loop'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_loop.rb:148:in `block in dispatch_event'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_loop.rb:146:in `each'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_loop.rb:146:in `dispatch_event'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_loop.rb:111:in `run_once'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/event_loop.rb:99:in `run'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:641:in `create_and_run_event_loop'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:701:in `block (2 levels) in deferred_work!'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent.rb:502:in `disable_all_tracing'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:698:in `block in deferred_work!'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:678:in `catch_errors'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:697:in `deferred_work!'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/agent.rb:724:in `block in start_worker_thread'
lib/ruby/gems/2.7.0/bundler/gems/newrelic-ruby-agent-07ee4cc80f48/lib/new_relic/agent/threading/agent_thread.rb:14:in `block in create'
kaylareopelle commented 2 years ago

Thank you, @jdelStrother for running a thorough test on the branch! We'll get back to work on a solution. Stay tuned!

kaylareopelle commented 2 years ago

@jdelStrother - There are references in the latest stacktrace to some files directly within RAILS_ROOT. If you do not feel comfortable sharing the code, could you share some context about what's happening in the following areas:

Another new element I see is sentry-ruby-core. Does the application use any custom configuration for sentry? Does it use the integrations for sentry-rails or sentry-delayed-job? All three of these gems may modify the same classes newrelic_rpm modifies, so it could be helpful to see the ancestors chain to peek at the order of operations.

Many thanks for your collaboration!

kaylareopelle commented 2 years ago

Two more questions/ideas:

jdelStrother commented 2 years ago

There are references in the latest stacktrace to some files directly within RAILS_ROOT [...] Another new element I see is sentry-ruby-core

I've disabled some of the RAILS_ROOT code, and removed Sentry, but was still able to reproduce the deadlock:

main thread ``` lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `enter' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `block in mon_enter' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:17:in `mon_enter' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:22:in `block in synchronize' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:597:in `checkin' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:551:in `block (3 levels) in clear_reloadable_connections' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:548:in `each' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:548:in `block (2 levels) in clear_reloadable_connections' lib/ruby/2.7.0/monitor.rb:202:in `synchronize' lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:547:in `block in clear_reloadable_connections' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:749:in `block in with_exclusively_acquired_all_connections' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:824:in `with_new_connections_blocked' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:747:in `with_exclusively_acquired_all_connections' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:546:in `clear_reloadable_connections' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:570:in `clear_reloadable_connections!' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:1093:in `each' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/connection_pool.rb:1093:in `clear_reloadable_connections!' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_handling.rb:351:in `clear_reloadable_connections!' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/railtie.rb:239:in `block (3 levels) in ' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `instance_exec' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `block in make_lambda' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:198:in `block (2 levels) in halting' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:604:in `block (2 levels) in default_terminator' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `catch' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `block in default_terminator' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:199:in `block in halting' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `block in invoke_before' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `each' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `invoke_before' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:105:in `run_callbacks' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:120:in `class_unload!' lib/ruby/gems/2.7.0/gems/railties-6.1.4.1/lib/rails/application/finisher.rb:244:in `block (2 levels) in ' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `instance_exec' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:427:in `block in make_lambda' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:198:in `block (2 levels) in halting' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:604:in `block (2 levels) in default_terminator' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `catch' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:603:in `block in default_terminator' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:199:in `block in halting' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `block in invoke_before' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `each' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:512:in `invoke_before' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/callbacks.rb:105:in `run_callbacks' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/execution_wrapper.rb:111:in `run!' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:114:in `run!' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:53:in `block (2 levels) in reload!' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:52:in `tap' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:52:in `block in reload!' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/execution_wrapper.rb:88:in `wrap' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/reloader.rb:51:in `reload!' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:330:in `reload!' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:187:in `block (2 levels) in start' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:172:in `loop' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:172:in `block in start' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/plugins/clear_locks.rb:7:in `block (2 levels) in ' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block (2 levels) in add' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:61:in `block in initialize' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:79:in `block in add' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:66:in `execute' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/lifecycle.rb:40:in `run_callbacks' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/worker.rb:171:in `start' lib/ruby/gems/2.7.0/gems/delayed_job-4.1.9/lib/delayed/tasks.rb:9:in `block (2 levels) in
' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `block in execute' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `each' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:281:in `execute' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:219:in `block in invoke_with_call_chain' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `synchronize' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:199:in `invoke_with_call_chain' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/task.rb:188:in `invoke' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:160:in `invoke_task' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block (2 levels) in top_level' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `each' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:116:in `block in top_level' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:125:in `run_with_threads' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:110:in `top_level' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:83:in `block in run' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:186:in `standard_exception_handling' lib/ruby/gems/2.7.0/gems/rake-13.0.6/lib/rake/application.rb:80:in `run' lib/ruby/gems/2.7.0/gems/rake-13.0.6/exe/rake:27:in `' bin/rake:21:in `load' bin/rake:21:in `
' ```
agent thread ``` lib/ruby/2.7.0/monitor.rb:108:in `sleep' lib/ruby/2.7.0/monitor.rb:108:in `wait' lib/ruby/2.7.0/monitor.rb:108:in `wait_for_cond' lib/ruby/2.7.0/monitor.rb:108:in `wait' lib/ruby/2.7.0/monitor.rb:116:in `wait_while' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:220:in `wait_for' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:190:in `block in yield_shares' lib/ruby/2.7.0/monitor.rb:202:in `synchronize' lib/ruby/2.7.0/monitor.rb:202:in `mon_synchronize' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/share_lock.rb:189:in `yield_shares' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:205:in `block in execute' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:696:in `block (2 levels) in log' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:26:in `block (2 levels) in synchronize' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `handle_interrupt' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:25:in `block in synchronize' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `handle_interrupt' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/concurrency/load_interlock_aware_monitor.rb:21:in `synchronize' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:695:in `block in log' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/notifications/instrumenter.rb:24:in `instrument' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_adapter.rb:687:in `log' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:204:in `execute' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/mysql/database_statements.rb:50:in `execute' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract_mysql_adapter.rb:215:in `execute_and_free' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/mysql/database_statements.rb:55:in `exec_query' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:532:in `select' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/database_statements.rb:69:in `select_all' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/abstract/query_cache.rb:103:in `select_all' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/connection_adapters/mysql/database_statements.rb:12:in `select_all' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:308:in `block in execute_simple_calculation' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation.rb:861:in `skip_query_cache_if_necessary' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:308:in `execute_simple_calculation' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:269:in `perform_calculation' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:144:in `calculate' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/instrumentation/active_record_prepend.rb:85:in `block in calculate' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent.rb:682:in `with_database_metric_name' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/instrumentation/active_record_prepend.rb:84:in `calculate' lib/ruby/gems/2.7.0/gems/activerecord-6.1.4.1/lib/active_record/relation/calculations.rb:51:in `count' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/samplers/delayed_job_sampler.rb:56:in `count' RAILS_ROOT/lib/delayed_job_hacks.rb:16:in `block in count' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `block in silence' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_thread_safe_level.rb:52:in `log_at' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `silence' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger.rb:64:in `block (3 levels) in broadcast' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `block in silence' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_thread_safe_level.rb:52:in `log_at' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger_silence.rb:18:in `silence' lib/ruby/gems/2.7.0/gems/activesupport-6.1.4.1/lib/active_support/logger.rb:62:in `block (2 levels) in broadcast' RAILS_ROOT/lib/delayed_job_hacks.rb:15:in `count' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/samplers/delayed_job_sampler.rb:45:in `failed_jobs' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/samplers/delayed_job_sampler.rb:65:in `poll' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/sampler_collection.rb:41:in `block in poll_samplers' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/sampler_collection.rb:39:in `delete_if' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/sampler_collection.rb:39:in `poll_samplers' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/sampler_collection.rb:13:in `block in initialize' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_listener.rb:40:in `block in notify' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_listener.rb:38:in `each' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_listener.rb:38:in `notify' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:1143:in `transmit_data' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:619:in `block in create_and_run_event_loop' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_loop.rb:148:in `block in dispatch_event' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_loop.rb:146:in `each' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_loop.rb:146:in `dispatch_event' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_loop.rb:111:in `run_once' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/event_loop.rb:99:in `run' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:641:in `create_and_run_event_loop' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:701:in `block (2 levels) in deferred_work!' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent.rb:502:in `disable_all_tracing' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:698:in `block in deferred_work!' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:678:in `catch_errors' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:697:in `deferred_work!' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/agent.rb:724:in `block in start_worker_thread' lib/ruby/gems/2.7.0/gems/newrelic_rpm-8.0.0/lib/new_relic/agent/threading/agent_thread.rb:14:in `block in create' ```

delayed_job_hacks.rb just contains the hacks to try & speed up the deadlock:

NewRelic::Agent::Samplers::DelayedJobSampler.prepend(
  Module.new {
    def count(query)
      puts "#{Time.now} Fetching count #{query.inspect}..."
      t = nil
      Rails.logger.silence do
        t = super
      end
      puts "#{Time.now} Count: #{t}"
      t
    rescue Exception => e # rubocop:disable Lint/RescueException
      puts e.inspect
      raise
    end
  }
)

NewRelic::Agent::EventLoop.prepend(
  Module.new {
    def fire_every(interval, event)
      puts "Firing event #{event} every #{interval / 10} seconds."
      super(interval / 10, event)
    end
  }
)

you mention this is occurring in development mode. Does that refer to your development environment/local machine or something else? Do issues come up in other environments?

I only see it for RAILS_ENV=development, presumably because ActiveSupport::Dependencies.interlock is only active when config.cache_class=true.


I created a bare-bones app to try & reduce the problem further: https://github.com/jdelStrother/newrelic-dj-deadlocks The license key in newrelic.yml needs replacing with a real key[^1], then running rake jobs:work should hopefully eventually hit the deadlock. The deadlock takes significantly longer to reproduce with the bugfix/delayed-job-deadlocks branch compared to 8.0.0, but I still hit it occasionally. In the test app (which speeds up sampling 100x) it took maybe half an hour to reach.

[^1]: is there a test key that can be used to sample locally but not necessarily report to NewRelic?

kaylareopelle commented 2 years ago

Hi @jdelStrother, thanks for running those tests and sharing the playground! My apologies for the delayed response. I hoped to respond with a fix. Instead, here's an update:

We've narrowed down the source of the deadlock within the gem to this LOC:

# lib/new_relic/agent/delayed_job_sampler.rb
       def count(query)
          if ::ActiveRecord::VERSION::MAJOR.to_i < 4
            ::Delayed::Job.count(:conditions => query)
          else
=>          ::Delayed::Job.where(query).count
          end
        end

We're investigating ActiveSupport::Dependencies.interlock, Rails.application.executor.wrap and the changes made by config.cache_class = true. It seems like the class caching setting has a lot to do with this error. We've tried various combinations of wrapping DelayedJobSampler#poll with some of the suggestions made in this issue, with no avail.

Until we track this element down, we'd like to include the changes in the bugfix/delayed-job-deadlocks branch in our upcoming 8.1.0 release to slow down the deadlock. Ultimately, we hope to find something that eliminates this issue altogether.

Thanks for your continued engagement and support!

angelatan2 commented 2 years ago

Hello @jdelStrother, thank you for helping by providing a bare-bones app and log information for this issue. We truly appreciate it. We tried to resolve this issue for 8.1.0, but unfortunately, we were unable to do so. We will return to this issue when we can pick it up again. If it's becoming pressing, I encourage you to file an issue with New Relic Support. Thank you for your patience.

kaylareopelle commented 1 year ago

Closing this issue for now since it only happens in development and a fix to reduce the amount of time until a deadlock has been shipped. Please re-open if this comes up in a non-development environment.

hedgeye-fslezak commented 1 year ago

Hi - we’re in the midst of setting up New Relic on our environment and one of our applications in production appears to have a very similar issue as described above in this ticket on agent 8.8.0. We are running DelayedJob via rake jobs:work in a Docker container and will experience a deadlock generally within a few minutes (!) so long as New Relic is active.

Interestingly, we are also not seeing any of the successful jobs actually get submitted to the New Relic dashboard for the app. As seen above, there’s very little in logs, but our rb_backtrace() for the main thread looks very similar to what’s been reported already.

tannalynn commented 1 year ago

@hedgeye-fslezak Thanks for letting us know that you're encountering a similar issue to this one. For this particular issue, the deadlock was only seen in development environments, but was not seen in production due to differences in settings between prod/dev environments. Right now, I'm not sure if it's the same issue as this or just one that looks similar. One way this issue was able to be prevented was to disable the delayed job samplers (which caused the problem) by adding the following to the newrelic.yml:

# newrelic.yml 

  disable_delayed_job_sampler: true

If this is the same issue, setting that config should prevent the deadlock from occurring.

One thing that's interesting, is that I don't believe we saw the same behavior you describe about the successful jobs not appearing on the UI. Are you using infinite tracing? In 8.9.0 we released a bugfix for infinite tracing that would prevent data from being sent to the new relic servers. If you aren't using infinite tracing, that bugfix wouldn't be relevant to the situation though.

One thing to note is that data related to a transaction will not be sent until the transaction is complete. I have seen jobs in the past structured in such a way that it was just one transaction running indefinitely, which caused the agent to be unable to send the transaction data to the new relic servers. I'm not sure if that is relevant to your particular situation since it isn't common, but I wanted to bring it up because I have seen that before from someone using delayed job.

Enabling debug level logging for the agent and reviewing the logs may be able to provide additional information about what is going on behind the scenes in the agent.

# newrelic.yml 

  log_level: debug

One of the first things I'd check in the agent logs would be to look for any errors or warnings, and then lines related to communicating to the new relic server to see if there are any issues in the response from the collector, these tend to look like this

DEBUG : Sending TransactionEvent data to New Relic Service
DEBUG : Sampled 4 / 4 (100.0 %) requests this cycle, 59 / 59 (100.0 %) since startup
DEBUG : Sending 4 items to analytic_event_data
DEBUG : Sending request to <collector url> with POST
DEBUG : Received response, status: 202, encoding: ''

Hopefully some of this helps, please let me know if you have any questions.

hedgeye-fslezak commented 1 year ago

Hi! Thank you for the response, and the information about that config flag. We will be testing that more this week, rather than our current heavy-handed solution of just disabling the agent entirely on our worker nodes.

On our environment (staging), we did manage to get some information out of the debug log, most notably that the deadlock appears to happen on the 60_second_harvest event:

[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : EventLoop: Dispatching event '5_second_harvest' with 4 callback(s).
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sending TransactionEvent data to New Relic Service
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sampled 0 / 0 (NaN %) requests this cycle, 0 / 0 (NaN %) since startup
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sending CustomEvent data to New Relic Service
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sending ErrorEvent data to New Relic Service
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sending LogEvent data to New Relic Service
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sending 7 items to log_event_data
[2022-09-19 16:30:24 -0400 b91b1743c2ff (135)] DEBUG : Sending request to <redacted> with POST
[2022-09-19 16:30:25 -0400 b91b1743c2ff (135)] DEBUG : Received response, status: 202, encoding: ''

[2022-09-19 16:30:29 -0400 b91b1743c2ff (135)] DEBUG : EventLoop: Dispatching event '60_second_harvest' with 1 callback(s).
[2022-09-19 16:30:29 -0400 b91b1743c2ff (135)] DEBUG : Sending data to New Relic Service

We did have a couple earlier runs of 60_second_harvest succeed in this case, which seemed to actually report what we expected:

[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : EventLoop: Dispatching event '60_second_harvest' with 1 callback(s).
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending data to New Relic Service
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending 51 items to metric_data
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending request to [redacted]  with POST
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Received response, status: 202, encoding: ''
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending 7 items to log_event_data
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending request to [redacted]  with POST
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Received response, status: 202, encoding: ''
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending request to [redacted] with POST
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Received response, status: 200, encoding: ''
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Received get_agent_commands = []
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : No thread profiles with data found to send.
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : EventLoop: Dispatching event 'reset_log_once_keys' with 1 callback(s).
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : EventLoop: Dispatching event '5_second_harvest' with 4 callback(s).
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending TransactionEvent data to New Relic Service
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sampled 0 / 0 (NaN %) requests this cycle, 0 / 0 (NaN %) since startup
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending CustomEvent data to New Relic Service
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending ErrorEvent data to New Relic Service
[2022-09-19 16:29:29 -0400 b91b1743c2ff (135)] DEBUG : Sending LogEvent data to New Relic Service

I'm still not exactly sure why jobs didn't report earlier as they are working on a separate test; I suspect it was just bad luck and the deadlock happened during submission (?). Hopefully the above can be of some value to the New Relic team as well. Thanks for your support!

workato-integration[bot] commented 1 year ago

https://issues.newrelic.com/browse/NEWRELIC-3750

kaylareopelle commented 1 year ago

@hedgeye-fslezak - Thanks for sharing your debug logs! We'll keep an eye out for your update on your test with the config flag toggle. 😄

hedgeye-fslezak commented 1 year ago

Hiya, update on our end. We managed to find our problem - it was indeed related to a stray config.cache_class = true caused by us running Spring in production (oops). Removing spring and ensuring cache_classes was disabled cleared the problem up and stopped the deadlocks, so that's at least a workable solution.

Thank you all for your support!

kaylareopelle commented 1 year ago

Fantastic! Thanks for the update! I'm glad you found a solution. 🎉

I'm going to close this ticket now, but please feel free to comment on this issue or create a new one if the problem returns.