DataDog / dd-trace-rb

Datadog Tracing Ruby Client
https://docs.datadoghq.com/tracing/

ActiveRecord Connection Pool Stats #1656

Open sj26 opened 2 years ago

sj26 commented 2 years ago

ActiveRecord exports a bunch of metrics about connection pools:

https://github.com/rails/rails/blob/v6.1.4/activerecord/lib/active_record/connection_adapters/abstract/connection_pool.rb#L698-L714
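For readers who don't want to follow the link, the `stat` method returns a plain Hash of counters. The sketch below is a simplified stand-in rather than the Rails source: `FakeConnection` and `pool_stat` are hypothetical names, the key names are taken from the metrics listed further down in this issue, and the counting rules are an assumption for illustration.

```ruby
# Hypothetical stand-in for a pooled connection: whether it is checked out,
# and whether the thread that owns it is still alive.
FakeConnection = Struct.new(:in_use, :owner_alive)

# Simplified model of the Hash a connection pool's `stat` returns.
# Key names match the metrics listed in this issue; the counting rules
# below are an assumption, not copied from Rails.
def pool_stat(connections, size:, waiting:, checkout_timeout:)
  {
    size: size,                                                   # configured maximum
    connections: connections.size,                                # currently established
    busy: connections.count { |c| c.in_use && c.owner_alive },    # checked out, owner alive
    dead: connections.count { |c| c.in_use && !c.owner_alive },   # checked out, owner gone
    idle: connections.count { |c| !c.in_use },                    # available for checkout
    waiting: waiting,                                             # threads queued for a connection
    checkout_timeout: checkout_timeout                            # seconds before checkout gives up
  }
end

connections = [
  FakeConnection.new(true, true),   # checked out by a live thread
  FakeConnection.new(true, false),  # checked out by a thread that has died
  FakeConnection.new(false, true)   # sitting idle in the pool
]
stats = pool_stat(connections, size: 20, waiting: 0, checkout_timeout: 5.0)
```

Each key in the Hash maps naturally onto one gauge, which is what the workaround below relies on.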

This was also mentioned in issue #1259 but feels like a distinct request.

We'd really like the Datadog tracer to submit these metrics to Datadog as part of the Active Record instrumentation. I notice there's now a precedent for submitting runtime metrics to Datadog:

https://github.com/DataDog/dd-trace-rb/blob/master/lib/ddtrace/workers/runtime_metrics.rb

It'd be nice if integrations could also export metrics like this on a regular cadence, and if the Active Record integration then exported these connection pool statistics.

We're currently working around this with the following:

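require "concurrent"      # concurrent-ruby, for Concurrent::TimerTask
require "datadog/statsd"  # dogstatsd-ruby, for Datadog::Statsd

ActiveSupport.on_load(:active_record) do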
  # ActiveRecord has a bunch of connection pool metrics which we want to send
  # to datadog. Start a thread which submits the metrics every 10 seconds.
  statsd = Datadog::Statsd.new(namespace: "active_record.connection_pool")
  task = Concurrent::TimerTask.new(execution_interval: 10) do
    ActiveRecord::Base.connection_handler.all_connection_pools.each do |connection_pool|
      # Ideally we'd tag with role and shard here, too, but they're unavailable
      tags = { pid: Process.pid, name: connection_pool.db_config.name }

      connection_pool.stat.each do |name, value|
        statsd.gauge(name, value, tags: tags)
      end
    end
  end
  task.execute
end
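One way to make this workaround easier to test is to pull the loop body into a method that takes the StatsD client and the pools as arguments. This is only a sketch: `report_pool_stats`, `RecordingStatsd`, `FakePool` and `FakeConfig` are hypothetical names, and the fakes stand in for `Datadog::Statsd` and the Active Record pools so that the snippet runs without Rails.

```ruby
# Hypothetical helper: emits one gauge per pool statistic.
# `statsd` is any object responding to gauge(name, value, tags:),
# `pools` is any collection of objects responding to #db_config and #stat.
def report_pool_stats(statsd, pools, pid: Process.pid)
  pools.each do |pool|
    tags = { pid: pid, name: pool.db_config.name }
    pool.stat.each do |name, value|
      statsd.gauge(name, value, tags: tags)
    end
  end
end

# Stand-ins used only for this demonstration (not part of any library).
FakeConfig = Struct.new(:name)
FakePool = Struct.new(:db_config, :stat)

class RecordingStatsd
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Records the call instead of sending anything over the network.
  def gauge(name, value, tags: {})
    @calls << [name, value, tags]
  end
end

statsd = RecordingStatsd.new
pools = [FakePool.new(FakeConfig.new("primary"), { size: 20, busy: 0 })]
report_pool_stats(statsd, pools, pid: 60458)
```

In the workaround above, the `Concurrent::TimerTask` block would then reduce to a single call passing the real client and `ActiveRecord::Base.connection_handler.all_connection_pools`.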

This yields metrics like:

StatsD Metric: active_record.connection_pool.size 20|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.connections 0|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.busy 0|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.dead 0|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.idle 0|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.waiting 0|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.checkout_timeout 5.0|g|#pid:60458,name:primary
StatsD Metric: active_record.connection_pool.size 20|g|#pid:60459,name:primary
StatsD Metric: active_record.connection_pool.connections 0|g|#pid:60459,name:primary
StatsD Metric: active_record.connection_pool.busy 0|g|#pid:60459,name:primary
StatsD Metric: active_record.connection_pool.dead 0|g|#pid:60459,name:primary
StatsD Metric: active_record.connection_pool.idle 0|g|#pid:60459,name:primary
StatsD Metric: active_record.connection_pool.waiting 0|g|#pid:60459,name:primary
StatsD Metric: active_record.connection_pool.checkout_timeout 5.0|g|#pid:60459,name:primary

neilchandler commented 1 year ago

My understanding of Datadog billing is that we have a set number of unique tags we can use per month, and once we exceed that limit we get charged extra for the additional tags. Something like

tags = { pid: Process.pid, name: connection_pool.db_config.name }

will be very expensive, as it's effectively unlimited in terms of unique tags.
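To make the cardinality concern concrete, here is a rough sketch. It assumes that each distinct combination of metric name and tag values counts as one series; that is an assumption about how series are counted, not something confirmed in this thread. `distinct_series` is a hypothetical helper, and the pids after the first two are made up for illustration.

```ruby
require "set"

# Hypothetical helper: counts distinct (stat, pool name, pid) combinations.
# Assumption: each distinct combination is one series.
def distinct_series(stat_names, pool_names, pids)
  series = Set.new
  pids.each do |pid|
    pool_names.each do |pool|
      stat_names.each do |stat|
        series << [stat, pool, pid]
      end
    end
  end
  series.size
end

stat_names = %i[size connections busy dead idle waiting checkout_timeout]

# Two worker processes and one pool, as in the sample output above.
two_workers = distinct_series(stat_names, ["primary"], [60458, 60459])

# The same two workers restarted once: the old pids are still counted,
# and the new (made-up) pids add another full set of series.
after_restart = distinct_series(stat_names, ["primary"], [60458, 60459, 61001, 61002])
```

Under that assumption the count is stats × pools × distinct pids, so it keeps growing with every restart or deploy that hands out new pids.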

sj26 commented 1 year ago

Correct, but this is the minimum number of tags required to gather metrics for each pool: there is one pool per name and process. Without covering all pools you would get incomplete or inaccurate metrics.

However, if this became an official Datadog integration metric then it would not count as a custom metric and so wouldn't be charged, just like their other built-in integrations.

BobbyMcWho commented 1 year ago

This would be very valuable to have

trevorturk commented 1 year ago

+1 from me, I was hoping Datadog supported this and I'm sad to see that it doesn't (yet!)

tsrivishnu commented 9 months ago

Would love to see this supported!

taltcher commented 3 months ago

Would love to see this supported!