heroku / barnes

Tell StatsD about request time, GC, objects and more. Latest Rails 4 and Ruby 2.1 support, and ancient Rails 2 and Ruby 1.8 support.
MIT License
0 stars 0 forks source link

Incompatible with dogstatsd gem #19

Open robotmay opened 6 years ago

robotmay commented 6 years ago

I seem to be getting this error when running this gem alongside the dogstatsd one:

#<Thread:0x00007fb3dd809948@/app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/periodic.rb:39 run> terminated with exception (report_on_exception is true):

/app/vendor/bundle/ruby/2.5.0/gems/dogstatsd-ruby-1.6.0/lib/statsd.rb:339:in `send_stats': undefined method `[]' for 0.16666666666666666:Float (NoMethodError)

    from /app/vendor/bundle/ruby/2.5.0/gems/dogstatsd-ruby-1.6.0/lib/statsd.rb:137:in `count'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/reporter.rb:53:in `block (2 levels) in report_statsd'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/reporter.rb:52:in `each'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/reporter.rb:52:in `block in report_statsd'

    from /app/vendor/bundle/ruby/2.5.0/gems/dogstatsd-ruby-1.6.0/lib/statsd.rb:292:in `batch'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/reporter.rb:51:in `call'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/reporter.rb:51:in `report_statsd'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/reporter.rb:47:in `report'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/periodic.rb:62:in `block (2 levels) in initialize'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/periodic.rb:46:in `loop'

    from /app/vendor/bundle/ruby/2.5.0/gems/barnes-0.0.7/lib/barnes/periodic.rb:46:in `block in initialize'

rake aborted!

Unfortunately I don't have any time to investigate it myself right now, but as the Ruby metrics are in beta I figured you guys might want to know :)

schneems commented 4 years ago

Are you still seeing this? Can you give me an example app that reproduces the behavior? https://www.codetriage.com/example_app

pcreux commented 4 years ago

:wave: It seems to be working on my end using the Statsd client to send metrics to the Datadog statsd server / agent.

In config/puma.rb:

before_fork do
  Barnes.start
  Barnes.start # send metrics to Heroku's statsd agent
  Barnes.start(statsd: Statsd.new('127.0.0.1', 8125)) # send metrics to Datadog's statsd agent
end
Screen Shot 2020-10-15 at 8 57 09 AM

Now, the one metric I'm interested in is puma pool usage - and I can't seem to find it. 🤔

schneems commented 4 years ago

I recommend using env vars tmake sure that it's running and sending metrics https://devcenter.heroku.com/articles/language-runtime-metrics-ruby#debugging

IanVaughan commented 3 years ago

I was seeing the same issue, if I remove dogstatsd-ruby gem it works fine.

But I did find that this works:

require 'datadog/statsd'

before_fork do
  Barnes.start(statsd: Datadog::Statsd.new('localhost', 8125))
end

PS. I see that Barnes has to run twice, once for Heroku and once for DataDog, as per your example:

before_fork do
  Barnes.start
  Barnes.start # send metrics to Heroku's statsd agent
  Barnes.start(statsd: Statsd.new('127.0.0.1', 8125)) # send metrics to Datadog's statsd agent
end

But what is the first uncommented line for?

pcreux commented 3 years ago

@IanVaughan You're right, it's a dup. It should be:

before_fork do
  Barnes.start # send metrics to Heroku's statsd agent
  Barnes.start(statsd: Statsd.new('127.0.0.1', 8125)) # send metrics to Datadog's statsd agent
end
pcreux commented 3 years ago

@schneems Yes, it's running. Do you calculate the puma pool usage out of Barnes data at Heroku? It doesn't stand out from this:

{
  "barnes.state": {
    "stopwatch": {
      "wall": 1561579468665.421,
      "cpu": 8387.169301
    },
    "ruby_gc": {
      "count": 10,
      "heap_allocated_pages": 2451,
      "heap_sorted_length": 3768,
      "heap_allocatable_pages": 1317,
      "heap_available_slots": 999027,
      "heap_live_slots": 930476,
      "heap_free_slots": 68551,
      "heap_final_slots": 0,
      "heap_marked_slots": 921621,
      "heap_eden_pages": 2451,
      "heap_tomb_pages": 0,
      "total_allocated_pages": 2451,
      "total_freed_pages": 0,
      "total_allocated_objects": 3068719,
      "total_freed_objects": 2138243,
      "malloc_increase_bytes": 2716264,
      "malloc_increase_bytes_limit": 30330547,
      "minor_gc_count": 6,
      "major_gc_count": 4,
      "remembered_wb_unprotected_objects": 11936,
      "remembered_wb_unprotected_objects_limit": 23872,
      "old_objects": 898986,
      "old_objects_limit": 1797974,
      "oldmalloc_increase_bytes": 23976536,
      "oldmalloc_increase_bytes_limit": 58823529
    }
  },
  "barnes.counters": {
    "Time.wall": 10096.75341796875,
    "Time.cpu": 11.80766799999947,
    "Time.idle": 10084.94574996875,
    "Time.pct.cpu": 0.1169451952643102,
    "Time.pct.idle": 99.88305480473569,
    "GC.count": 0,
    "GC.major_count": 0,
    "GC.minor_gc_count": 0
  },
  "barnes.gauges": {
    "using.puma": 1,
    "pool.capacity": 40,
    "threads.max": 40,
    "threads.spawned": 40,
    "Objects.TOTAL": 999027,
    "Objects.FREE": 68593,
    "Objects.T_OBJECT": 72287,
    "Objects.T_CLASS": 25522,
    "Objects.T_MODULE": 3083,
    "Objects.T_FLOAT": 14,
    "Objects.T_STRING": 401533,
    "Objects.T_REGEXP": 3051,
    "Objects.T_ARRAY": 78539,
    "Objects.T_HASH": 44655,
    "Objects.T_STRUCT": 2244,
    "Objects.T_BIGNUM": 148,
    "Objects.T_FILE": 13,
    "Objects.T_DATA": 44890,
    "Objects.T_MATCH": 78,
    "Objects.T_COMPLEX": 1,
    "Objects.T_RATIONAL": 765,
    "Objects.T_SYMBOL": 2470,
    "Objects.T_IMEMO": 241182,
    "Objects.T_ICLASS": 9959,
    "GC.total_allocated_objects": 9474,
    "GC.total_freed_objects": 36,
    "GC.count": 10,
    "GC.heap_allocated_pages": 2451,
    "GC.heap_sorted_length": 3768,
    "GC.heap_allocatable_pages": 1317,
    "GC.heap_available_slots": 999027,
    "GC.heap_live_slots": 930476,
    "GC.heap_free_slots": 68551,
    "GC.heap_final_slots": 0,
    "GC.heap_marked_slots": 921621,
    "GC.heap_eden_pages": 2451,
    "GC.heap_tomb_pages": 0,
    "GC.total_allocated_pages": 2451,
    "GC.total_freed_pages": 0,
    "GC.malloc_increase_bytes": 2716264,
    "GC.malloc_increase_bytes_limit": 30330547,
    "GC.minor_gc_count": 6,
    "GC.major_gc_count": 4,
    "GC.remembered_wb_unprotected_objects": 11936,
    "GC.remembered_wb_unprotected_objects_limit": 23872,
    "GC.old_objects": 898986,
    "GC.old_objects_limit": 1797974,
    "GC.oldmalloc_increase_bytes": 23976536,
    "GC.oldmalloc_increase_bytes_limit": 58823529
  }
}
schneems commented 3 years ago

We do:

  "barnes.gauges": {
    "using.puma": 1,
    "pool.capacity": 40,
    "threads.max": 40,

Basically once capacity hits zero your app has no more capability to handle any additional requests. We flip this logic on its head so instead of 0 capacity left, we report 100% utilization. We use the threads to determine the max available capacity.

You can either use the raw pool.capacity or invert it using threads.max which is what Heroku does (since all the rest of our metrics, higher values usually mean more utilization such as load and RAM graphs).

schneems commented 3 years ago

It's also worth re-mentioning that I don't have an example app that reproduces the problem described here

Are you still seeing this? Can you give me an example app that reproduces the behavior? https://www.codetriage.com/example_app

pcreux commented 3 years ago

@schneems Thank you for the extremely quick reply!

I tested it on an environment that receives very low traffic so pool.capacity was remaining at 40 – making me believe that it was a fixed value.


It's also worth re-mentioning that I don't have an example app that reproduces the problem described here

I'm not running into this problem on my end.