reinh / statsd

A Ruby Statsd client that isn't a direct port of the Python example code. Because Ruby isn't Python.
MIT License
411 stars 154 forks source link

sample_rate should only be supported for counters. #36

Closed alexdean closed 11 years ago

alexdean commented 11 years ago

I think the sample_rate argument should be removed from gauge, set, and time/timing. Allowing users to set a sample rate which statsd will simple ignore invites misunderstandings & confusion about what's actually being recorded & flushed to graphite.

https://github.com/etsy/statsd/blob/53aae0ad2cd33879d79a786e067a5e3fd2548e1d/stats.js#L167 Note that the 3rd field (the sample rate) is only acted on if the type is not ms, g, or s.

I will submit a PR if this is something you'll accept.

alexdean commented 11 years ago

OK, i see that this sample_rate is used by the library to determine if a metric should be sent to statsd. https://github.com/reinh/statsd/blob/master/lib/statsd.rb#L264

Maybe this could be a documentation improvement? It has caused me a fair bit of confusion.

  1. sample_rate will control how often the Ruby library will report the metric to statsd.
  2. for counters only, statsd will adjust the value flushed to graphite based on sample_rate. For all other types, statsd ignores the sample rate.
reinh commented 11 years ago

@alexdean We can add documentation if you think it would clear things up but the canonical source of this documentation is StatsD itself. ruby-statsd is just a dumb client.

alexdean commented 11 years ago

Sorry I'm slow to reply. I think my confusion stemmed partially from comments like this:

The statsd server then uses the sample_rate to correctly track the average timing for the stat.

https://github.com/reinh/statsd/blob/master/lib/statsd.rb#L212

The statsd server does not do this.

For most of its history, statsd did nothing at all with the sample rate for any type except counters. Since https://github.com/etsy/statsd/commit/cc12534e3b3bf5137a263e4e065876329e1a719c, statsd now adjusts the count metric for a timing based on the sample rate, but no adjustments are attempted for the other timing-based metrics like sum, mean, median, etc.

It makes sense to me that the client supports a sampling rate, but it should be clearer that this is a client-side-only operation in the case of gauges and sets. The server does not do any interpolation/adjustment of the reported values based on sample rate, except in the case of counters and timings. Does that seem reasonable?

Other references:

This may be more of an issue with statsd itself than with ruby-statsd, as the sampling rules seem non-obvious and easy to misunderstand. Interested what you think about that. If it's just me being thick-headed, oh well. Thanks for reading. :)

reinh commented 11 years ago

@alexdean Hey, thanks for looking into it! I agree that the current documentation needs to be fixed and could be improved as well. Would you be willing to take a stab at it?

alexdean commented 11 years ago

@reinh Absolutely. I will give it a try and submit a PR when I've got something.

reinh commented 11 years ago

@alexdean :heart: