Closed alexdean closed 11 years ago
OK, i see that this sample_rate
is used by the library to determine if a metric should be sent to statsd. https://github.com/reinh/statsd/blob/master/lib/statsd.rb#L264
Maybe this could be a documentation improvement? It has caused me a fair bit of confusion.
sample_rate
will control how often the Ruby library will report the metric to statsd.sample_rate
. For all other types, statsd ignores the sample rate.@alexdean We can add documentation if you think it would clear things up but the canonical source of this documentation is StatsD itself. ruby-statsd is just a dumb client.
Sorry I'm slow to reply. I think my confusion stemmed partially from comments like this:
The statsd server then uses the sample_rate to correctly track the average timing for the stat.
https://github.com/reinh/statsd/blob/master/lib/statsd.rb#L212
The statsd server does not do this.
For most of its history, statsd did nothing at all with the sample rate for any type except counters. Since https://github.com/etsy/statsd/commit/cc12534e3b3bf5137a263e4e065876329e1a719c, statsd now adjusts the count
metric for a timing based on the sample rate, but no adjustments are attempted for the other timing-based metrics like sum
, mean
, median
, etc.
It makes sense to me that the client supports a sampling rate, but it should be clearer that this is a client-side-only operation in the case of gauges and sets. The server does not do any interpolation/adjustment of the reported values based on sample rate, except in the case of counters and timings. Does that seem reasonable?
Other references:
This may be more of an issue with statsd itself than with ruby-statsd, as the sampling rules seem non-obvious and easy to misunderstand. Interested what you think about that. If it's just me being thick-headed, oh well. Thanks for reading. :)
@alexdean Hey, thanks for looking into it! I agree that the current documentation needs to be fixed and could be improved as well. Would you be willing to take a stab at it?
@reinh Absolutely. I will give it a try and submit a PR when I've got something.
@alexdean :heart:
I think the
sample_rate
argument should be removed fromgauge
,set
, andtime
/timing
. Allowing users to set a sample rate which statsd will simple ignore invites misunderstandings & confusion about what's actually being recorded & flushed to graphite.https://github.com/etsy/statsd/blob/53aae0ad2cd33879d79a786e067a5e3fd2548e1d/stats.js#L167 Note that the 3rd field (the sample rate) is only acted on if the type is not
ms
,g
, ors
.I will submit a PR if this is something you'll accept.