reinh / statsd

A Ruby Statsd client that isn't a direct port of the Python example code. Because Ruby isn't Python.
MIT License

add support for batching #18

Closed mattetti closed 12 years ago

mattetti commented 12 years ago

From Etsy's statsd documentation: "All metrics can also be batch send in a single UDP packet, separated by a newline character."

It would be great if one could start a batch session (within a request, for instance) and flush it at the end, so that only a single UDP packet is sent.
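A minimal sketch of what such a batch session could look like, to make the request concrete. `BatchSketch` and its methods are hypothetical names for illustration, not part of this gem's API; the only assumptions are the statsd line format (`name:value|type`) and the newline separator quoted above.

```ruby
require 'socket'

# Hypothetical helper for illustration; not the statsd gem's API.
class BatchSketch
  # socket is anything that responds to send(payload, flags, host, port),
  # such as a UDPSocket.
  def initialize(socket, host, port)
    @socket = socket
    @host = host
    @port = port
    @buffer = []
  end

  # Queue a counter increment as "name:value|c".
  def increment(stat, by = 1)
    @buffer << "#{stat}:#{by}|c"
  end

  # Queue a timing in milliseconds as "name:value|ms".
  def timing(stat, ms)
    @buffer << "#{stat}:#{ms}|ms"
  end

  # Send everything queued so far as one newline-separated datagram.
  def flush
    return if @buffer.empty?
    @socket.send(@buffer.join("\n"), 0, @host, @port)
    @buffer.clear
  end
end

# Example use within a request (host and port are placeholders):
#   batch = BatchSketch.new(UDPSocket.new, '127.0.0.1', 8125)
#   batch.increment('requests')
#   batch.timing('render', 12)
#   batch.flush   # one UDP packet: "requests:1|c\nrender:12|ms"
```

This sketch does not cap the payload size, so a long session could produce a datagram larger than the path MTU.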

raggi commented 12 years ago

Yup, I've been thinking about this. I don't currently have a good way to discover the MSS of the path in pure Ruby (ioctl constant availability is poor).

It's possible to assume that most statsd servers will be a LAN away, so the link MTU is an appropriate limit, but over the internet that's not always going to be the case either (MTU is also difficult to discover in pure Ruby). Old thinking was that this leaves one to choose 576, but that's somewhat repulsive in a modern world (buffer bloat considerations aside).

It'll need to be configurable either way, but it would likely be nice to document a solution to use PMTUD, or at least make users more aware that these choices are significant.

raggi commented 12 years ago

Defaulting to the Ethernet MTU (1500 bytes) is likely reasonable.
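To put numbers on the MTU discussion, here is a rough sketch of how a configurable limit could be applied. It assumes IPv4 with a 20-byte header (no options) and the 8-byte UDP header; `max_udp_payload` and `pack_metrics` are hypothetical helper names, not anything the gem provides.

```ruby
# Assumed header sizes: IPv4 without options, plus UDP.
IPV4_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8

# Largest UDP payload that fits in one unfragmented IPv4 packet for a given MTU.
def max_udp_payload(mtu)
  mtu - IPV4_HEADER_BYTES - UDP_HEADER_BYTES
end

# Greedily pack metric lines into newline-joined payloads of at most
# max_bytes each. A single line longer than max_bytes is emitted alone.
def pack_metrics(lines, max_bytes)
  payloads = []
  current = []
  size = 0
  lines.each do |line|
    extra = current.empty? ? line.bytesize : line.bytesize + 1 # +1 for "\n"
    if size + extra > max_bytes && !current.empty?
      payloads << current.join("\n")
      current = []
      size = 0
      extra = line.bytesize
    end
    current << line
    size += extra
  end
  payloads << current.join("\n") unless current.empty?
  payloads
end
```

Under these assumptions, an Ethernet MTU of 1500 leaves 1472 bytes of payload per packet, and the conservative 576 leaves 548.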

fcheung commented 12 years ago

Is it a big deal if the packet gets fragmented?

mattetti commented 12 years ago

It's not a huge deal, but it would be slightly better performance-wise when you instrument the crap out of a process.

On Sep 6, 2012, at 2:16, Frederick Cheung notifications@github.com wrote:

Is it big deal if the packet gets fragmented?


fcheung commented 12 years ago

I meant: is it a big deal if you send a single UDP packet (containing multiple metrics) that is too big to fit in a single IP datagram? Won't the fragments get reassembled anyway?

reinh commented 12 years ago

The other issue is that sending more, smaller packets amortizes delivery failure so that an individual failure is less impactful. Keep in mind that sending these packets takes literally microseconds on most hardware.
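As a rough illustration of the trade-off raised in the last two comments: fragments are reassembled at the receiver, but if any one fragment is dropped the whole datagram, and every metric in it, is lost. The sketch below assumes each fragment is dropped independently with the same probability, which is a simplification; `datagram_loss` is a hypothetical helper name.

```ruby
# Probability that a datagram split into `fragments` IP fragments is lost,
# assuming each fragment is dropped independently with probability `loss`.
def datagram_loss(loss, fragments)
  1.0 - (1.0 - loss)**fragments
end

# With 1% per-packet loss (an assumed figure), an unfragmented datagram
# is lost with probability 0.01; one split into two fragments is lost
# with probability 1 - 0.99**2, roughly 0.0199.
single = datagram_loss(0.01, 1)
fragmented = datagram_loss(0.01, 2)
```

Batching does not change the expected fraction of metrics lost in the unfragmented case, but each loss removes a whole batch at once rather than a single sample.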

mattetti commented 12 years ago

That's fair, but sending these packets can add up to a few precious milliseconds per request. I understand that my suggestion might be a bad idea in some cases, but since it's supported by statsd I think it might be nice to offer a cheaper instrumentation option, even though it means risking the loss of some samples.

On Sep 6, 2012, at 17:06, Rein Henrichs notifications@github.com wrote:

The other issue is that sending more, smaller packets amortizes delivery failure so that an individual failure is less impactful. Keep in mind that sending these packets takes literally microseconds on most hardware.


reinh commented 12 years ago

@mattetti I do like the idea. Let me spike something out.

reinh commented 12 years ago

Let's move discussion to #19