Open JensRantil opened 10 years ago
I agree, this does deserve its own issue.
I'm not sure whether it is sensible to implement multiple-message-sending without examining the consequences quite carefully. At present, the java-statsd-client pays no attention to the MTU and assumes that each individual message falls well within it. Enhancing the library to support sending multiple messages at once would definitely require it to consider the MTU. Arguably it should do this anyway.
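To make the MTU concern concrete, here is a minimal sketch of how several messages could be packed into size-bounded payloads. It assumes the receiving server accepts several metrics in one datagram separated by newlines, as the reference StatsD daemon does. MtuPacker and its pack method are hypothetical names for illustration and are not part of java-statsd-client. The byte limit is a parameter the caller would choose for their own network.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of java-statsd-client. */
final class MtuPacker {
    private MtuPacker() {}

    /**
     * Joins messages with '\n' into payloads of at most maxPayloadBytes each.
     * A single message larger than the limit is emitted on its own, unsplit.
     * Empty messages are skipped.
     */
    static List<String> pack(List<String> messages, int maxPayloadBytes) {
        List<String> payloads = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        for (String message : messages) {
            if (message.isEmpty()) {
                continue;
            }
            int messageBytes = message.getBytes(StandardCharsets.UTF_8).length;
            // One extra byte for the newline separator when the payload is non-empty.
            if (currentBytes > 0 && currentBytes + 1 + messageBytes > maxPayloadBytes) {
                payloads.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            if (currentBytes > 0) {
                current.append('\n');
                currentBytes++;
            }
            current.append(message);
            currentBytes += messageBytes;
        }
        if (currentBytes > 0) {
            payloads.add(current.toString());
        }
        return payloads;
    }
}
```

Each returned string would then be sent as one UDP datagram. Note that the sketch does not reject an oversized single message; a real implementation would have to decide whether to drop it, send it anyway, or report an error.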
Furthermore, I'm not sure what the best API to support multiple updates would be -- should it automatically batch up messages and send them in frequent bursts, or should that decision be pushed to the callers, by changing the API to allow multiple messages of the same type to be sent in a single call (e.g. statsd.incrementCounters("foo", "bar", "baz") or statsd.recordGaugeValues("foo", "bar", 100))? This latter approach seems contrived.
Perhaps you have some examples of situations in which batching multiple updates would be appropriate?
"automatically batch up messages" sounds like a good option. It would reduce the traffic and communication overhead between the application and the server.
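The trouble with batching up messages is that it introduces complexity and immediately poses a number of questions:
What happens if insufficient messages are received to fill a batch?
How does it affect messages whose time of arrival is important?
How does it impact sampling rates?
How do we drain the current batch on shutdown?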
Answering these, and working out how batching might work generally, is going to take some careful consideration. I'd be delighted to hear your thoughts.
This has been implemented in https://github.com/finn-no/statsd-lmax-disruptor-client/commit/5d34f1591b0c8cf7729632586323001ab7962110#diff-d41d8cd98f00b204e9800998ecf8427e (along with an upgrade to NIO).
We've been using this in production at Norway's busiest website for 5 months now.
Addressing your questions:
What happens if insufficient messages are received to fill a batch?
In https://github.com/finn-no/statsd-lmax-disruptor-client/commit/5d34f1591b0c8cf7729632586323001ab7962110 the approach was to send the msg as soon as the queue was empty. That is, all ready-to-send msgs would be collapsed into the available MTU space, but the sender would never wait for more.
In https://github.com/finn-no/statsd-lmax-disruptor-client/commit/5b0e893bbd7cc2a968333b3007e294f7adea97f6#diff-d41d8cd98f00b204e9800998ecf8427e the same approach was taken. The lmax-disruptor passes in a "batchEnd" flag to the EventHandler.onEvent(..) method that indicates that there are no more available msgs to be collapsed.
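For readers who want to see the shape of this, here is a minimal sketch of the collapse-but-never-wait idea. It is not the code from the linked commits. CollapsingHandler is a hypothetical class whose onEvent method only mirrors the shape of the Disruptor's EventHandler.onEvent(event, sequence, endOfBatch) callback, so the example has no dependency on the library. The sender is a stand-in for the actual UDP send, and newline-separated metrics per datagram are assumed.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

/** Hypothetical illustration; not the implementation from the linked commits. */
final class CollapsingHandler {
    private final int maxPayloadBytes;
    private final Consumer<String> sender; // stand-in for the real UDP send
    private final StringBuilder buffer = new StringBuilder();
    private int bufferedBytes = 0;

    CollapsingHandler(int maxPayloadBytes, Consumer<String> sender) {
        this.maxPayloadBytes = maxPayloadBytes;
        this.sender = sender;
    }

    /**
     * Called once per message by a single consumer thread. endOfBatch is true
     * when no further messages are immediately available, so the buffer is
     * flushed at once rather than waiting for more.
     */
    void onEvent(String message, long sequence, boolean endOfBatch) {
        int messageBytes = message.getBytes(StandardCharsets.UTF_8).length;
        // Flush first if appending this message would exceed the payload limit.
        if (bufferedBytes > 0 && bufferedBytes + 1 + messageBytes > maxPayloadBytes) {
            flush();
        }
        if (bufferedBytes > 0) {
            buffer.append('\n');
            bufferedBytes++;
        }
        buffer.append(message);
        bufferedBytes += messageBytes;
        if (endOfBatch) {
            flush();
        }
    }

    private void flush() {
        if (bufferedBytes == 0) {
            return;
        }
        sender.accept(buffer.toString());
        buffer.setLength(0);
        bufferedBytes = 0;
    }
}
```

Under low load every call arrives with endOfBatch set, so each message goes out on its own exactly as before. Only when the producer outpaces the sender do messages share a datagram, which is why nothing is held back waiting for a batch to fill.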
How does it affect messages whose time of arrival is important?
The sending of messages is not delayed. In fact arrival time will be sooner rather than later. Otherwise this is normal behaviour for statsd.
How does it impact sampling rates?
We don't use sampling so I can't say for sure. But I can't see how sampling rates would be affected.
How do we drain the current batch on shutdown?
There is no "batch" in-memory. As explained above, the approach implemented will collapse messages and use MTU space efficiently if throughput/concurrency is high and the queue (or lmax-disruptor ringBuffer) goes above one. If you don't have this throughput and/or concurrency then I doubt there's any point in worrying about how effective you are with the MTU space. (You can always down-prioritise the background thread if you'd like to push towards fewer UDP packets being sent.)
When we put this into production we saw the UDP packet loss drop dramatically.
Thanks for your comment @michaelsembwever -- I can see how this would work now, which brings me a step closer to implementing something.
Any plans to move closer to sending multiple metrics at once?
This was briefly mentioned in #5 but deserves its own issue, I think.