tim-group / java-statsd-client

a java statsd client library

Consider using NIO libraries for UDP #20

Open scarytom opened 10 years ago

scarytom commented 10 years ago

On the dogstatsd fork of this project, some performance improvements were made in this commit: https://github.com/indeedeng/java-dogstatsd-client/commit/5d34f1591b0c8cf7729632586323001ab7962110

Consider porting these changes into the java-statsd-client. I'm not actually sure that an NIO UDP client affords any benefit, so it's worth doing some reading first. Batching messages so as to fit as many as possible into the MTU relates to #15.
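For reference, a minimal sketch of what an NIO-based transport might look like, using a connected java.nio.channels.DatagramChannel in place of DatagramSocket/DatagramPacket. The class name NioUdpSender is illustrative, not part of this library:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: a connected NIO DatagramChannel as the UDP transport.
public final class NioUdpSender implements AutoCloseable {
    private final DatagramChannel channel;

    public NioUdpSender(String hostname, int port) throws IOException {
        this.channel = DatagramChannel.open();
        // connect() fixes the destination, so later calls can use write()
        // instead of send() with an explicit address on every packet.
        this.channel.connect(new InetSocketAddress(hostname, port));
    }

    public void send(String message) throws IOException {
        channel.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Whether this actually buys anything over the existing DatagramSocket code is exactly the open question above.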

scarytom commented 10 years ago

Also, take a look at https://github.com/finn-no/statsd-lmax-disruptor-client/commit/5b0e893bbd7cc2a968333b3007e294f7adea97f6 and evaluate the LMAX Disruptor.
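For context, a rough sketch of the general shape of a Disruptor-based pipeline, assuming the LMAX Disruptor 3.x DSL: application threads publish messages onto a ring buffer and a single consumer thread drains them to the socket. Class names here (DisruptorStatsDPipeline, MessageEvent) are illustrative and not taken from the finn-no client:

```java
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

// Sketch only: producers publish onto a ring buffer; one consumer writes to UDP.
public final class DisruptorStatsDPipeline {

    // Mutable event slot, reused by the ring buffer to avoid allocation.
    public static final class MessageEvent {
        String message;
    }

    private final Disruptor<MessageEvent> disruptor;

    public DisruptorStatsDPipeline(EventHandler<MessageEvent> udpWriter) {
        // Ring buffer size must be a power of two.
        this.disruptor = new Disruptor<>(MessageEvent::new, 1024, DaemonThreadFactory.INSTANCE);
        this.disruptor.handleEventsWith(udpWriter);
        this.disruptor.start();
    }

    public void publish(String message) {
        disruptor.getRingBuffer().publishEvent((event, sequence, msg) -> event.message = msg, message);
    }

    public void shutdown() {
        disruptor.shutdown();
    }
}
```

The udpWriter handler would simply write event.message to the socket, e.g. via a sender like the one sketched above.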

scarytom commented 10 years ago

See also https://github.com/sgp/java-statsd-client/commit/fa0b0b1a90a7e81cd45065fc910be7a0b9a40955

michaelsembwever commented 10 years ago

Yes, this has been implemented in https://github.com/finn-no/statsd-lmax-disruptor-client/commit/5d34f1591b0c8cf7729632586323001ab7962110#diff-d41d8cd98f00b204e9800998ecf8427e

We've been using this in production at Norway's busiest website for 5 months now.

scarytom commented 10 years ago

I'm not sure I want to introduce a dependency on a third-party library like the LMAX Disruptor, but moving to NIO, and batching per #15, does seem compelling.
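On the batching side, a rough sketch of one way it could work: pack newline-separated messages into a single payload until adding another would exceed a target size (1432 bytes is a common conservative choice for a 1500-byte Ethernet MTU minus IP/UDP headers). The class name MessageBatcher and the constant are assumptions for illustration:

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch: accumulate newline-separated statsd messages and
// hand back a completed payload once the next message would overflow it.
public final class MessageBatcher {
    private static final int MAX_PAYLOAD_BYTES = 1432; // typical Ethernet MTU minus IP/UDP headers

    private final StringBuilder batch = new StringBuilder();
    private int batchBytes = 0;

    /** Adds a message; returns a payload ready to send if the batch filled up, else null. */
    public String append(String message) {
        int messageBytes = message.getBytes(StandardCharsets.UTF_8).length;
        String completed = null;
        if (batchBytes > 0 && batchBytes + 1 + messageBytes > MAX_PAYLOAD_BYTES) {
            completed = flush();
        }
        if (batchBytes > 0) {
            batch.append('\n');
            batchBytes += 1;
        }
        batch.append(message);
        batchBytes += messageBytes;
        return completed;
    }

    /** Returns whatever is pending (e.g. on a timer tick) and resets the batch. */
    public String flush() {
        String payload = batch.toString();
        batch.setLength(0);
        batchBytes = 0;
        return payload;
    }
}
```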

aklossrbh commented 8 years ago

Under heavy stress testing, we noticed the following synchronization in java.nio.channels.DatagramChannel, which significantly reduced our throughput:

"scheduling-akka.actor.default-dispatcher-92" prio=10 tid=0x00007f0a180e8800 nid=0x7e30 waiting for monitor entry [0x00007f09abf80000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:427)
        - waiting to lock <0x000000078151a408> (a java.lang.Object)
        at com.timgroup.statsd.BlockingStatsDClient.send(BlockingStatsDClient.java:527)
        at com.timgroup.statsd.BlockingStatsDClient.recordExecutionTime(BlockingStatsDClient.java:358)
        at com.redbrickhealth.api.util.metrics.reporter.StatsDMetricReporter$$anonfun$reportEnd$1.apply(StatsDMetricReporter.scala:127)
        at com.redbrickhealth.api.util.metrics.reporter.StatsDMetricReporter$$anonfun$reportEnd$1.apply(StatsDMetricReporter.scala:115)
        at scala.Option.foreach(Option.scala:257)
        at com.redbrickhealth.api.util.metrics.reporter.StatsDMetricReporter.reportEnd(StatsDMetricReporter.scala:115)

We're still evaluating options, but I'm curious whether PRs for this are at all likely to be accepted, or whether we should just enjoy life with a forked driver?
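One way to sidestep that monitor contention, sketched here only as an illustration (the class name SingleWriterUdpSender is hypothetical, and this is not a statement about what the maintainers would accept): funnel all sends through a single writer thread so only one thread ever touches the channel, which means DatagramChannelImpl's internal lock is never contended.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch only: application threads enqueue; a single daemon thread owns the channel.
public final class SingleWriterUdpSender implements AutoCloseable {
    private final DatagramChannel channel;
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread writer;
    private volatile boolean running = true;

    public SingleWriterUdpSender(String hostname, int port) throws IOException {
        this.channel = DatagramChannel.open();
        this.channel.connect(new InetSocketAddress(hostname, port));
        this.writer = new Thread(this::drain, "statsd-udp-writer");
        this.writer.setDaemon(true);
        this.writer.start();
    }

    public void send(String message) {
        queue.offer(message); // never blocks the application thread
    }

    private void drain() {
        while (running) {
            try {
                String message = queue.take();
                channel.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (IOException e) {
                // Metrics are best-effort; drop the message on failure.
            }
        }
    }

    @Override
    public void close() throws IOException {
        running = false;
        writer.interrupt();
        channel.close();
    }
}
```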