Closed mindkir closed 7 years ago
Can you please share the configuration of the outputwriter?
I suspect that you combined the host and port (telegraf:8125
) in the same config element host
instead of using the config element port
for the port.
Then we have a bug/fragility in the StatsDOutputWriter that does not handle properly this configuration issue and causes BufferOverflowException.
<outputWriter class="org.jmxtrans.agent.StatsDOutputWriter">
<host>telegraf</host>
<port>8125</port>
...
</outputWriter>
Here is output writer configuration:
<outputWriter class="org.jmxtrans.agent.StatsDOutputWriter">
<host>telegraf</host>
<port>8125</port>
<metricName>jmxtransagents.${env.host.name:node}.${env.service.name:service}.${env.app.name:appname}.${envtype:poc}</metricName>
</outputWriter>
Just to clarify this is not a configuration issue. Agent was running fine for few days without restart and started failing with BufferOverflowException after some IOExceptions happened.
@cyrille-leclerc confirmed that this version is recovering from IO exceptions and does not experience BufferOverflowException.
Test scenario:
Started some JVM with jmxtrans-agent enabled
$ ./jdk1.8.0_121/bin/java -javaagent:jmxtrans-agent-1.2.6-20170316.180546-3.jar=jmxtrans-agent.xml -Djmxtrans.agent.premain.delay=10 -Dsun.net.inetaddr.ttl=60 EchoServer 4444
Added firewall rule to drop DNS packets and cause java.nio.channels.UnresolvedAddressException failures
$ iptables -A OUTPUT -p udp -s $SOURCE_IP --sport 1024:65535 --dport 53 -m state --state NEW,ESTABLISHED -j DROP
Checked for BufferOverflowException and monitored java.nio.HeapByteBuffer state. With new version it is stable and doesn't end up with BufferOverflowException.
2017-03-17 12:46:29.03 SEVERE [jmxtrans-agent-1] org.jmxtrans.agent.StatsDOutputWriter - Could not send stat java.nio.HeapByteBuffer[pos=0 lim=978 cap=1024] to host telegraf:8125 java.nio.channels.UnresolvedAddressException at sun.nio.ch.Net.checkAddress(Net.java:101) at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:429) at org.jmxtrans.agent.StatsDOutputWriter.flush(StatsDOutputWriter.java:158) at org.jmxtrans.agent.StatsDOutputWriter.doSend(StatsDOutputWriter.java:129) at org.jmxtrans.agent.StatsDOutputWriter.writeQueryResult(StatsDOutputWriter.java:119) at org.jmxtrans.agent.OutputWriterCircuitBreakerDecorator.writeQueryResult(OutputWriterCircuitBreakerDecorator.java:90) at org.jmxtrans.agent.Query.processAttributeValue(Query.java:274) at org.jmxtrans.agent.Query.collectAndExportAttribute(Query.java:216) at org.jmxtrans.agent.Query.collectAndExportForObjectName(Query.java:173) at org.jmxtrans.agent.Query.collectAndExport(Query.java:167) at org.jmxtrans.agent.TimeTrackingCollector.collectIfEnoughTimeHasPassed(TimeTrackingCollector.java:54) at org.jmxtrans.agent.JmxTransExporter.collectAndExport(JmxTransExporter.java:209) at org.jmxtrans.agent.JmxTransExporter$2.run(JmxTransExporter.java:131) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2017-03-17 12:46:29.031 SEVERE [jmxtrans-agent-1] org.jmxtrans.agent.StatsDOutputWriter - Could not send stat java.nio.HeapByteBuffer[pos=0 lim=1006 cap=1024] to host telegraf:8125 java.nio.channels.UnresolvedAddressException at sun.nio.ch.Net.checkAddress(Net.java:101) at sun.nio.ch.DatagramChannelImpl.send(DatagramChannelImpl.java:429) at org.jmxtrans.agent.StatsDOutputWriter.flush(StatsDOutputWriter.java:158) at org.jmxtrans.agent.StatsDOutputWriter.postCollect(StatsDOutputWriter.java:104) at org.jmxtrans.agent.OutputWriterCircuitBreakerDecorator.postCollect(OutputWriterCircuitBreakerDecorator.java:124) at org.jmxtrans.agent.JmxTransExporter.collectAndExport(JmxTransExporter.java:214) at org.jmxtrans.agent.JmxTransExporter$2.run(JmxTransExporter.java:131) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)
Removed firewall rule and verified jmxtrans-agent recovered from failures, which it did.
@cyrille-leclerc Are there plans for making an official release with this fix in it? We are seeing this issue a lot and it would be good to have an official release rather than us relying on a random snapshot build. This has been really useful for us, so keep up the good work! 🥇
Released in 1.2.6
jmxtrans-agent-1.2.4.jar StatsDOutputWriter is indefinitelly failing with "java.nio.BufferOverflowException". It seems error was triggered by random failed DNS resolution, that caused some buffer (lim=91) to fill up.