Closed unixninja92 closed 8 years ago
Hello @unixninja92 can you detail more please? What do you mean by "crash", I suspect that it's not killing the JVM, I suspect that you mean that it does not recover. Do you have log messages, can you share the configuration details?
right, it's not recovering. Here is the stacktrace:
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.jmxtrans.agent.GraphitePlainTextTcpOutputWriter.postCollect(GraphitePlainTextTcpOutputWriter.java:140)
at org.jmxtrans.agent.OutputWriterCircuitBreakerDecorator.postCollect(OutputWriterCircuitBreakerDecorator.java:124)
at org.jmxtrans.agent.JmxTransExporter.collectAndExport(JmxTransExporter.java:175)
at org.jmxtrans.agent.JmxTransExporter$2.run(JmxTransExporter.java:85)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-09 14:27:20.86 WARNING [jmxtrans-agent-1] org.jmxtrans.agent.JmxTransExporter - Ignore exception flushing metrics
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at java.net.SocketOutputStream.socketWrite0(Native Method)
2016-05-09 14:27:20.858 WARNING [jmxtrans-agent-1] org.jmxtrans.agent.GraphitePlainTextTcpOutputWriter - Exception flushing the stream to HostAndPort{host='influxdb.DOMAIN.com', port=2003}
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.jmxtrans.agent.GraphitePlainTextTcpOutputWriter.postCollect(GraphitePlainTextTcpOutputWriter.java:140)
at org.jmxtrans.agent.OutputWriterCircuitBreakerDecorator.postCollect(OutputWriterCircuitBreakerDecorator.java:124)
at org.jmxtrans.agent.JmxTransExporter.collectAndExport(JmxTransExporter.java:175)
at org.jmxtrans.agent.JmxTransExporter$2.run(JmxTransExporter.java:85)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
relevant config (minus queries section):
`
<port>2003</port>
<namePrefix>{{Service}}.{{Environment}}.{{VPC}}.${HOSTNAME}.</namePrefix>
</outputWriter>
<collectIntervalInSeconds>10</collectIntervalInSeconds>`
@unixninja92 You are missing the top of the stack trace, could you please post the entire trace?
Also, when you say it is not recovering - what is logged after this error occurs?
Would it be possible to attach the complete log file?
@kerlandsson I'm getting this log from aws cloudwatch, I'll see if I can find the top. And after, it just stops logging from what I can tell.
The line in the log just above the stack trace is:
09 May 2016 14:27:20.564 INFO : Authentication Exception: Failed login attempt for user 'XXXXXX' (Invalid username or password) from XXX.XXX.XXX.XXX
but that's from our program, not from jmxtrans-agent. There is a blank log line (contains only a space) right above that though, so maybe that's where something is supposed to be? But there isn't anything above that for at least 10s that would have anything to do with jmxtrans-agent.
The other instances of the same service all running jmxtrans-agent don't have any stacktraces from jmxtrans-agent, they just stopped reporting to influx.
I did some tests on my end and the GraphitePlainTextTcpOutputWriter
reconnects as expected for me. I even added an automatic test to verify it reconnects.
From your stack trace it seems you are not using one of the more recent versions. What version are you using?
I suggest you upgrade to the latest release of jmxtrans-agent. If you still suffer this problem, try enabling trace logging as detailed here: https://github.com/jmxtrans/jmxtrans-agent/wiki/Troubleshooting#enabling-logs and we can investigate further.
We are running 1.2.3 and 1.2.2.
I'm going to turn on logging and see if that helps.
So it turns out that jmxagent-trans is not the issue here, the logs show it continues to send data to influx. That data just gets lost somewhere between jmxagent-trans and influx. Let me know if you have any suggestions on things to try. The reason I assumed it was jmxagent-trans at first was because restarting the instance the java service was running on fixed the issue.
I have jmxtrans-agent sending graphite to influxdb, but if the influxdb server ever goes away or is switched out, jmxtrans-agent crashes and has to be restarted to start sending data again. It would be nice if it tried reconnecting and possibly even buffering data until it could send graphite data again.