instaclustr / cassandra-exporter

Java agent for exporting Cassandra metrics to Prometheus
Apache License 2.0

Buffer overflow in nio exposition #83

Open eperott opened 4 years ago

eperott commented 4 years ago

The change to the Netty ChunkedNioStream introduced a regression.

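This happens because ChunkedNioStream does not flush its buffer until the buffer has reached the actual chunk limit. Instead, it just makes another call to ReadableByteChannel.read() with the same, partially filled buffer, and the exposition code then writes past the space that is left: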

WARN  [prometheus-netty-pool-0] 2020-03-31 14:48:24,928 Slf4JLogger.java:151 - An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.nio.BufferOverflowException: null
    at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:189) ~[na:1.8.0_201]
    at java.nio.ByteBuffer.put(ByteBuffer.java:859) ~[na:1.8.0_201]
    at com.zegelin.prometheus.exposition.NioExpositionSink.writeAscii(NioExpositionSink.java:34) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.writeMetric(TextFormatMetricFamilyWriter.java:136) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.lambda$visit$1(TextFormatMetricFamilyWriter.java:167) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter$MetricVisitor.lambda$metricWriter$0(TextFormatMetricFamilyWriter.java:155) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.prometheus.exposition.text.TextFormatMetricFamilyWriter.writeMetric(TextFormatMetricFamilyWriter.java:227) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.prometheus.exposition.text.TextFormatExposition.nextSlice(TextFormatExposition.java:81) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.prometheus.exposition.FormattedByteChannel.read(FormattedByteChannel.java:24) ~[cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at io.netty.handler.stream.ChunkedNioStream.readChunk(ChunkedNioStream.java:107) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.stream.ChunkedNioStream.readChunk(ChunkedNioStream.java:29) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.http.HttpChunkedInput.readChunk(HttpChunkedInput.java:95) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.http.HttpChunkedInput.readChunk(HttpChunkedInput.java:42) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.stream.ChunkedWriteHandler.doFlush(ChunkedWriteHandler.java:225) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.stream.ChunkedWriteHandler.flush(ChunkedWriteHandler.java:139) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeFlush0(AbstractChannelHandlerContext.java:771) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:797) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:808) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:789) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:825) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at com.zegelin.cassandra.exporter.netty.HttpHandler.sendMetrics(HttpHandler.java:303) [cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.cassandra.exporter.netty.HttpHandler.channelRead0(HttpHandler.java:94) [cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at com.zegelin.cassandra.exporter.netty.HttpHandler.channelRead0(HttpHandler.java:39) [cassandra-exporter-agent-0.9.11-SNAPSHOT.jar:na]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:435) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:250) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_201]
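
A minimal sketch of the failure mode described above, assuming a reader that keeps calling read() on the same chunk buffer until it is full (as ChunkedNioStream.readChunk does in the trace). The names SliceChannelSketch, SliceChannel and drain are hypothetical and are not taken from the exporter's source; the "safe" branch shows one possible way to honour dst.remaining() and is not necessarily what the eventual fix does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SliceChannelSketch {

    /** Hypothetical channel that hands out pre-formatted metric lines. */
    static final class SliceChannel implements ReadableByteChannel {
        private final Iterator<String> lines;
        private final boolean checkRemaining;
        private ByteBuffer pending = ByteBuffer.allocate(0); // bytes not yet handed out
        private boolean open = true;

        SliceChannel(List<String> lines, boolean checkRemaining) {
            this.lines = lines.iterator();
            this.checkRemaining = checkRemaining;
        }

        @Override
        public int read(ByteBuffer dst) {
            if (!pending.hasRemaining()) {
                if (!lines.hasNext()) return -1; // end of stream
                pending = ByteBuffer.wrap(lines.next().getBytes(StandardCharsets.US_ASCII));
            }
            if (!checkRemaining) {
                // Unsafe: assumes dst always has room for a whole line.
                int n = pending.remaining();
                dst.put(pending); // BufferOverflowException if dst is too small
                return n;
            }
            // Safe: copy only what fits and keep the rest for the next call.
            int n = Math.min(pending.remaining(), dst.remaining());
            ByteBuffer slice = pending.duplicate();
            slice.limit(slice.position() + n);
            dst.put(slice);
            pending.position(pending.position() + n);
            return n;
        }

        @Override public boolean isOpen() { return open; }
        @Override public void close() { open = false; }
    }

    /** Mimics a reader that refills one chunk buffer until it is full or EOF. */
    static String drain(ReadableByteChannel channel, int chunkSize) {
        StringBuilder out = new StringBuilder();
        ByteBuffer chunk = ByteBuffer.allocate(chunkSize);
        boolean eof = false;
        try {
            while (!eof) {
                chunk.clear();
                while (chunk.hasRemaining()) {
                    if (channel.read(chunk) < 0) { eof = true; break; }
                }
                chunk.flip();
                out.append(StandardCharsets.US_ASCII.decode(chunk));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("metric_a 1\n", "metric_b 2\n", "metric_c 3\n");
        System.out.print(drain(new SliceChannel(lines, true), 16));
        try {
            drain(new SliceChannel(lines, false), 16);
        } catch (BufferOverflowException e) {
            System.out.println("unchecked variant overflowed on the second read()");
        }
    }
}
```

With a 16-byte chunk and 11-byte lines, the unchecked variant fits the first line, is then called again with only 5 bytes of space left, and overflows. The checked variant splits the line across chunks instead.
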

eperott commented 4 years ago

I'm working on a fix for this.

kopetsch commented 1 year ago

Your exporter had been working for two weeks without any issues. As of today, the BufferOverflowException occurs constantly. I am sorry, but I will have to install another exporter tomorrow. Best regards

sonman commented 1 year ago

One cluster works and another cluster with the same configuration does not. The exporter does not seem to be reliable. Cassandra 4.1.2, exporter 0.9.12.

kopetsch commented 1 year ago

Same here. In the meantime I am testing alternatives.

sonman commented 1 year ago

After new data (new tables) was added to one of the clusters that had a working exporter, the exporter on that cluster now also fails with the message described above. So it seems that the error is triggered only when there is a lot of data (or many tables?) on the cluster.

sonman commented 1 year ago

The release from edgelaboratories has fixed the issue for me (AFAIK because they just merged https://github.com/instaclustr/cassandra-exporter/pull/84).

lunarfs commented 3 months ago

Hi, what would it take to get this merged? We have successfully been running https://github.com/edgelaboratories/cassandra-exporter, which is essentially 0.9.12 with this change. I really want to upgrade to 0.9.14 (or later), and from where I sit the best way is to get off the fork https://github.com/edgelaboratories/cassandra-exporter and back onto instaclustr, but this NIO issue prevents me from doing that. I could also maintain a fork myself, but I really think that getting the instaclustr version to stop failing on the NIO issue is the best approach. The plan is to test this on Cassandra 4.1.6. Please let me know what you think.