confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Error scaling up KSQLDB to 2 instances in docker compose #9153

Open dberardo-com opened 2 years ago

dberardo-com commented 2 years ago

After scaling up my ksqlDB Docker Compose service to 2 replicas, I get this error on both instances (containers):

WARNING: Thread Thread[vert.x-eventloop-thread-4,5,main]=Thread[vert.x-eventloop-thread-4,5,main] has been blocked for 83657 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
    at java.base@11.0.13/jdk.internal.misc.Unsafe.park(Native Method)
    at java.base@11.0.13/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234)
    at java.base@11.0.13/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2123)
    at java.base@11.0.13/java.util.concurrent.ArrayBlockingQueue.offer(ArrayBlockingQueue.java:393)
    at app//io.confluent.ksql.query.PullQueryQueue.doAcceptRow(PullQueryQueue.java:256)
    at app//io.confluent.ksql.query.PullQueryQueue.acceptRow(PullQueryQueue.java:239)
    at app//io.confluent.ksql.query.PullQueryQueue.acceptRows(PullQueryQueue.java:196)
    at app//io.confluent.ksql.execution.pull.HARouting.lambda$streamedRowsHandler$15(HARouting.java:465)
    at app//io.confluent.ksql.execution.pull.HARouting$$Lambda$2545/0x0000000800dea840.accept(Unknown Source)
    at app//io.confluent.ksql.rest.client.KsqlTarget.lambda$null$11(KsqlTarget.java:328)
    at app//io.confluent.ksql.rest.client.KsqlTarget$$Lambda$2551/0x0000000800e37840.handle(Unknown Source)
    at app//io.vertx.core.parsetools.impl.RecordParserImpl.handleParsing(RecordParserImpl.java:214)
    at app//io.vertx.core.parsetools.impl.RecordParserImpl.handle(RecordParserImpl.java:285)
    at app//io.vertx.core.parsetools.impl.RecordParserImpl.handle(RecordParserImpl.java:27)
    at app//io.vertx.core.http.impl.HttpClientResponseImpl.handleChunk(HttpClientResponseImpl.java:232)
    at app//io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.lambda$beginResponse$0(Http1xClientConnection.java:486)
    at app//io.vertx.core.http.impl.Http1xClientConnection$StreamImpl$$Lambda$1580/0x0000000800bf8c40.handle(Unknown Source)
    at app//io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:237)
    at app//io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:127)
    at app//io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.handleChunk(Http1xClientConnection.java:322)
    at app//io.vertx.core.http.impl.Http1xClientConnection$StreamImpl.access$1900(Http1xClientConnection.java:242)
    at app//io.vertx.core.http.impl.Http1xClientConnection.handleResponseChunk(Http1xClientConnection.java:631)
    at app//io.vertx.core.http.impl.Http1xClientConnection.handleHttpMessage(Http1xClientConnection.java:601)
    at app//io.vertx.core.http.impl.Http1xClientConnection.handleMessage(Http1xClientConnection.java:577)
    at app//io.vertx.core.net.impl.VertxHandler$$Lambda$1554/0x0000000800bd0440.handle(Unknown Source)
    at app//io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366)
    at app//io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43)
    at app//io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:229)
    at app//io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:164)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at app//io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at app//io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at app//io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:311)
    at app//io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:432)
    at app//io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276)
    at app//io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at app//io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)
    at app//io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)
    at app//io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)
    at app//io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at app//io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
    at app//io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
    at app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
    at app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
    at app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base@11.0.13/java.lang.Thread.run(Thread.java:829)
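
For readers of the trace above: the top frames show the Vert.x event-loop thread parked inside a timed `ArrayBlockingQueue.offer` call reached from `PullQueryQueue.doAcceptRow`, which is what a thread does when it offers to a bounded queue that is already full. The sketch below is only an illustration of that JDK behaviour, not ksqlDB's code; the class name `BlockedOfferSketch`, the capacity of 1, and the 200 ms timeout are all made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BlockedOfferSketch {

    /**
     * Fills a bounded queue, then performs a timed offer() on it.
     * Returns roughly how long the calling thread was blocked, in milliseconds.
     */
    static long timeFullOffer(long timeoutMs) throws InterruptedException {
        // Hypothetical capacity of 1 so the queue is full after a single row.
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.offer("row-1");

        long start = System.nanoTime();
        // Nobody drains the queue, so this call parks the current thread
        // until the timeout expires and then returns false.
        boolean accepted = queue.offer("row-2", timeoutMs, TimeUnit.MILLISECONDS);
        long waitedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);

        if (accepted) {
            throw new IllegalStateException("queue unexpectedly had free space");
        }
        return waitedMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long waitedMs = timeFullOffer(200);
        System.out.println("offer() blocked the calling thread for about " + waitedMs + " ms");
    }
}
```

If the calling thread is an event loop, as in the trace, nothing else scheduled on that thread can run while it is parked, which is the condition the Vert.x blocked-thread warning reports.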

At the same time, pull queries fail with this error:

io.confluent.ksql.api.server.KsqlApiException: Error starting pull query: Unable to execute pull query. [Partition 0 failed to find valid host. Hosts scanned: [8ad7275e5dfb:8088 was not selected because Host is not alive as of time 1653981493630, d287013ba0f8:8088 was not selected because Host is not the active host for this partition.], Partition 1 failed to find valid host. Hosts scanned: [8ad7275e5dfb:8088 was not selected because Host is not alive as of time 1653981493630]]

agavra commented 2 years ago

@dberardo-com Can you give us more information on your setup and configuration?