Closed clsaa closed 3 years ago
When you restart some nodes it is possible that:
Consequently, your system must be prepared for message loss if using send (fire and forget). If you use request (request/reply), you may implement a retry strategy when a timeout ReplyException is received.
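A retry strategy of the kind described above could look roughly like the sketch below. It is not taken from Vert.x: `RetryingRequest`, `withRetry`, and the `retryable` predicate are hypothetical names, and the request is modelled as a plain `CompletableFuture` supplier so the example runs with the JDK alone. It retries immediately and has no backoff, which is a simplification.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper, not part of Vert.x: re-issues a request while it
// fails with an error that the caller considers retryable.
final class RetryingRequest {

    private RetryingRequest() {
    }

    // request:     starts one attempt and returns its future
    //              (assumption: in a Vert.x app this would wrap an event bus request)
    // retryable:   decides whether a failure is worth another attempt
    //              (assumption: in a Vert.x app this would detect a timeout ReplyException)
    // maxAttempts: upper bound on attempts; at least one attempt is always made
    static <T> CompletableFuture<T> withRetry(Supplier<CompletableFuture<T>> request,
                                              Predicate<Throwable> retryable,
                                              int maxAttempts) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(request, retryable, maxAttempts, 1, result);
        return result;
    }

    private static <T> void attempt(Supplier<CompletableFuture<T>> request,
                                    Predicate<Throwable> retryable,
                                    int maxAttempts,
                                    int attemptNo,
                                    CompletableFuture<T> result) {
        request.get().whenComplete((value, error) -> {
            if (error == null) {
                // Success: hand the reply to the caller.
                result.complete(value);
                return;
            }
            Throwable cause = unwrap(error);
            if (attemptNo < maxAttempts && retryable.test(cause)) {
                // Retryable failure with attempts left: try again immediately.
                attempt(request, retryable, maxAttempts, attemptNo + 1, result);
            } else {
                // Non-retryable failure, or attempts exhausted: report the last error.
                result.completeExceptionally(cause);
            }
        });
    }

    // Dependent stages may wrap the original failure in a CompletionException.
    private static Throwable unwrap(Throwable error) {
        if (error instanceof CompletionException && error.getCause() != null) {
            return error.getCause();
        }
        return error;
    }
}
```

In a Vert.x 4 application the supplier would typically wrap the event bus request, and the predicate would check for a `ReplyException` whose failure type is a timeout. A delay between attempts (for example via a Vert.x timer) is probably worthwhile, since a restarted node needs time to rejoin the cluster. Because a timed-out request may still have been delivered, the consumer should tolerate duplicate messages.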
Hi, I use the Vert.x EventBus for point-to-point communication between cluster nodes, but the following exceptions are sometimes reported when the nodes communicate. The problem is particularly evident after the cluster is restarted, when it is common for several nodes in the cluster to fail to connect.
MyConfig:
EventBusOptions eventBusOptions = new EventBusOptions()
    .setHost(NetUtils.getInstanceIpWithCache());
VertxOptions options = new VertxOptions()
    //cluster
    .setClusterManager(clusterManager)
    //event bus
    .setEventBusOptions(eventBusOptions)
    //poolSize
    .setEventLoopPoolSize(vertxConfig.getEventLoopPoolSize())
    .setWorkerPoolSize(vertxConfig.getWorkerPoolSize())
    .setInternalBlockingPoolSize(vertxConfig.getInternalBlockingPoolSize())
    //time
    .setWarningExceptionTime(vertxConfig.getWarningExceptionTimeInMillis())
    .setWarningExceptionTimeUnit(TimeUnit.MILLISECONDS)
    .setBlockedThreadCheckInterval(vertxConfig.getBlockingIntervalInMillis())
    .setBlockedThreadCheckIntervalUnit(TimeUnit.MILLISECONDS)
    .setMaxEventLoopExecuteTime(vertxConfig.getMaxEventLoopExecuteTimeInMillis())
    .setMaxEventLoopExecuteTimeUnit(TimeUnit.MILLISECONDS)
    .setMaxWorkerExecuteTime(vertxConfig.getMaxWorkerExecuteTime())
    .setMaxWorkerExecuteTimeUnit(TimeUnit.MILLISECONDS);
[vert.x-eventloop-thread-0] WARN s.i.v.c.eventbus.impl.clustered.ConnectionHolder - Connecting to server e28f77cf-4d6c-4847-b819-0a15559b32da failed
io.netty.channel.AbstractChannel$AnnotatedConnectException: refuse connection: /33.5.70.81:37679
Caused by: java.net.ConnectException: refuse connection
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
    at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
    at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:852)
[vert.x-eventloop-thread-2] ERROR c.a.a.t.g.p.e.c.vertx.TaskStatusVertxEventSender - [GEI]-fail--TaskStatusVertxEventSender#recive#failed, taskId:GEI@test-module-sharding-cache-file-in-oss-export-code@210413200020@7228, sliceNo:1, cost:60062ms, traceId:null, params:[topic:GEI@ascp-tools@TASK_STATUS@33.5.70.81, eventClass:SliceStartEvent]
shaded.io.vertx.core.eventbus.ReplyException: Timed out after waiting 60000(ms) for a reply. address: __vertx.reply.0a3b7ace-02f2-4e0e-bff0-103655eb2272, repliedAddress: GEI@ascp-tools@TASK_STATUS@33.5.70.81
    at shaded.io.vertx.core.eventbus.impl.ReplyHandler.lambda$new$0(ReplyHandler.java:42)
    at shaded.io.vertx.core.impl.VertxImpl$InternalTimerHandler.handle(VertxImpl.java:951)
    at shaded.io.vertx.core.impl.VertxImpl$InternalTimerHandler.handle(VertxImpl.java:918)
    at shaded.io.vertx.core.impl.EventLoopContext.emit(EventLoopContext.java:52)
    at shaded.io.vertx.core.impl.ContextImpl.emit(ContextImpl.java:294)
    at shaded.io.vertx.core.impl.EventLoopContext.emit(EventLoopContext.java:24)
    at shaded.io.vertx.core.impl.AbstractContext.emit(AbstractContext.java:49)
    at shaded.io.vertx.core.impl.EventLoopContext.emit(EventLoopContext.java:24)
    at shaded.io.vertx.core.impl.VertxImpl$InternalTimerHandler.run(VertxImpl.java:941)
    at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
    at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:500)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:852)