alibaba / nacos

an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications.
https://nacos.io
Apache License 2.0
30.29k stars 12.84k forks source link

服务注册成功,几分钟后会断开,没办法排查错误 #11831

Closed spongefather closed 7 months ago

spongefather commented 8 months ago

使用的 2.1.0 版本 nacos ,springcloud 使用的也是同样版本。docker compose 测试环境基本正常,开发机器测试向 docker compose 中的nacos 注册几分钟后就自动断开。

18:14:54.333 [main] INFO c.a.n.client.naming - [subscribe,141] - [SUBSCRIBE-SERVICE] service:iot-construction, group:DEFAULT_GROUP, clusters:DEFAULT 18:14:54.364 [main] INFO c.a.n.client.naming - [isChangedServiceInfo,181] - init new ips(1) service: DEFAULT_GROUP@@iot-construction@@DEFAULT -> [{"ip":"192.168.0.28","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000}] 18:14:54.377 [main] INFO c.a.n.client.naming - [processServiceInfo,166] - current ips:(1) service: DEFAULT_GROUP@@iot-construction@@DEFAULT -> [{"ip":"192.168.0.28","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000}] 18:14:54.384 [main] INFO o.a.c.h.Http11NioProtocol - [log,173] - Starting ProtocolHandler ["http-nio-9522"] 18:14:54.414 [main] INFO c.a.n.client.naming - [registerService,111] - [REGISTER-SERVICE] public registering service iot-construction with instance Instance{instanceId='null', ip='192.168.0.193', port=9522, weight=1.0, healthy=true, enabled=true, ephemeral=true, clusterName='DEFAULT', serviceName='null', metadata={preserved.register.source=SPRING_CLOUD, source=2024.03.13}} 18:14:54.427 [main] INFO c.a.c.n.r.NacosServiceRegistry - [register,75] - nacos registry, DEFAULT_GROUP iot-construction 192.168.0.193:9522 register finished 18:14:54.913 [nacos-grpc-client-executor-192.168.0.28-6] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Receive server push request, request = NotifySubscriberRequest, requestId = 113 18:14:54.913 [nacos-grpc-client-executor-192.168.0.28-6] INFO c.a.n.client.naming - [isChangedServiceInfo,181] - init new ips(2) service: DEFAULT_GROUP@@iot-construction -> [{"ip":"192.168.0.193","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD","source":"2024.03.13"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000},{"ip":"192.168.0.28","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000}] 18:14:54.914 [nacos-grpc-client-executor-192.168.0.28-6] INFO c.a.n.client.naming - [processServiceInfo,166] - current ips:(2) service: DEFAULT_GROUP@@iot-construction -> [{"ip":"192.168.0.193","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD","source":"2024.03.13"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000},{"ip":"192.168.0.28","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000}] 18:14:54.915 [nacos-grpc-client-executor-192.168.0.28-6] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Ack server push request, request = NotifySubscriberRequest, requestId = 113 18:14:55.357 [com.alibaba.nacos.client.naming.updater.0] INFO c.a.n.client.naming - [isChangedServiceInfo,234] - new ips(1) service: DEFAULT_GROUP@@iot-construction@@DEFAULT -> [{"ip":"192.168.0.193","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD","source":"2024.03.13"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000}] 18:14:55.358 [com.alibaba.nacos.client.naming.updater.0] INFO c.a.n.client.naming - [processServiceInfo,166] - current ips:(2) service: DEFAULT_GROUP@@iot-construction@@DEFAULT -> [{"ip":"192.168.0.193","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRING_CLOUD","source":"2024.03.13"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000},{"ip":"192.168.0.28","port":9522,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@iot-construction","metadata":{"preserved.register.source":"SPRINGCLOUD"},"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000,"ipDeleteTimeout":30000}] 18:14:56.283 [main] INFO o.q.c.QuartzScheduler - [start,547] - Scheduler quartzScheduler$_NON_CLUSTERED started. 18:14:56.297 [main] INFO c.r.c.ReseaConstructionApplication - [logStarted,61] - Started ReseaConstructionApplication in 29.928 seconds (JVM running for 30.693) 18:14:56.307 [main] INFO c.a.n.c.c.i.CacheData - [,65] - nacos.cache.data.init.snapshot = true 18:14:56.309 [main] INFO c.a.n.c.c.i.ClientWorker - [addCacheDataIfAbsent,373] - [fixed-192.168.0.28_8848] [subscribe] iot-construction-dev.yml+DEFAULT_GROUP 18:14:56.311 [main] INFO c.a.n.c.c.i.CacheData - [addListener,180] - [fixed-192.168.0.28_8848] [add-listener] ok, tenant=, dataId=iot-construction-dev.yml, group=DEFAULT_GROUP, cnt=1 18:14:56.312 [main] INFO c.a.n.c.c.i.ClientWorker - [addCacheDataIfAbsent,373] - [fixed-192.168.0.28_8848] [subscribe] iot-construction+DEFAULT_GROUP 18:14:56.312 [main] INFO c.a.n.c.c.i.CacheData - [addListener,180] - [fixed-192.168.0.28_8848] [add-listener] ok, tenant=, dataId=iot-construction, group=DEFAULT_GROUP, cnt=1 18:14:56.312 [main] INFO c.a.n.c.c.i.ClientWorker - [addCacheDataIfAbsent,373] - [fixed-192.168.0.28_8848] [subscribe] iot-construction.yml+DEFAULT_GROUP 18:14:56.313 [main] INFO c.a.n.c.c.i.CacheData - [addListener,180] - [fixed-192.168.0.28_8848] [add-listener] ok, tenant=, dataId=iot-construction.yml, group=DEFAULT_GROUP, cnt=1

服务启动,运行一段时间出现:

18:15:13.437 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Server healthy check fail, currentConnection = 1710411290561_192.168.0.193_53026 18:15:13.438 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Try to reconnect to a new server, server is not appointed, will choose a random server. 18:15:13.632 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Success to connect a server [192.168.0.28:8848], connectionId = 1710411313678_192.168.0.193_53049 18:15:13.633 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Abandon prev connection, server is 192.168.0.28:8848, connectionId is 1710411290561_192.168.0.193_53026 18:15:13.634 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [closeConnection,592] - Close current connection 1710411290561_192.168.0.193_53026 18:15:13.639 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Notify disconnected event to listeners 18:15:13.639 [com.alibaba.nacos.client.remote.worker] WARN c.a.n.client.naming - [onDisConnect,82] - Grpc connection disconnect, mark to redo 18:15:13.640 [com.alibaba.nacos.client.remote.worker] WARN c.a.n.client.naming - [onDisConnect,89] - mark to redo completed 18:15:13.641 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Notify connected event to listeners. 18:15:13.642 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.client.naming - [onConnected,76] - Grpc connection connect 18:15:14.449 [com.alibaba.nacos.client.naming.grpc.redo.0] INFO c.a.n.client.naming - [redoForInstance,72] - Redo instance operation REGISTER for DEFAULT_GROUP@@iot-construction 18:15:14.558 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Server healthy check fail, currentConnection = 1710411275393_192.168.0.193_53010 18:15:14.558 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Try to reconnect to a new server, server is not appointed, will choose a random server. 18:15:15.883 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Success to connect a server [192.168.0.28:8848], connectionId = 1710411314863_192.168.0.193_53151 18:15:15.883 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Abandon prev connection, server is 192.168.0.28:8848, connectionId is 1710411275393_192.168.0.193_53010 18:15:15.883 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [closeConnection,592] - Close current connection 1710411275393_192.168.0.193_53010 18:15:15.884 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Notify disconnected event to listeners 18:15:15.884 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.c.i.ClientWorker - [onDisConnect,637] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] DisConnected,clear listen context... 18:15:15.884 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Notify connected event to listeners. 18:15:15.885 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.c.i.ClientWorker - [onConnected,630] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Connected,notify listen context... 18:15:17.463 [com.alibaba.nacos.client.naming.grpc.redo.0] ERROR c.a.n.c.r.client - [printIfErrorEnabled,99] - Send request fail, request = InstanceRequest{headers={}, requestId='null'}, retryTimes = 0, errorMessage = java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 10 milliseconds, 372800 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@44d46ba3[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] 18:15:17.466 [com.alibaba.nacos.client.naming.grpc.redo.0] ERROR c.a.n.client.naming - [redoForInstances,62] - Redo instance operation REGISTER for DEFAULT_GROUP@@iot-construction failed. com.alibaba.nacos.api.exception.NacosException: Request nacos server failed: at com.alibaba.nacos.client.naming.remote.gprc.NamingGrpcClientProxy.requestToServer(NamingGrpcClientProxy.java:288) at com.alibaba.nacos.client.naming.remote.gprc.NamingGrpcClientProxy.doRegisterService(NamingGrpcClientProxy.java:128) at com.alibaba.nacos.client.naming.remote.gprc.redo.RedoScheduledTask.redoForInstance(RedoScheduledTask.java:78) at com.alibaba.nacos.client.naming.remote.gprc.redo.RedoScheduledTask.redoForInstances(RedoScheduledTask.java:60) at com.alibaba.nacos.client.naming.remote.gprc.redo.RedoScheduledTask.run(RedoScheduledTask.java:50) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: com.alibaba.nacos.api.exception.NacosException: java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 10 milliseconds, 372800 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@44d46ba3[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] at com.alibaba.nacos.common.remote.client.grpc.GrpcConnection.request(GrpcConnection.java:78) at com.alibaba.nacos.common.remote.client.RpcClient.request(RpcClient.java:653) at com.alibaba.nacos.common.remote.client.RpcClient.request(RpcClient.java:631) at com.alibaba.nacos.client.naming.remote.gprc.NamingGrpcClientProxy.requestToServer(NamingGrpcClientProxy.java:278) ... 11 common frames omitted Caused by: java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 10 milliseconds, 372800 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@44d46ba3[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] at com.alibaba.nacos.shaded.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:508) at com.alibaba.nacos.common.remote.client.grpc.GrpcConnection.request(GrpcConnection.java:76) ... 14 common frames omitted 18:15:17.467 [com.alibaba.nacos.client.naming.grpc.redo.0] INFO c.a.n.client.naming - [redoForSubscribe,110] - Redo subscriber operation REGISTER for DEFAULT_GROUP@@iot-construction#DEFAULT 18:15:18.898 [com.alibaba.nacos.client.Worker] ERROR c.a.n.c.r.client - [printIfErrorEnabled,99] - Send request fail, request = ConfigBatchListenRequest{headers={}, requestId='null'}, retryTimes = 0, errorMessage = java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 12 milliseconds, 266500 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@4ccfa512[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] 18:15:18.900 [com.alibaba.nacos.client.Worker] ERROR c.a.n.c.c.i.ClientWorker - [executeConfigListen,822] - Async listen config change error com.alibaba.nacos.api.exception.NacosException: java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 12 milliseconds, 266500 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@4ccfa512[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] at com.alibaba.nacos.common.remote.client.grpc.GrpcConnection.request(GrpcConnection.java:78) at com.alibaba.nacos.common.remote.client.RpcClient.request(RpcClient.java:653) at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.requestProxy(ClientWorker.java:1005) at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.requestProxy(ClientWorker.java:986) at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.executeConfigListen(ClientWorker.java:777)
at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.lambda$startInternal$2(ClientWorker.java:695)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 12 milliseconds, 266500 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@4ccfa512[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] at com.alibaba.nacos.shaded.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:508) at com.alibaba.nacos.common.remote.client.grpc.GrpcConnection.request(GrpcConnection.java:76) ... 12 common frames omitted 18:15:20.475 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Try to reconnect to a new server, server is not appointed, will choose a random server. 18:15:20.475 [com.alibaba.nacos.client.naming.grpc.redo.0] INFO c.a.n.client.naming - [redoForInstance,72] - Redo instance operation REGISTER for DEFAULT_GROUP@@iot-construction 18:15:20.476 [com.alibaba.nacos.client.naming.grpc.redo.0] INFO c.a.n.client.naming - [redoForSubscribe,110] - Redo subscriber operation REGISTER for DEFAULT_GROUP@@iot-construction#DEFAULT 18:15:21.617 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Success to connect a server [192.168.0.28:8848], connectionId = 1710411320664_192.168.0.193_53155 18:15:21.618 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Abandon prev connection, server is 192.168.0.28:8848, connectionId is 1710411313678_192.168.0.193_53049 18:15:21.619 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [closeConnection,592] - Close current connection 1710411313678_192.168.0.193_53049 18:15:21.619 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Notify disconnected event to listeners 18:15:21.620 [com.alibaba.nacos.client.remote.worker] WARN c.a.n.client.naming - [onDisConnect,82] - Grpc connection disconnect, mark to redo 18:15:21.620 [com.alibaba.nacos.client.remote.worker] WARN c.a.n.client.naming - [onDisConnect,89] - mark to redo completed 18:15:21.620 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [d15118e5-05d3-4db7-9f3d-fb5355feca94] Notify connected event to listeners. 18:15:21.620 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.client.naming - [onConnected,76] - Grpc connection connect 18:15:21.906 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Try to reconnect to a new server, server is not appointed, will choose a random server. 18:15:23.044 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Success to connect a server [192.168.0.28:8848], connectionId = 1710411322098_192.168.0.193_53157 18:15:23.045 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Abandon prev connection, server is 192.168.0.28:8848, connectionId is 1710411314863_192.168.0.193_53151 18:15:23.045 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [closeConnection,592] - Close current connection 1710411314863_192.168.0.193_53151 18:15:23.046 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Notify disconnected event to listeners 18:15:23.046 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.c.i.ClientWorker - [onDisConnect,637] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] DisConnected,clear listen context... 18:15:23.046 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.r.client - [printIfInfoEnabled,60] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Notify connected event to listeners. 18:15:23.046 [com.alibaba.nacos.client.remote.worker] INFO c.a.n.c.c.i.ClientWorker - [onConnected,630] - [a290c981-2290-403b-a05f-e2d0ed6683cf_config-0] Connected,notify listen context... 18:15:23.478 [com.alibaba.nacos.client.naming.grpc.redo.0] INFO c.a.n.client.naming - [redoForInstance,72] - Redo instance operation REGISTER for DEFAULT_GROUP@@iot-construction 18:15:26.063 [com.alibaba.nacos.client.Worker] ERROR c.a.n.c.r.client - [printIfErrorEnabled,99] - Send request fail, request = ConfigBatchListenRequest{headers={}, requestId='null'}, retryTimes = 0, errorMessage = java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 13 milliseconds, 993200 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@71205248[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] 18:15:26.063 [com.alibaba.nacos.client.Worker] ERROR c.a.n.c.c.i.ClientWorker - [executeConfigListen,822] - Async listen config change error com.alibaba.nacos.api.exception.NacosException: java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 13 milliseconds, 993200 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@71205248[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] at com.alibaba.nacos.common.remote.client.grpc.GrpcConnection.request(GrpcConnection.java:78) at com.alibaba.nacos.common.remote.client.RpcClient.request(RpcClient.java:653) at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.requestProxy(ClientWorker.java:1005) at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.requestProxy(ClientWorker.java:986) at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.executeConfigListen(ClientWorker.java:777)
at com.alibaba.nacos.client.config.impl.ClientWorker$ConfigRpcTransportClient.lambda$startInternal$2(ClientWorker.java:695)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 13 milliseconds, 993200 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@71205248[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]] at com.alibaba.nacos.shaded.com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:508) at com.alibaba.nacos.common.remote.client.grpc.GrpcConnection.request(GrpcConnection.java:76) ... 12 common frames omitted 18:15:26.493 [com.alibaba.nacos.client.naming.grpc.redo.0] ERROR c.a.n.c.r.client - [printIfErrorEnabled,99] - Send request fail, request = InstanceRequest{headers={}, requestId='null'}, retryTimes = 0, errorMessage = java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 12 milliseconds, 681300 nanoseconds delay) for com.alibaba.nacos.shaded.io.grpc.stub.ClientCalls$GrpcFuture@3350daea[status=PENDING, info=[GrpcFuture{clientCall={delegate={delegate=ClientCallImpl{method=MethodDescriptor{fullMethodName=Request/request, type=UNARY, idempotent=false, safe=false, sampledToLocalTracing=true, requestMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@6a07cbba, responseMarshaller=com.alibaba.nacos.shaded.io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller@51e36752, schemaDescriptor=com.alibaba.nacos.api.grpc.auto.RequestGrpc$RequestMethodDescriptorSupplier@41235a71}}}}}]]

没有相关信息可以进行错误排查。wireshark 中 grpc 正常通信

KomachiSion commented 8 months ago

日志显示是超时了,超时无非3种情况:

  1. 网络问题,包发不出去或包接收不到,需要排查网络环境或者网络设置(比如MTU之类的)
  2. 客户端压力大,收到返回的数据包,但是没有资源进行处理,Future认为超时。
  3. 服务端压力大,处理慢导致客户端超时没收到包。
spongefather commented 7 months ago

可能与网络有关,但是重启会有几分钟的正常时间,也可能重连机制有问题

besthanliu commented 3 months ago

@KomachiSion 2. 客户端压力大,收到返回的数据包,但是没有资源进行处理,Future认为超时。 关于这一点是否有配置可以调大这个超时时间 java.util.concurrent.TimeoutException: Waited 3000 milliseconds (plus 181563 nanoseconds delay)

是否可以配置Future的线程数量