alibaba / nacos

an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications.
https://nacos.io
Apache License 2.0
30.32k stars 12.85k forks

Nacos 1.4.2: services cannot go offline automatically #7226

Closed kuaile-zc closed 2 years ago

kuaile-zc commented 3 years ago

Describe the bug

Expected behavior: When Nacos is deployed in a K8s environment, services cannot go offline automatically. The console shows the behavior in the screenshot below.

[Screenshot]

An instance is detected as unhealthy, becomes healthy again after a while, and the cycle repeats.

We monitored the requests and found that no heartbeat requests were being sent to the Nacos server.

We checked the logs and found no errors.


kuaile-zc commented 3 years ago

Error log:

2021-11-12 16:43:32,405 INFO [AUTO-DELETE-IP] service: xxy-1110@@service-c, ip: {"instanceId":"10.0.0.75#18080#DEFAULT#xxy-1110@@service-c","ip":"10.0.0.75","port":18080,"weight":1.0,"healthy":false,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"xxy-1110@@service-c","metadata":{"preserved.register.source":"SPRING_CLOUD"},"lastBeat":1636705452710,"marked":false,"app":"unknown","instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000,"ipDeleteTimeout":30000}

2021-11-12 16:43:32,405 ERROR [IP-DEAD] failed to delete ip automatically, ip: {"instanceId":"10.0.0.75#18080#DEFAULT#xxy-1110@@service-c","ip":"10.0.0.75","port":18080,"weight":1.0,"healthy":false,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"xxy-1110@@service-c","metadata":{"preserved.register.source":"SPRING_CLOUD"},"lastBeat":1636705452710,"marked":false,"app":"unknown","instanceHeartBeatInterval":5000,"instanceHeartBeatTimeOut":15000,"ipDeleteTimeout":30000}, error: {}

java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
    at org.apache.http.util.Asserts.check(Asserts.java:46)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
    at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:75)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:108)
    at org.apache.http.impl.nio.client.CloseableHttpAsyncClient.execute(CloseableHttpAsyncClient.java:92)
    at com.alibaba.nacos.common.http.client.request.DefaultAsyncHttpClientRequest.execute(DefaultAsyncHttpClientRequest.java:52)
    at com.alibaba.nacos.common.http.client.NacosAsyncRestTemplate.execute(NacosAsyncRestTemplate.java:364)
    at com.alibaba.nacos.common.http.client.NacosAsyncRestTemplate.delete(NacosAsyncRestTemplate.java:105)
    at com.alibaba.nacos.naming.misc.HttpClient.asyncHttpRequest(HttpClient.java:195)
    at com.alibaba.nacos.naming.misc.HttpClient.asyncHttpDelete(HttpClient.java:159)
    at com.alibaba.nacos.naming.healthcheck.ClientBeatCheckTask.deleteIp(ClientBeatCheckTask.java:144)
    at com.alibaba.nacos.naming.healthcheck.ClientBeatCheckTask.run(ClientBeatCheckTask.java:122)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

2021-11-12 16:43:35,760 WARN [STATUS-SYNCHRONIZE] failed to request serviceStatus, remote server: csp-nacos-2.csp-nacos-nacos-hs.product-csp-csg.svc.cluster.local:8848
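For anyone trying to read the instance fields in the log above, here is a minimal sketch of the kind of timeout check that the field names `lastBeat`, `instanceHeartBeatTimeOut` and `ipDeleteTimeout` suggest. It is not taken from the Nacos source. The `Instance` class, `classify` and `log_time_to_ms` are hypothetical names, and treating the log timestamp as UTC+8 is an assumption, since the log does not state a time zone.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Instance:
    """Hypothetical holder for the three timing fields seen in the log."""
    last_beat_ms: int
    heartbeat_timeout_ms: int
    ip_delete_timeout_ms: int


def classify(instance: Instance, now_ms: int) -> str:
    """Compare the time since the last heartbeat against both timeouts."""
    elapsed = now_ms - instance.last_beat_ms
    if elapsed > instance.ip_delete_timeout_ms:
        return "delete"      # past the delete timeout
    if elapsed > instance.heartbeat_timeout_ms:
        return "unhealthy"   # past the heartbeat timeout only
    return "healthy"


def log_time_to_ms(stamp: str, utc_offset_hours: int) -> int:
    """Convert a 'YYYY-MM-DD HH:MM:SS,mmm' log timestamp to epoch milliseconds."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
    aware = parsed.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return round(aware.timestamp() * 1000)


# Values copied from the [AUTO-DELETE-IP] log line.
instance = Instance(
    last_beat_ms=1636705452710,
    heartbeat_timeout_ms=15000,
    ip_delete_timeout_ms=30000,
)

# ASSUMPTION: the server logs in UTC+8; adjust the offset for your cluster.
now_ms = log_time_to_ms("2021-11-12 16:43:32,405", utc_offset_hours=8)

print(classify(instance, now_ms))
```

Under the UTC+8 assumption, the last recorded heartbeat is far older than `ipDeleteTimeout`. That would be consistent with the server attempting the delete, which then fails with the `I/O reactor status: STOPPED` exception.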

kuaile-zc commented 3 years ago

Maybe the I/O resources are insufficient.

onewe commented 2 years ago

Perhaps it's a network problem too.

zglu001 commented 2 years ago

Has this problem been solved? How did you solve it?

kuaile-zc commented 2 years ago

The failure to create HTTP requests was caused by insufficient resources. We deployed Nacos as a StatefulSet on K8s and added probes that restart the pods when resources run low, but this introduced more problems! There are many problems with cloud native (K8s) deployment.

kuaile-zc commented 2 years ago

Quoting zglu001: "Has this problem been solved? How did you solve it?"

You should first make sure you have the same problem as we did!

stale[bot] commented 2 years ago

Thanks for your feedback and contribution. This issue/pull request has had no activity for more than 180 days and will be closed if no further activity occurs within 7 days. We may have solved this issue in a newer version, so could you upgrade to the latest version and retry? If the issue persists or you would like to contribute again, please create a new issue or pull request.

KANLON commented 2 years ago

I have the same problem. Has it been solved? Could someone from the official team respond?