Closed davis1214 closed 6 years ago
Hi @ujjboy , I am using the latest version (5.4.5), but when killing the server with kill -9, the client will still report the above error, it seems that this bug has not been fixed?
sorry, add the client's error:
ERROR 2018-09-10 14:17:13,895 http-nio-30001-exec-7 org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet]:181 - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is com.alipay.sofa.rpc.core.exception.SofaRouteException: RPC-02306: Cannot get the service address of service [com.abc.api.TestApi:1.0], please check the registry log. ] with root cause com.alipay.sofa.rpc.core.exception.SofaRouteException: RPC-02306: Cannot get the service address of service [com.abc.api.TestApi:1.0], please check the registry log. at com.alipay.sofa.rpc.client.AbstractCluster.noAvailableProviderException(AbstractCluster.java:418) ~[sofa-rpc-all-5.4.5.jar!/:5.4.5] at com.alipay.sofa.rpc.client.AbstractCluster.select(AbstractCluster.java:354) ~[sofa-rpc-all-5.4.5.jar!/:5.4.5] at com.alipay.sofa.rpc.client.FailoverCluster.doInvoke(FailoverCluster.java:64) ~[sofa-rpc-all-5.4.5.jar!/:5.4.5] at com.alipay.sofa.rpc.client.AbstractCluster.invoke(AbstractCluster.java:285) ~[sofa-rpc-all-5.4.5.jar!/:5.4.5] at com.alipay.sofa.rpc.client.ClientProxyInvoker.invoke(ClientProxyInvoker.java:83) ~[sofa-rpc-all-5.4.5.jar!/:5.4.5] at com.abc.api.TestApi_proxy_4.Test(TestApi_proxy_4.java) ~[?:?] at sun.reflect.GeneratedMethodAccessor149.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_71] at java.lang.reflect.Method.invoke(Method.java:497) ~[?:1.8.0_71] at com.alipay.sofa.runtime.service.binding.JvmBindingAdapter$JvmServiceInvoker.doInvoke(JvmBindingAdapter.java:183) ~[runtime-sofa-boot-starter-2.4.7.jar!/:2.4.7] at com.alipay.sofa.runtime.spi.service.ServiceProxy.invoke(ServiceProxy.java:39) ~[runtime-sofa-boot-starter-2.4.7.jar!/:2.4.7] at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) ~[spring-aop-4.3.4.RELEASE.jar!/:4.3.4.RELEASE] at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213) ~[spring-aop-4.3.4.RELEASE.jar!/:4.3.4.RELEASE]
could u help me with it? tnks a lot. @ujjboy @lifesinger @popomore @khotyn
@lxl0044 you mean use zookeeper as registry, and kill -9
server and restart it ?
@ujjboy thks for resp so fast. yes, is there a problem with this operation?
No, it's normal case. Can you provide your demo for reproduce this ? and more logs in ~/logs/rpc
.
@ujjboy Have u ever had this problem in u'r tests?
Sorry, there are some other things in my demo that are not very convenient to upload, but the logic is like this:
##### properties #####
com.alipay.sofa.rpc.registry-address=zookeeper://127.0.0.1:2181
##### Client.java ##### `public class Client { @SofaReference(binding = @SofaReferenceBinding(bindingType = "bolt", timeout = 20000)) private TestApi testApi;
public String demo() {
return testApi.test();
}
}`
##### pom.xml #####
`
<artifactId>test-api</artifactId>
<version>0.0.1-SNAPSHOT</version>
`
##### TestApi.java #####
public interface TestApi { String test(); }
##### TestApiImpl.java #####
@SofaService(bindings = {@SofaServiceBinding(bindingType = "bolt", timeout = 20000)}) public class TestApiImpl implements TestApi { @Override public String test() { return "hello"; } }
============================================== Is this information enough? If not, I will try to extract the demo and upload it. thks a lot. :-)
@lxl0044 I need more logs in ~/logs/rpc
, such as rpc-client.log
& rpc-registry.log
.
@ujjboy hi, i upload my demo, link: https://github.com/lxl0044/test u can do like this: 1, run service; 2, run client, and call controller; 3, kill -9 ${service.pid} && restart it 4, continue calling client's controller, about some times later , it will happen;
any questions call me any time, thks a lot. :-)
Hi @ujjboy @lifesinger @popomore @khotyn , Dear all, any progress? did it happen? Anything i could do? thks a lot. :-)
@lxl0044 I will follow this case. Can you upload the log file ~/rpc/rpc-registry.log
?
@lxl0044
I tried your code and nothing happened,
something like that: 1, run service; 2, run client, and call controller; 3, kill -9 ${service.pid} && restart it 4, continue calling client's controller, about some times later , it will happen;
I just change
com:
alipay:
sofa:
rpc:
registry:
address: zookeeper://10.1.13.199:2181/xiao
to
com:
alipay:
sofa:
rpc:
registry:
address: zookeeper://localhost:2181
After I review the code, maybe there is one possibility: the old provider(killed server) deleted event is after new provider (restarted server) added event.
@lxl0044 I need your log in ~/rpc/rpc-registry.log
to confirm it.
Hi @ujjboy ,
I push them to my test project's logs
, u can see it and any needs ask me anytime, thks very much.
Dear @liangyuanpeng ,
this is my detailed steps:
1, run service && run client;
2, open a new window, run a while true
script, just like while true; do curl 'http://127.0.0.1:10001/client/test?param=1'; echo ; sleep 1; done
3, this script will running normal and print 1
every sec;
4, kill -9 ${service.pid}
, and you can see the script will response error;
5, then rerun service;
6, you can see the script back to normal;
7, but not good for a long time, some times later , this script will response error again;
you can try like it, i'm not sure what's wrong, but really error, so sad..
@lxl0044 I read this log, and as I guess, the remove event is later than the add event. I will new issue and fix it.
2018-09-14 09:24:33,451 INFO Curator-PathChildrenCache-2 com.alipay.sofa.rpc.registry.zk.ZookeeperConfigObserver - Receive add provider: path=[/xiao/sofa-rpc/com.xiao.test.api.SofaApi/providers/bolt%3A%2F%2F10.200.101.197%3A12205%3Fversion%3D1.0%26uniqueId%3D%26timeout%3D20000%26delay%3D-1%26id%3Drpc-cfg-0%26dynamic%3Dtrue%26weight%3D100%26accepts%3D100000%26startTime%3D1536888272713%26appName%3Dtest-sofa-service%26serialization%3Dhessian2%26pid%3D32634%26language%3Djava%26rpcVer%3D50405], data=[], stat=[133394535,133394535,1536888273175,1536888273175,0,0,0,72765356844127291,1,0,133394535
]
2018-09-14 09:24:52,428 INFO Curator-PathChildrenCache-2 com.alipay.sofa.rpc.registry.zk.ZookeeperConfigObserver - Receive remove provider: path=[/xiao/sofa-rpc/com.xiao.test.api.SofaApi/providers/bolt%3A%2F%2F10.200.101.197%3A12205%3Fversion%3D1.0%26uniqueId%3D%26timeout%3D20000%26delay%3D-1%26id%3Drpc-cfg-0%26dynamic%3Dtrue%26weight%3D100%26accepts%3D100000%26startTime%3D1536888195935%26appName%3Dtest-sofa-service%26serialization%3Dhessian2%26pid%3D32318%26language%3Djava%26rpcVer%3D50405], data=[], stat=[133394000,133394000,1536888196390,1536888196390,0,0,0,72765356844127289,1,0,133394000
]
@ujjboy Good, come on! Any progress call me, pls. Thks very much!! :-)
Describe the bug
服务(provider)非优雅方式重启(kill -9 $pid & startup)时,客户端再次调用,出现如下报错
ERROR 2018-08-17 10:12:22,439 http-nio-30001-exec-10 org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/].[dispatcherServlet]:181 - Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is com.alipay.sofa.rpc.core.exception.SofaRouteException: RPC-02306: 没有获得服务[com.choice.balance.api.BalanceApi:1.0]的调用地址,请检查服务是否已经推送 ] with root cause com.alipay.sofa.rpc.core.exception.SofaRouteException: RPC-02306: 没有获得服务[com.lotoblock.balance.api.BalanceApi:1.0]的调用地址,请检查服务是否已经推送 at com.alipay.sofa.rpc.client.AbstractCluster.noAvailableProviderException(AbstractCluster.java:418) ~[sofa-rpc-all-5.4.3.jar!/:5.4.3]
Expected behavior
期望暴力重启情况下,服务依然可以被调用
Actual behavior
在一定时间内(20秒内),kill -9 后直接重启,出现服务不可调用
原因
1、服务端zookeeker上生成的节点为
CreateMode.EPHEMERAL
类型,session中断时,客户端的会监听到remove时间,并删除客户端的对失效服务的链接。 2、服务端内存中同时维护每个providerInfo的信息,且key是一个对象。客户端接收到新的链接(provider)时,从内存先查,如果没有才创建。 3、在客户端没有remove失效的provider前,根据配置信息,会先查询到服务关闭之前的链接。即会出现此现象处理方式
1、服务优雅重启 2、修改代码,ProviderInfo,或者reconnect检查时,删除com.alipay.sofa.rpc.transport.bolt.BoltClientConnectionManager.urlConnectionMap中无效链接,并重新初始化。
Environment
java -version
): 1.8uname -a
): Linux