apache / dubbo

The java implementation of Apache Dubbo. An RPC and microservice framework.
https://dubbo.apache.org/
Apache License 2.0

[resubmit] fail to gracefully shutdown java dubbo app #11951

Open szhengli opened 1 year ago

szhengli commented 1 year ago

dubbo version: 3.0.9

error:

2023-03-28 17:47:33.403 ERROR 4212 --- [o-8082-exec-303] o.a.c.c.C.[.[.[/].[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.apache.dubbo.rpc.RpcException: Failed to invoke the method sayHello in the service org.example.api.DemoService. Tried 1 times of the providers [192.168.2.205:20880] (1/1) from the registry mse-4aac8ab0-zk.mse.aliyuncs.com:2181 on the consumer 192.168.2.206 using the dubbo version 3.0.9. Last error is: Failed to invoke remote method: sayHello, provider: dubbo://192.168.2.205:20880/org.example.api.DemoService anyhost=true&application=provider&background=false&category=providers,configurators,routers&check=false&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&interface=org.example.api.DemoService&methods=sayHello&pid=4212&qos.enable=false&release=3.0.9&retries=0&service-name-mapping=true&side=provider&sticky=false&timeout=30000000, cause: org.apache.dubbo.remoting.RemotingException: Fail to decode request due to: java.io.IOException: Service org.example.api.DemoService with version 0.0.0 not found, invocation rejected.

dubbo.properties:

dubbo.service.shutdown.wait=600000
server.shutdown=graceful

background:

To shut down the jar app gracefully, I first run curl localhost:22222/offline && sleep 3, and then kill the PID. Meanwhile, new requests keep arriving, and the errors above occur, which causes some requests to fail.

Is there any way to avoid the error? Thanks a lot.

AlbumenJ commented 1 year ago

This is caused by the long push period of the registry. You can extend the waiting time.

szhengli commented 1 year ago

Does "the waiting time" refer to dubbo.service.shutdown.wait? If so, it is already set to 600000, that is, 10 minutes. Does the parameter need to be even larger? It seems that no matter how large the parameter is, the errors still happen. What does "the long push period" really mean? Can you be more specific about this? Thank you very much.

AlbumenJ commented 1 year ago

curl localhost:22222/offline && sleep 3

Change 3 to 60.

"the long push period"

The provider unregisters from the registry, and the registry then notifies the consumer. The push period is the time this takes.
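
To make the timing argument concrete, here is a minimal sketch of the race described above. It is not Dubbo code. It is a toy timeline model with hypothetical names (ShutdownRaceSketch, failedRequests) and made-up numbers. It assumes the consumer keeps routing to the old provider until the registry push arrives, and that the provider is gone once the sleep has elapsed. Under those assumptions a request fails only if it is sent after the kill but before the push. Other causes, such as the provider rejecting requests right after the offline command, are not modeled.

```java
public class ShutdownRaceSketch {

    /**
     * Counts requests that would fail in this simplified model.
     *
     * @param waitMs            time between the offline command and the kill (the "sleep")
     * @param pushDelayMs       assumed time until the consumer learns the provider is gone
     * @param requestIntervalMs gap between consecutive consumer requests (must be positive)
     * @param horizonMs         how long the simulation runs after the offline command
     */
    static int failedRequests(long waitMs, long pushDelayMs,
                              long requestIntervalMs, long horizonMs) {
        if (requestIntervalMs <= 0) {
            throw new IllegalArgumentException("requestIntervalMs must be positive");
        }
        int failed = 0;
        for (long t = 0; t < horizonMs; t += requestIntervalMs) {
            // Until the registry push arrives, the consumer still routes to the old provider.
            boolean consumerStillRoutes = t < pushDelayMs;
            // Once the wait has elapsed, the provider process has been killed.
            boolean providerGone = t >= waitMs;
            if (consumerStillRoutes && providerGone) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 10 s push delay, one request per second, 20 s window.
        System.out.println(failedRequests(3_000, 10_000, 1_000, 20_000));  // sleep 3
        System.out.println(failedRequests(60_000, 10_000, 1_000, 20_000)); // sleep 60
    }
}
```

In this model, failures disappear only when the wait is at least as long as the push delay. The real push delay depends on the registry and is not stated in this thread.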

szhengli commented 1 year ago

If I keep sending requests to the app and run the following, the above errors DO still happen:

curl localhost:22222/offline && sleep 60
kill $(ps -aux | awk '/jav[a]/ {print $2}')

I run a provider and a consumer on separate VMs, and I use JMeter to generate the HTTP requests.

I also strongly suspect that even after the app completes the offline command, it may still be able to receive incoming requests, and that is when the errors happen. Any help?
