apache / dubbo

The java implementation of Apache Dubbo. An RPC and microservice framework.
https://dubbo.apache.org/
Apache License 2.0
40.35k stars 26.4k forks source link

dubbo3.2+nacos2.1.2, consumer failed to get metadata -> throw RPCExpection #12155

Closed QAQdev closed 1 year ago

QAQdev commented 1 year ago
image image

provider可以在正常在nacos上显示注册信息,consumer启动后则一直获取不到metadata,希望得到解答!

songxiaosheng commented 1 year ago
image image

provider可以在正常在nacos上显示注册信息,consumer启动后则一直获取不到metadata,希望得到解答!

This is a timeout issue. Is there any network connectivity? You can try telnet IP port

QAQdev commented 1 year ago

Yes, I tried ping 10.251.1.1 and telnet 10.251.1.1 20880 and both of them show the network is connective. I think 10.251.1.1 should be the address that provider runs at?

songxiaosheng commented 1 year ago

yes, I tried ping 10.251.1.1 and telnet 10.251.1.1 20880 and both of them show the network is connective. I think 10.251.1.1 should be the address that provider runs at?

Yes, this metadata service is an internal service provided by the provider

QAQdev commented 1 year ago

So where may the issue lies? I tried many times and the consumer is always timeout. This is the detail of the console log:

2023-04-21 16:46:32.998  INFO 3810 --- [ternal.notifier] o.a.d.r.client.ServiceDiscoveryRegistry  :  [DUBBO] Trying to subscribe from apps dubbo-nacos-demo-provider for service key dubbo.nacos.demo.DemoService:1.0.0, , dubbo version: 3.2.0, current host: 10.251.1.1
2023-04-21 16:47:02.938  WARN 3810 --- [           main] o.a.d.r.proxy.InvokerInvocationHandler   :  [DUBBO] [Dubbo-Consumer] execute service dubbo-nacos-demo-provider/org.apache.dubbo.metadata.MetadataService:1.0.0:dubbo#getMetadataInfo cost 30043.098750 ms, this invocation almost (maybe already) timeout. Timeout: 30000ms
invocation context:
path=org.apache.dubbo.metadata.MetadataService;
version=1.0.0;
timeout=30000;
group=dubbo-nacos-demo-provider;
thread info: 
Start time: 3160144039375
+-[ Offset: 0.000000ms; Usage: 30043.098750ms, 100% ] Receive request. Client invoke begin. ServiceKey: dubbo-nacos-demo-provider/org.apache.dubbo.metadata.MetadataService:1.0.0 MethodName:getMetadataInfo, dubbo version: 3.2.0, current host: 10.251.1.1, error code: 3-4. This may be caused by , go to https://dubbo.apache.org/faq/3/4 to find instructions. 
2023-04-21 16:47:02.938  INFO 3810 --- [           main] o.a.d.r.transport.netty4.NettyChannel    :  [DUBBO] Close netty channel [id: 0xa8cdc6e2, L:/10.251.1.1:53568 - R:/10.251.1.1:20881], dubbo version: 3.2.0, current host: 10.251.1.1
2023-04-21 16:47:02.947 ERROR 3810 --- [           main] o.a.d.r.client.metadata.MetadataUtils    :  [DUBBO] Failed to get app metadata for revision f743357fb13b2e782837efbb8809eedc for type local from instance 10.251.1.1:20880, dubbo version: 3.2.0, current host: 10.251.1.1, error code: 1-39. This may be caused by , go to https://dubbo.apache.org/faq/1/39 to find instructions. 

Then consumer throws RPCExecption indicating that invocation is timeout

songxiaosheng commented 1 year ago

So where may the issue lies? I tried many times and the consumer is always timeout. This is the detail of the console log:

2023-04-21 16:46:32.998  INFO 3810 --- [ternal.notifier] o.a.d.r.client.ServiceDiscoveryRegistry  :  [DUBBO] Trying to subscribe from apps dubbo-nacos-demo-provider for service key dubbo.nacos.demo.DemoService:1.0.0, , dubbo version: 3.2.0, current host: 10.251.1.1
2023-04-21 16:47:02.938  WARN 3810 --- [           main] o.a.d.r.proxy.InvokerInvocationHandler   :  [DUBBO] [Dubbo-Consumer] execute service dubbo-nacos-demo-provider/org.apache.dubbo.metadata.MetadataService:1.0.0:dubbo#getMetadataInfo cost 30043.098750 ms, this invocation almost (maybe already) timeout. Timeout: 30000ms
invocation context:
path=org.apache.dubbo.metadata.MetadataService;
version=1.0.0;
timeout=30000;
group=dubbo-nacos-demo-provider;
thread info: 
Start time: 3160144039375
+-[ Offset: 0.000000ms; Usage: 30043.098750ms, 100% ] Receive request. Client invoke begin. ServiceKey: dubbo-nacos-demo-provider/org.apache.dubbo.metadata.MetadataService:1.0.0 MethodName:getMetadataInfo, dubbo version: 3.2.0, current host: 10.251.1.1, error code: 3-4. This may be caused by , go to https://dubbo.apache.org/faq/3/4 to find instructions. 
2023-04-21 16:47:02.938  INFO 3810 --- [           main] o.a.d.r.transport.netty4.NettyChannel    :  [DUBBO] Close netty channel [id: 0xa8cdc6e2, L:/10.251.1.1:53568 - R:/10.251.1.1:20881], dubbo version: 3.2.0, current host: 10.251.1.1
2023-04-21 16:47:02.947 ERROR 3810 --- [           main] o.a.d.r.client.metadata.MetadataUtils    :  [DUBBO] Failed to get app metadata for revision f743357fb13b2e782837efbb8809eedc for type local from instance 10.251.1.1:20880, dubbo version: 3.2.0, current host: 10.251.1.1, error code: 1-39. This may be caused by , go to https://dubbo.apache.org/faq/1/39 to find instructions. 

Then consumer throws RPCExecption indicating that invocation is timeout

Confirm if the provider has received the request

QAQdev commented 1 year ago

I refer to https://cn.dubbo.apache.org/zh-cn/overview/mannual/java-sdk/reference-manual/metadata-center/nacos/ and configure metadata-type: remote, then the cosumer doesn't throw RPCExpection.

I make cosumer a RestController and test it through 127.0.0.1/hello

But I notice that there is no log (i.e. request info you mentioned) at provider side, while consumer side's log is:

invocation context:
path=dubbo.nacos.demo.DemoService;
remote.application=dubbo-nacos-demo-consumer;
interface=dubbo.nacos.demo.DemoService;
version=1.0.0;
timeout=3000;
thread info: 
Start time: 6058177882625
+-[ Offset: 0.000000ms; Usage: 9082.774666ms, 100% ] Receive request. Client invoke begin. ServiceKey: dubbo.nacos.demo.DemoService:1.0.0 MethodName:sayHello
  +-[ Offset: 5.083916ms; Usage: 3017.671667ms, 33% ] Invoker invoke. Target Address: 10.251.1.1:20880
  +-[ Offset: 3023.084250ms; Usage: 3030.060500ms, 33% ] Invoker invoke. Target Address: 10.251.1.1:20880
  +-[ Offset: 6053.512583ms; Usage: 3028.777667ms, 33% ] Invoker invoke. Target Address: 10.251.1.1:20880, dubbo version: 3.2.0, current host: 10.251.1.1, error code: 3-4. This may be caused by , go to https://dubbo.apache.org/faq/3/4 to find instructions. 
2023-04-21 17:35:41.147 ERROR 4704 --- [nio-8889-exec-1] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Request processing failed; nested exception is org.apache.dubbo.rpc.RpcException: Failed to invoke the method sayHello in the service dubbo.nacos.demo.DemoService. Tried 3 times of the providers [10.251.1.1:20880] (1/1) from the registry 172.17.0.13:8848 on the consumer 10.251.1.1 using the dubbo version 3.2.0. Last error is: Invoke remote method timeout. method: sayHello, provider: DefaultServiceInstance{serviceName='dubbo-nacos-demo-provider', host='10.251.1.1', port=20880, enabled=true, healthy=true, metadata={dubbo.endpoints=[{"port":20880,"protocol":"dubbo"}], dubbo.metadata.revision=9b326a1e762e969a10c006bea5ed6a02, dubbo.metadata.storage-type=remote, timestamp=1682069664033}}, service{name='dubbo.nacos.demo.DemoService',group='null',version='1.0.0',protocol='dubbo',port='20880',params={executor-management-mode=isolation, side=provider, file-cache=true, release=3.2.0, methods=sayHello, deprecated=false, dubbo=2.0.2, interface=dubbo.nacos.demo.DemoService, service-name-mapping=true, version=1.0.0, timeout=5000, generic=false, revision=1.0.0, metadata-type=remote, application=dubbo-nacos-demo-provider, prefer.serialization=fastjson2,hessian2, background=false, dynamic=true, anyhost=true},}, cause: org.apache.dubbo.remoting.TimeoutException: Waiting server-side response timeout by scan timer. start time: 2023-04-21 17:35:38.101, end time: 2023-04-21 17:35:41.128, client elapsed: 2 ms, server elapsed: 3025 ms, timeout: 3000 ms, request: Request [id=5, version=2.0.2, twoWay=true, event=false, broken=false, mPayload=0, data=null], channel: /10.251.1.1:56535 -> /10.251.1.1:20880] with root cause

Then consumer throws org.apache.dubbo.remoting.TimeoutException: Waiting server-side response timeout by scan timer It's weird, Then I checked the nacos:

image

The table only contains info of provider, is it normal?

songxiaosheng commented 1 year ago

The remote configuration is centralized and does not require RPC. You can manually access this RPC service on the provider's machine through the QoS tool to see if the provider's interface is working properly

QAQdev commented 1 year ago

The remote configuration is centralized and does not require RPC. You can manually access this RPC service on the provider's machine through the QoS tool to see if the provider's interface is working properly

Fine, but if I remove metadata-type: remote, consumer side will fail to get metadata, which returns to the very first state. can you tell me how to confirm if the provider has received the request? The current situation is:

  1. ping and telnet show the network is fine
  2. nacos console shows the provider is registered
  3. consumer fails to get metadata and therefore throws RPCExpection

Can you show me the possible solution more clearly? I have been configuring for nearly one day...

QAQdev commented 1 year ago

…Found a not so elegant way to solve this issue, 10.251.1.1 may be an address that generated by nacos which consumer cannot access (though ping and telnet show no problem). In IDEA, you can Edit Configurations -> Modify Options -> Add VM Options, then add -DDUBBO_IP_TO_REGISTRY=[YOUR_PC_IP] image

twodog-1024 commented 8 months ago

我也碰到了这个问题,我的consumer服务里依赖了好几个不同的provider服务,只用一个provider的service出现了Failed to get app metadata的错误。看你最后的解决办法,这个问题看起来像是provider端没有上报元数据或者上报元数据失败导致的?

QAQdev commented 8 months ago

我也碰到了这个问题,我的consumer服务里依赖了好几个不同的provider服务,只用一个provider的service出现了Failed to get app metadata的错误。看你最后的解决办法,这个问题看起来像是provider端没有上报元数据或者上报元数据失败导致的?

yes, i thought it may relate to the reason i pointed out