apache / brpc

brpc is an industrial-grade RPC framework written in C++, often used in high-performance systems such as search, storage, machine learning, advertisement, and recommendation. "brpc" means "better RPC".
https://brpc.apache.org
Apache License 2.0

[RDMA] Client fails to parse response after enabling RDMA #2814

Open SimonCqk opened 2 weeks ago

SimonCqk commented 2 weeks ago

Describe the bug

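The scenario can be simplified as follows: the client issues an RPC request, the server receives it, fetches the data, and returns it through the response attachment. Tracing the logs shows:

  1. fuse->rpc->worker completes
  2. worker receives the response -> take data -> append to attachment completes
  3. The fuse side hangs until it reports an error: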
E1031 07:32:53.252275 44 input_messenger.cpp:123] Fail to parse response from 33.51.173.89:19893 by baidu_std at client-side
W1031 07:32:53.252285 44 input_messenger.cpp:249] Close Socket{id=7 fd=1053 addr=33.51.173.89:19893:33384} (0x7f1d1765c780): absolutely wrong message
I1031 07:32:53.353038 43 socket.cpp:2606] Checking Socket{id=7 addr=33.51.173.89:19893} (0x7f1d1765c780)
I1031 07:32:53.353356 44 socket.cpp:2666] Revived Socket{id=7 addr=33.51.173.89:19893} (0x7f1d1765c780) (Connectable)

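Following https://github.com/apache/brpc/issues/2265, I added logging for messages in both the send and receive directions of brpc. The logs show mostly normal messages, plus empty data in the send direction in the process hosting the server: (screenshot)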
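When logging messages in both directions, it can help to check whether the first bytes the client receives look like a baidu_std frame at all. The sketch below assumes the baidu_std framing described in brpc's protocol documentation: a 12-byte header consisting of the magic "PRPC", a 32-bit big-endian body size, and a 32-bit big-endian meta size. The helper `InspectBaiduStdHeader`, the `HeaderCheck` enum, and the `BaiduStdHeader` struct are hypothetical names for illustration and are not part of brpc. How the raw bytes are obtained from the socket or the RDMA endpoint is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical diagnostic helper, not part of brpc.
// Assumes baidu_std framing: "PRPC" + body_size (4 bytes) + meta_size (4 bytes),
// with both sizes in network (big-endian) byte order.
enum class HeaderCheck { kTooShort, kBadMagic, kMetaLargerThanBody, kOk };

struct BaiduStdHeader {
    uint32_t body_size = 0;  // assumed to cover meta + payload + attachment
    uint32_t meta_size = 0;  // size of the serialized RpcMeta
};

// Decode a 32-bit big-endian integer from four bytes.
static uint32_t ReadBigEndian32(const unsigned char* p) {
    return (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) |
           static_cast<uint32_t>(p[3]);
}

// Inspect the first bytes of a received message and report whether they
// form a plausible baidu_std header. `out` may be nullptr.
HeaderCheck InspectBaiduStdHeader(const void* data, size_t len,
                                  BaiduStdHeader* out) {
    if (len < 12) {
        return HeaderCheck::kTooShort;
    }
    const unsigned char* p = static_cast<const unsigned char*>(data);
    if (std::memcmp(p, "PRPC", 4) != 0) {
        return HeaderCheck::kBadMagic;  // e.g. zeroed or shifted data
    }
    BaiduStdHeader h;
    h.body_size = ReadBigEndian32(p + 4);
    h.meta_size = ReadBigEndian32(p + 8);
    if (out != nullptr) {
        *out = h;
    }
    if (h.meta_size > h.body_size) {
        return HeaderCheck::kMetaLargerThanBody;
    }
    return HeaderCheck::kOk;
}
```

Calling this on the first 12 bytes logged at the client side would distinguish a zeroed or misaligned buffer (`kBadMagic`) from a header whose sizes are inconsistent, which narrows down whether the data is damaged before or after framing.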

The worker's response code is very simple. Printing blockletBuffer confirms that it does contain data, and I have no further leads for debugging.

_cntl->response_attachment().append(blockletBuffer->data(), blockletBuffer->size());

To Reproduce

Expected behavior

Versions

OS:
Compiler:
brpc: 1.11.0
protobuf:

Additional context/screenshots

yanglimingcn commented 1 week ago

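Could you try testing this protocol's requests and responses with rdma_performance? That would narrow the scope of the investigation and make it easier to pinpoint the problem.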