grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0

Netty's flow control mechanism results in a direct memory leak in server-side bidirectional streaming #10733

Closed shiyiyue1102 closed 10 months ago

shiyiyue1102 commented 10 months ago

grpc-java version: 1.50.2

Problem description:

We use gRPC's bidirectional streaming to push messages from server to client, and some of our production clusters are experiencing a direct memory leak. After analyzing memory, we found that many ByteBufs are backed up in the pendingWriteQueue of DefaultHttp2RemoteFlowController.FlowState, and the window value is 0. We think the following may cause this problem:

  1. The client side processes messages much more slowly than the server side produces them.
  2. A network proxy between the client and server drops packets. Once the amount of dropped data reaches the window size, the window stays at 0 permanently and the bidirectional stream becomes unusable: all data backs up in FlowState's pendingWriteQueue, and the only way to resolve this is to create a new stream.

We would really appreciate your guidance.

ejona86 commented 10 months ago

If the buffers are in pendingWriteQueue, then there is no leak.

As long as the producer is faster than the consumer, and there's work to do, the window will stay at 0. That doesn't mean there's no progress. As soon as the window increases, more data is sent, and the window goes back to 0.

The only thing you need to make sure to do is observe isReady() to slow sending when the receiver can't keep up.
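
For readers who want to see what observing isReady() can look like, here is a minimal sketch of the pattern. It is not taken from grpc-java: `ReadyAwareObserver` is a hypothetical stand-in that copies only the `isReady()`, `setOnReadyHandler(Runnable)`, and `onNext(...)` methods of `io.grpc.stub.ServerCallStreamObserver`, so the example compiles without the gRPC dependency. `BackpressuredSender`, `FakeObserver`, and the `maxPending` limit are likewise assumptions made for illustration. The fake models readiness as a simple credit count, which is a simplification of how the real transport decides readiness.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for the three ServerCallStreamObserver methods used here.
interface ReadyAwareObserver<T> {
    boolean isReady();
    void setOnReadyHandler(Runnable handler);
    void onNext(T value);
}

// Holds messages in a bounded application-level queue and only calls onNext()
// while the observer reports ready, instead of writing unconditionally.
final class BackpressuredSender<T> {
    private final ReadyAwareObserver<T> observer;
    private final Queue<T> pending = new ArrayDeque<>();
    private final int maxPending; // assumed limit; choose per application

    BackpressuredSender(ReadyAwareObserver<T> observer, int maxPending) {
        this.observer = observer;
        this.maxPending = maxPending;
        // Resume sending whenever the transport becomes ready again.
        observer.setOnReadyHandler(this::drain);
    }

    // Returns false when the bounded queue is full, so the caller can slow
    // down, drop, or otherwise handle the message rather than buffer forever.
    synchronized boolean offer(T message) {
        if (pending.size() >= maxPending) {
            return false;
        }
        pending.add(message);
        drain();
        return true;
    }

    // Sends queued messages for as long as the observer stays ready.
    private synchronized void drain() {
        while (observer.isReady() && !pending.isEmpty()) {
            observer.onNext(pending.poll());
        }
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}

// In-memory fake for trying the sender out: each onNext() consumes one credit,
// and grantWindow() adds credits and fires the on-ready callback.
final class FakeObserver implements ReadyAwareObserver<String> {
    final List<String> sent = new ArrayList<>();
    private int window;
    private Runnable onReady = () -> {};

    FakeObserver(int window) {
        this.window = window;
    }

    @Override
    public boolean isReady() {
        return window > 0;
    }

    @Override
    public void setOnReadyHandler(Runnable handler) {
        this.onReady = handler;
    }

    @Override
    public void onNext(String value) {
        sent.add(value);
        window--;
    }

    void grantWindow(int credits) {
        window += credits;
        onReady.run();
    }
}
```

In a real service the same idea would apply to the `StreamObserver` passed to the service method, cast to `ServerCallStreamObserver`. The point of the bounded queue is that a slow receiver causes `offer` to return false, rather than letting buffers pile up in the transport's pending write queue.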

shiyiyue1102 commented 10 months ago

Thank you for your reply. We have figured out the root cause of our issue, and it is just as you described above. By triggering frequent CMS GCs on the client while keeping the underlying TCP connection alive, we were able to reproduce the issue.

ejona86 commented 10 months ago

Seems like this is resolved. If not, comment, and it can be reopened.