grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0

Unexpected request cancellations after upgrading gRPC-Java #11346

Open evis opened 4 days ago

evis commented 4 days ago

What version of gRPC-Java are you using?

1.64.0

What is your environment?

Docker, JDK 21

What did you expect to see?

After upgrading gRPC-Java from 1.38.0 to 1.56.0, I expected no requests to be cancelled, as was the case with the previous version.

What did you see instead?

Some requests are cancelled immediately. Upgrading further to 1.64.0 did not resolve the issue; downgrading to 1.38.0 does.

Steps to reproduce the bug

I do not have a minimal example at the moment. The issue can be reproduced by deploying our service in a production environment.

Additional information

Clients with the following headers experience the issue:

"user-agent": "grpc-python-asyncio/1.62.1 grpc-c/39.0.0 (linux; chttp2)"
"grpc-timeout": "200m"

or

"user-agent": "grpc-go/1.58.1"
"grpc-timeout": "4999973u"

When these clients make requests to the server running gRPC-Java, the server responds immediately (in less than 1 ms) with status code CANCELLED, caused by StatusRuntimeException: CANCELLED: RPC cancelled.
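For anyone reading the grpc-timeout values quoted above, here is a minimal sketch of how they decode. It assumes the unit suffixes defined by the gRPC over HTTP/2 protocol (H, M, S, m, u, n, preceded by at most 8 ASCII digits). The class name GrpcTimeoutDecoder and its decode method are hypothetical helpers written for illustration; they are not part of the gRPC-Java API and are not how the server parses the header internally.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not part of the gRPC-Java API.
final class GrpcTimeoutDecoder {
    private GrpcTimeoutDecoder() {}

    /** Decodes a grpc-timeout header value such as "200m" or "4999973u". */
    static Duration decode(String headerValue) {
        if (headerValue == null || headerValue.length() < 2) {
            throw new IllegalArgumentException("Malformed grpc-timeout: " + headerValue);
        }
        int last = headerValue.length() - 1;
        String digits = headerValue.substring(0, last);
        // The wire format allows at most 8 ASCII digits before the unit suffix.
        if (digits.length() > 8 || !digits.chars().allMatch(c -> c >= '0' && c <= '9')) {
            throw new IllegalArgumentException("Malformed grpc-timeout: " + headerValue);
        }
        long amount = Long.parseLong(digits);
        switch (headerValue.charAt(last)) {
            case 'H': return Duration.ofHours(amount);
            case 'M': return Duration.ofMinutes(amount);
            case 'S': return Duration.ofSeconds(amount);
            case 'm': return Duration.ofMillis(amount);          // milliseconds
            case 'u': return Duration.ofNanos(amount * 1_000L);  // microseconds
            case 'n': return Duration.ofNanos(amount);
            default:
                throw new IllegalArgumentException("Unknown grpc-timeout unit: " + headerValue);
        }
    }
}
```

Under these assumptions, GrpcTimeoutDecoder.decode("200m") is a 200 ms deadline and GrpcTimeoutDecoder.decode("4999973u") is 4,999,973 microseconds, i.e. just under 5 seconds. In both cases the deadline is far longer than the sub-millisecond time in which the CANCELLED response arrives.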

Most requests are unaffected: out of approximately 10,000 requests per second, only about 20 per second (roughly 0.2%) are cancelled.

Our developer increased the client-side timeout from 200 milliseconds to 1 second. After that change, the server began responding with CANCELLED to every request, whereas with the 200 ms timeout only some requests were cancelled.

I am considering bisecting gRPC-Java versions to identify the specific version introducing this behavior.

Questions

  1. Are there any known changes in gRPC-Java that could lead to this behavior?
  2. Does the "RPC cancelled" error indicate client-side cancellation?

Any guidance or insights on these questions would be greatly appreciated.

linjikun commented 4 hours ago

I am using 1.50 with JDK 1.8 and also see StatusRuntimeException: CANCELLED: RPC cancelled.