grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0
11.45k stars 3.85k forks source link

Cardinality violations should use error code “unimplemented” #11247

Open jhump opened 5 months ago

jhump commented 5 months ago

The gRPC docs for error codes state that both client and server should use the unimplemented code for cardinality violations. See table at the bottom of this doc (you can search for “cardinality violation” in the doc): https://grpc.github.io/grpc/core/md_doc_statuscodes.html.

A cardinality violation is when a stream contains an incorrect number of messages. Specifically, when a response stream for a unary or client-stream RPC contains zero messages with an OK status or more than one message; or when a request stream for a unary or server-stream RPC contains zero or more than one messages.

The server in this repo reports an exception with an internal error in this situation instead of unimplemented.

The client’s behavior is a little more… interesting:

  1. For unary RPCs, if no response messages are present in the response, only an “OK” status, an exception with an internal error is thrown.
  2. For client-stream RPCs, if no response messages are present in the response, only an “OK” status, there is no error at all. The onNext and onError methods of the application’s StreamObserver are never called, but the onCompleted method is.
  3. For both unary and client-stream RPCs, if a second response message is erroneously sent by the server before the trailers/status, an error is reported that has a cancelled code and a message of “Failed to read message.” Presumably the code comes from the fact that the library is likely trying to cancel the operation to prevent receipt of any further unwanted messages in the response stream.
ejona86 commented 5 months ago

This was apparently added without much (any) discussion. UNIMPLEMENTED makes me sad as I am still waiting for someone to make the case that grpc responding 200 all the time is bad. At that point we'd convert UNIMPLEMENTED to 404. (Yes, compression method is UNIMPLEMENTED, but there was a gentleman's agreement that that case wouldn't be used to say 404 is inappropriate.) Although in other ways UNIMPLEMENTED for this case is probably a better fit than for compression.

I'm not checking all the cases, but at least some of these cases on both client and server would be easy to change the status code: https://github.com/grpc/grpc-java/blob/781b4c4575218278614948902a6873304d29b4e4/stub/src/main/java/io/grpc/stub/ClientCalls.java#L463-L465

https://github.com/grpc/grpc-java/blob/781b4c4575218278614948902a6873304d29b4e4/stub/src/main/java/io/grpc/stub/ServerCalls.java#L158

ejona86 commented 5 months ago

It seems likely only Python, and maybe some later implementations like .Net, is doing this behavior.

Edit: C++ is also doing UNIMPLEMENTED

ejona86 commented 4 months ago

Some TLs discussed this, and we strongly agreed UNIMPLEMENTED is wrong. A particularly bad case is when the client detects the cardinality error, as the server has already processed the request and sent responses and only afterward does the client tell the application the RPC is UNIMPLEMENTED. That will certainly mislead the client.

The only safe choices are INTERNAL and UNKNOWN, and UNKNOWN was agreed to be inappropriate because we know what error happened. At some point I'll work on getting the text updated.

jhump commented 3 months ago

@ejona86, thanks for the transparency about the decision making. IIUC, you're saying that you expect the spec to be amended to instead use INTERNAL for this case instead of UNIMPLEMENTED. Is that right?

ejona86 commented 3 months ago

@jhump, yes. I would have updated the spec already with the one-line change, but it seems this deserves to go through a gRFC, and I haven't spent the time to write it up yet. Just this morning I was looking at this issue and thinking I'd have time this week.