grpc / grpc-java

The Java gRPC implementation. HTTP/2 based RPC
https://grpc.io/docs/languages/java/
Apache License 2.0

Unexpected exceptions when server response or client request exceeds Max Message Size #3996

Closed srujann closed 5 months ago

srujann commented 6 years ago

Some unexpected exceptions appear in the server log when a server response exceeds the client channel's max message size, or vice versa. The client log does show valid errors.

This issue is consistently reproducible with the hello-world RPC example when the request/response is a huge string larger than the default max message size of 4 MiB.
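For context on where the "Frame size N exceeds maximum" text below comes from: each gRPC message on the wire is preceded by a 5-byte prefix (a 1-byte compressed flag plus a 4-byte big-endian length), and the receiving side rejects the message as RESOURCE_EXHAUSTED when the declared length exceeds its max inbound message size. A simplified pure-JDK sketch of that check (an illustration only, not grpc-java's actual MessageDeframer code):

```java
import java.nio.ByteBuffer;

/**
 * Simplified sketch of the receive-side size check: parse the 5-byte
 * gRPC message prefix and reject messages whose declared length
 * exceeds the configured limit (4 MiB by default).
 */
public class DeframeSketch {
    static final int DEFAULT_MAX_INBOUND_MESSAGE_SIZE = 4 * 1024 * 1024; // 4 MiB

    /** Returns the declared message length, or throws if it exceeds the limit. */
    static int checkHeader(byte[] prefix, int maxInboundMessageSize) {
        ByteBuffer buf = ByteBuffer.wrap(prefix);
        buf.get();                 // 1-byte compressed flag (0 or 1)
        int length = buf.getInt(); // 4-byte big-endian message length
        if (length > maxInboundMessageSize) {
            throw new IllegalStateException(
                "RESOURCE_EXHAUSTED: Frame size " + length
                    + " exceeds maximum: " + maxInboundMessageSize);
        }
        return length;
    }

    public static void main(String[] args) {
        // A 59,000,005-byte message, as in the logs below.
        byte[] prefix = ByteBuffer.allocate(5).put((byte) 0).putInt(59_000_005).array();
        try {
            checkHeader(prefix, DEFAULT_MAX_INBOUND_MESSAGE_SIZE);
        } catch (IllegalStateException e) {
            // prints: RESOURCE_EXHAUSTED: Frame size 59000005 exceeds maximum: 4194304
            System.out.println(e.getMessage());
        }
    }
}
```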

What version of gRPC are you using?

1.9.0

When server response exceeds max message size

Unexpected exceptions in the server log:

WARNING: Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Stream closed before write could take place
    at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:149)
    at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:499)
    at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$FlowState.cancel(DefaultHttp2RemoteFlowController.java:480)
    at io.netty.handler.codec.http2.DefaultHttp2RemoteFlowController$1.onStreamClosed(DefaultHttp2RemoteFlowController.java:105)
    at io.netty.handler.codec.http2.DefaultHttp2Connection.notifyClosed(DefaultHttp2Connection.java:349)
    at io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.removeFromActiveStreams(DefaultHttp2Connection.java:985)
    at io.netty.handler.codec.http2.DefaultHttp2Connection$ActiveStreams.deactivate(DefaultHttp2Connection.java:941)
    at io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:497)
    at io.netty.handler.codec.http2.DefaultHttp2Connection$DefaultStream.close(DefaultHttp2Connection.java:503)
    at io.netty.handler.codec.http2.Http2ConnectionHandler.closeStream(Http2ConnectionHandler.java:587)
    at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onRstStreamRead(DefaultHttp2ConnectionDecoder.java:356)
    at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onRstStreamRead(Http2InboundFrameLogger.java:80)
    at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readRstStreamFrame(DefaultHttp2FrameReader.java:516)
    at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:260)
    at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160)
    at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41)
    at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:118)
    at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:388)
    at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:448)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:745)

Client log:

INFO: Will try to greet  ...
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.examples.helloworld.HelloWorldClient greet
WARNING: RPC failed: Status{code=RESOURCE_EXHAUSTED, description=io.grpc.netty.NettyClientTransport$3: Frame size 59000005 exceeds maximum: 4194304. , cause=null}
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream
Jan 24, 2018 6:44:16 PM io.grpc.internal.AbstractClientStream$TransportState inboundDataReceived
INFO: Received data on closed stream

When client request exceeds max message size

A valid exception, followed by an unexpected exception, in the server log:

WARNING: Exception processing message
io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: io.grpc.netty.NettyServerStream$TransportState: Frame size 59000005 exceeds maximum: 4194304. 
    at io.grpc.Status.asRuntimeException(Status.java:517)
    at io.grpc.internal.MessageDeframer.processHeader(MessageDeframer.java:391)
    at io.grpc.internal.MessageDeframer.deliver(MessageDeframer.java:271)
    at io.grpc.internal.MessageDeframer.request(MessageDeframer.java:165)
    at io.grpc.internal.AbstractStream$TransportState.requestMessagesFromDeframer(AbstractStream.java:202)
    at io.grpc.netty.NettyServerStream$Sink$1.run(NettyServerStream.java:100)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:745)

Jan 24, 2018 6:54:21 PM io.grpc.netty.NettyServerHandler onStreamError
WARNING: Stream Error
io.netty.handler.codec.http2.Http2Exception$StreamException: Received DATA frame for an unknown stream 3
    at io.netty.handler.codec.http2.Http2Exception.streamError(Http2Exception.java:129)
    at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.shouldIgnoreHeadersOrDataFrame(DefaultHttp2ConnectionDecoder.java:535)
    at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder$FrameReadListener.onDataRead(DefaultHttp2ConnectionDecoder.java:187)
    at io.netty.handler.codec.http2.Http2InboundFrameLogger$1.onDataRead(Http2InboundFrameLogger.java:48)
    at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readDataFrame(DefaultHttp2FrameReader.java:421)
    at io.netty.handler.codec.http2.DefaultHttp2FrameReader.processPayloadState(DefaultHttp2FrameReader.java:251)
    at io.netty.handler.codec.http2.DefaultHttp2FrameReader.readFrame(DefaultHttp2FrameReader.java:160)
    at io.netty.handler.codec.http2.Http2InboundFrameLogger.readFrame(Http2InboundFrameLogger.java:41)
    at io.netty.handler.codec.http2.DefaultHttp2ConnectionDecoder.decodeFrame(DefaultHttp2ConnectionDecoder.java:118)
    at io.netty.handler.codec.http2.Http2ConnectionHandler$FrameDecoder.decode(Http2ConnectionHandler.java:388)
    at io.netty.handler.codec.http2.Http2ConnectionHandler.decode(Http2ConnectionHandler.java:448)
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:489)
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:428)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1359)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:935)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:134)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
    at java.lang.Thread.run(Thread.java:745)

Client log:

Exception in thread "main" io.grpc.StatusRuntimeException: CANCELLED: HTTP/2 error code: CANCEL
Received Rst Stream
    at io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:221)
    at io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:202)
    at io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:131)
    at io.grpc.examples.helloworld.GreeterGrpc$GreeterBlockingStub.sayHello(GreeterGrpc.java:160)
    at io.grpc.examples.helloworld.HelloWorldClient.greet(HelloWorldClient.java:60)
    at io.grpc.examples.helloworld.HelloWorldClient.main(HelloWorldClient.java:77)
ejona86 commented 6 years ago

There's a lot here. To confirm I understand, there are two problems:

  1. Log spam on the client and server (in the various cases)
  2. A poor client-side error when the client exceeds the max size

Is that right?

srujann commented 6 years ago

That's correct. In addition:

  1. There is no useful error on server-side when server response exceeds max size.
RiyaTyagi commented 5 years ago

@srujann @ejona I also ran the helloworld example with a request message larger than 4 MB, and I got only the valid exceptions in the server-side log (there was no unexpected exception); the client-side log also showed a decent message:

Jun 27, 2019 11:15:41 AM io.grpc.examples.helloworld.HelloWorldClient greet
INFO: Will try to greet ...  
Jun 27, 2019 11:15:41 AM io.grpc.examples.helloworld.HelloWorldClient greet
WARNING: RPC failed: Status{code=CANCELLED, description=HTTP/2 error code: CANCEL
Received Rst Stream, cause=null}

But when the response message is larger than 4 MB, I still get the same exceptions as above.

jclinn commented 3 years ago

@ejona86 Is this bug going to be fixed any time soon? I am using version 1.35.1 and still seeing the issue where the client does not receive an informative message in the case where the client sends a request exceeding 4 MB.

Server logs: {"level":"WARNING","logNameSource":"io.grpc.netty.NettyServerStream","message":"Exception processing message","extendedMessage":"io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: gRPC message exceeds maximum size 4194304: 6144592"}

Client logs: Status{code=CANCELLED, description=RST_STREAM closed stream. HTTP/2 error code: CANCEL, cause=null}

meoyawn commented 2 years ago

Python server: fine

grpc-java v1.44.0 grpc-kotlin v1.2.1

OpenJDK 17.0.1 on macOS: all fine

OpenJDK 11.0.11 on Ubuntu:

io.grpc.StatusException: RESOURCE_EXHAUSTED: Received message larger than max (4516559 vs. 4194304)
    at io.grpc.Status.asException(Status.java:550)
    at io.grpc.kotlin.ClientCalls$rpcImpl$1$1$1.onClose(ClientCalls.kt:295)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)

Config:

val maxMsgBytes = DecimalByteUnit.MEGABYTES.toBytes(100)

val serviceConfig = mapOf(
    "methodConfig" to listOf(
        mapOf(
            "name" to listOf(mapOf("service" to "yanic.Yanic")),
            "retryPolicy" to mapOf(
                "maxAttempts" to "20",
                "initialBackoff" to "0.1s",
                "maxBackoff" to "10s",
                "backoffMultiplier" to "2",
                "retryableStatusCodes" to listOf("UNAVAILABLE"),
            ),
            "waitForReady" to true,
            "maxRequestMessageBytes" to maxMsgBytes.toString(),
            "maxResponseMessageBytes" to maxMsgBytes.toString(),
        ),
    ),
)

val chan =
    Grpc.newChannelBuilderForAddress("localhost", port, InsecureChannelCredentials.create())
        .enableRetry()
        .defaultServiceConfig(serviceConfig)
        .retryBufferSize(maxMsgBytes)
        .maxInboundMessageSize(maxMsgBytes.toInt())
        .maxInboundMetadataSize(maxMsgBytes.toInt())
        .build()

val opts = CallOptions.DEFAULT
    .withWaitForReady()
    .withMaxInboundMessageSize(maxMsgBytes.toInt())
    .withMaxOutboundMessageSize(maxMsgBytes.toInt())
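One thing worth noting about the config above: `maxInboundMessageSize` on the channel builder and call options only raises the *client's* receive limit; the server keeps its own 4 MiB default unless it is configured separately. A hedged sketch of the server side in Java (the port and service implementation are placeholders; `maxInboundMessageSize` is available on `ServerBuilder` in recent grpc-java versions, and on `NettyServerBuilder` in older ones):

```java
import io.grpc.Server;
import io.grpc.ServerBuilder;

public class ServerWithLargerLimit {
    public static void main(String[] args) throws Exception {
        int maxMsgBytes = 100 * 1000 * 1000; // match the client's 100 MB limit
        Server server = ServerBuilder.forPort(50051)      // placeholder port
            .maxInboundMessageSize(maxMsgBytes)           // default is 4 MiB
            .addService(new YanicServiceImpl())           // placeholder service impl
            .build()
            .start();
        server.awaitTermination();
    }
}
```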
ryanpbrewster commented 6 months ago

I believe I have reproduced this issue on v1.64.0-SNAPSHOT (i.e., the tip of master) and understand it, at least in the Netty-based gRPC implementation:

So the client sees RST_STREAM with no context.

If we want the client to receive the status, I think we'll need to explicitly send trailers (which should also serve to close the stream). I think that https://github.com/ryanpbrewster/grpc-java/pull/1 works, but I will clean it up and send it out for review.

ejona86 commented 6 months ago

The difficulty with fixing this is that the deframer and framer run on different threads, and there's no way for us to enqueue work onto the framing thread. Cancellation/resets are already expected to be racy and to be handled cleanly. But if the deframer closes with trailers, that 1) can trigger in the middle of sending a message, which the receiver won't like (but that's probably okay), and 2) will require the Netty handler to notice and throw away later data and trailers created by the framer.

To explain (1) a bit more: when serializing protobuf, the framer produces chunks of data containing zero or more messages. A large message is split into multiple chunks of data. Those are passed via a queue from the framing thread to the Netty handler, where they are sent. Any trailer triggered by the deframer would go into that same queue.
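The queue interaction described above can be modeled with a toy single-threaded sketch (this is not grpc-java's internal code; the string "commands" and the discard rule are illustrative assumptions):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/**
 * Toy model of the hazard: the framer splits a large message into
 * chunks and enqueues them for the netty handler; if the deframer
 * enqueues trailers into the same queue, the trailers can land
 * between chunks of a half-sent message, and later chunks must be
 * discarded.
 */
public class FramerQueueSketch {
    /** Drains the queue as the handler would, stopping at trailers. */
    static List<String> drain(Queue<String> wireQueue) {
        List<String> sent = new ArrayList<>();
        String item;
        while ((item = wireQueue.poll()) != null) {
            sent.add(item);
            if (item.startsWith("TRAILERS")) {
                break; // stream is over; remaining chunks are stale
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        Queue<String> wireQueue = new ArrayDeque<>();
        // Framing thread: one large message split into three chunks.
        wireQueue.add("DATA chunk 1/3");
        wireQueue.add("DATA chunk 2/3");
        // Deframer hits the size limit and enqueues trailers mid-message.
        wireQueue.add("TRAILERS RESOURCE_EXHAUSTED");
        wireQueue.add("DATA chunk 3/3"); // stale chunk the handler must throw away
        System.out.println(drain(wireQueue));
        // prints: [DATA chunk 1/3, DATA chunk 2/3, TRAILERS RESOURCE_EXHAUSTED]
    }
}
```

The receiver in this model sees trailers after a truncated message, which is exactly the "receiver won't like it" case described above.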

ryanpbrewster commented 6 months ago

If the deframer enqueues a command to send trailers + close, my understanding is that:

ejona86 commented 6 months ago

> making the receiver unhappy is probably okay, because the stream is going to be closed one way or another, and at least this way they get some context about why

Well, my point was that right now you wouldn't, in certain cases; you would see an error about a truncated message instead. I still think that's okay, and it's a step in the right direction. We may just need future work in the receiver to expose the error from the trailers when the trailers carry one, and otherwise generate an error.

> if the stream is closed, won't the handler automatically notice & discard any future data

Once you get to a certain level, yes. But plenty of this code has never needed to deal with this, so 1) we need to audit/be prepared for broken code and 2) consider how errors are handled.

ryanpbrewster commented 6 months ago

Makes sense, and understood. I'm going to open a PR and link it to this issue. If you don't mind, I'll tag you there and we can discuss the specific implementation.