baidu / Jprotobuf-rpc-socket

Protobuf RPC is a Java implementation of a binary RPC communication protocol based on TCP.
Apache License 2.0

RpcChannel does not notify the callback after a send failure, so the call hangs until it times out. #94

Open liming30 opened 8 months ago

liming30 commented 8 months ago

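A Netty channel can fail when sending a message to the server, but the resulting channelFuture is currently not handled. The only thing that keeps the call from hanging forever is the timeout, and that approach loses the real exception stack, which makes the problem hard to diagnose. https://github.com/baidu/Jprotobuf-rpc-socket/blob/master/jprotobuf-rpc-core/src/main/java/com/baidu/jprotobuf/pbrpc/transport/RpcChannel.java#L141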

Below is the exception captured with arthas for one particular case. Nothing handles the exception anywhere, so the call can only wait for the RPC timeout.

method=io.netty.channel.AbstractChannelHandlerContext.writeAndFlush location=AtExit
ts=2024-02-20 11:52:40; [cost=0.064681ms] result=@ArrayList[
    io.netty.handler.codec.EncoderException: java.lang.RuntimeException: Negative initial size: -736704836
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:104)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:863)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:968)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:856)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:110)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:881)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:863)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:968)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:856)
    at io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:304)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:879)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:940)
    at io.netty.channel.AbstractChannelHandlerContext$WriteTask.run(AbstractChannelHandlerContext.java:1247)
    at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
    at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: Negative initial size: -736704836
    at com.baidu.jprotobuf.pbrpc.data.RpcDataPackage.write(RpcDataPackage.java:687)
    at com.baidu.jprotobuf.pbrpc.transport.handler.RpcDataPackageEncoder.encode(RpcDataPackageEncoder.java:88)
    at com.baidu.jprotobuf.pbrpc.transport.handler.RpcDataPackageEncoder.encode(RpcDataPackageEncoder.java:1)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:89)
    ... 21 more
Caused by: java.lang.IllegalArgumentException: Negative initial size: -736704836
    at java.base/java.io.ByteArrayOutputStream.<init>(ByteArrayOutputStream.java:76)
    at com.baidu.jprotobuf.pbrpc.data.RpcDataPackage.write(RpcDataPackage.java:668)
    ... 24 more

I think we should invoke the callback after a send failure so that the correct exception information is surfaced.
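A minimal sketch of the idea, using only the JDK so it runs without Netty on the classpath. It is not the project's code: `SendFailureDemo`, `PendingCall`, `write`, and `send` are hypothetical names, and a `CompletableFuture<Void>` stands in for the `ChannelFuture` returned by `writeAndFlush`. In Netty the equivalent hook would presumably be a listener added with `ChannelFuture.addListener(...)` that checks `isSuccess()` and passes `cause()` to the callback.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

public class SendFailureDemo {

    /** Hypothetical stand-in for the RPC callback that waits for a response. */
    static final class PendingCall {
        final CompletableFuture<String> result = new CompletableFuture<>();
    }

    /**
     * Hypothetical stand-in for the asynchronous write. The returned future
     * models the write future: it fails when encoding the payload fails.
     */
    static CompletableFuture<Void> write(int payloadSize) {
        CompletableFuture<Void> writeFuture = new CompletableFuture<>();
        try {
            // Same failure mode as in the trace: a negative size reaches the constructor.
            new ByteArrayOutputStream(payloadSize);
            writeFuture.complete(null);
        } catch (RuntimeException e) {
            writeFuture.completeExceptionally(e);
        }
        return writeFuture;
    }

    /** Sends a request and forwards any write failure to the pending call. */
    static PendingCall send(int payloadSize) {
        PendingCall call = new PendingCall();
        write(payloadSize).whenComplete((ignored, cause) -> {
            if (cause != null) {
                // Fail the caller immediately with the original exception
                // instead of leaving it to wait for the RPC timeout.
                call.result.completeExceptionally(cause);
            }
            // On success the call stays pending until a response arrives.
        });
        return call;
    }

    public static void main(String[] args) throws Exception {
        PendingCall call = send(-736704836);
        try {
            call.result.get(5, TimeUnit.SECONDS);
        } catch (ExecutionException e) {
            // The root cause is available right away, with its stack trace.
            System.out.println("send failed: " + e.getCause());
        }
    }
}
```

With this pattern the caller sees the original `IllegalArgumentException` as soon as the write fails, while a successful write leaves the call pending for the server's response as before.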