eclipse-vertx / vertx-sql-client

High performance reactive SQL Client written in Java
Apache License 2.0
891 stars, 199 forks

NullPointerException in PgDecoder.decodeError() #1401

Closed: andreas-eberle closed this issue 8 months ago

andreas-eberle commented 8 months ago

Questions

I'm using the vertx-pg-client in a Quarkus project to subscribe to Postgres events with the PgSubscriber. This all works fine locally, and I receive the events. In the production environment, however, the PgDecoder.decodeError() function is sometimes called. I don't yet know why, and it always takes a long time before it happens. In any case, my main problem is that decodeError() crashes with a NullPointerException here: https://github.com/eclipse-vertx/vertx-sql-client/blob/master/vertx-pg-client/src/main/java/io/vertx/pgclient/impl/codec/PgDecoder.java#L257

The stack trace is this (line numbers seem to be a bit different because Quarkus uses version 4.4.6 of vertx-pg-client):

java.lang.NullPointerException: Cannot invoke "io.vertx.pgclient.impl.codec.PgCommandCodec.handleErrorResponse(io.vertx.pgclient.impl.codec.ErrorResponse)" because "cmd" is null
        at io.vertx.pgclient.impl.codec.PgDecoder.decodeError(PgDecoder.java:246)
        at io.vertx.pgclient.impl.codec.PgDecoder.decodeMessage(PgDecoder.java:132)
        at io.vertx.pgclient.impl.codec.PgDecoder.channelRead(PgDecoder.java:112)
        at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:1583)

It looks like decodeError() calls peek() on inflight, which can return null when the queue is empty (see the Javadoc of peek()). If it returns null, the call to handleErrorResponse() on the next line throws a NullPointerException.
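To illustrate the failure mode described above, here is a minimal, self-contained sketch using only the JDK. It is not the vertx-pg-client code and not necessarily what the eventual fix does: the class name InflightPeekSketch, the methods dispatchUnguarded and dispatchGuarded, the fallback handler, and the "example error" string are all hypothetical, and a plain Consumer<String> stands in for the command codec. It only shows that peek() on an empty ArrayDeque returns null, and how a null check avoids the crash.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

public class InflightPeekSketch {

    // Mirrors the pattern from the report: peek() and then call a method on the
    // result. ArrayDeque.peek() returns null on an empty queue, so this throws
    // a NullPointerException when no command is in flight.
    public static void dispatchUnguarded(Deque<Consumer<String>> inflight, String error) {
        Consumer<String> cmd = inflight.peek();
        cmd.accept(error);
    }

    // Hypothetical guarded variant: if no command is in flight, hand the error
    // to a fallback handler instead of dereferencing null.
    // Returns true when an in-flight command received the error.
    public static boolean dispatchGuarded(Deque<Consumer<String>> inflight,
                                          String error,
                                          Consumer<String> fallback) {
        Consumer<String> cmd = inflight.peek();
        if (cmd == null) {
            fallback.accept(error);
            return false;
        }
        cmd.accept(error);
        return true;
    }

    public static void main(String[] args) {
        Deque<Consumer<String>> inflight = new ArrayDeque<>(); // empty: nothing in flight
        List<String> unsolicited = new ArrayList<>();

        try {
            dispatchUnguarded(inflight, "example error");
        } catch (NullPointerException e) {
            System.out.println("unguarded dispatch threw NullPointerException");
        }

        boolean handled = dispatchGuarded(inflight, "example error", unsolicited::add);
        System.out.println("handled by a command: " + handled);
        System.out.println("errors routed to fallback: " + unsolicited);
    }
}
```

In a real client, the fallback would presumably report the error through the connection's exception or close handling so that the application can notice the failure. That part is an assumption and is not taken from the library.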

As a result, the stack trace appears in the logs and the channel dies without Quarkus noticing. I therefore cannot react to the closed channel, for example by reconnecting.

Version

vertx-pg-client 4.4.6 in Quarkus 3.6.6.

Do you have a reproducer?

No, I currently have no reproducer. It only seems to happen with the specific production cloud database, not with local Postgres containers.

Extra

tsegismont commented 8 months ago

Fixed by 2d07fc06