eclipse-vertx / vertx-sql-client

High performance reactive SQL Client written in Java
Apache License 2.0

PgSubscriber occasionally crashes unexpectedly #1442

Closed s-gal closed 2 months ago

s-gal commented 5 months ago

Version

vert.x 4.5.7

Context

I have a problem with the io.vertx.pgclient.pubsub.PgSubscriber class provided by io.vertx:vertx-pg-client.

Occasionally the error in the following stack trace appears and crashes my whole PgSubscriber thread. (It happens maybe once in every 10,000 times, and may be connected with high load on the server at that time.)

2024-05-21T09:48:58,807 - java.lang.NullPointerException: Cannot invoke "io.vertx.pgclient.impl.codec.PgCommandCodec.handleNoticeResponse(io.vertx.pgclient.impl.codec.NoticeResponse)" because the return value of "io.vertx.pgclient.impl.codec.PgCodec.peek()" is null
2024-05-21T09:48:58,808 -  at io.vertx.pgclient.impl.codec.PgDecoder.decodeNotice(PgDecoder.java:276)
2024-05-21T09:48:58,808 -  at io.vertx.pgclient.impl.codec.PgDecoder.decodeMessage(PgDecoder.java:147)
2024-05-21T09:48:58,808 -  at io.vertx.pgclient.impl.codec.PgDecoder.channelRead(PgDecoder.java:123)
2024-05-21T09:48:58,808 -  at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
2024-05-21T09:48:58,808 -  at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
2024-05-21T09:48:58,808 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
2024-05-21T09:48:58,809 -  at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
2024-05-21T09:48:58,809 -  at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
2024-05-21T09:48:58,809 -  at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
2024-05-21T09:48:58,809 -  at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
2024-05-21T09:48:58,809 -  at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
2024-05-21T09:48:58,809 -  at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)

This error seems to occur in the core of the pgclient classes, and when it happens I have no way to send out an error mail or restart my process. Currently I initialise the subscriber as shown below, but as mentioned, neither the catch block nor the exception handler is ever reached, so nothing is logged here and no mail is sent.

PgSubscriber subscriber = PgSubscriber.subscriber(vertx, new PgConnectOptions()
        .setPort(dbConfig.getPort())
        .setHost(dbConfig.getHost())
        .setDatabase(dbConfig.getDatabase())
        .setUser(dbConfig.getUser())
        .setPassword(dbConfig.getPassword()));

subscriber.connect(ar -> {
    if (ar.succeeded()) {
        String testSuffix = Config.getInstance().getTest() ? "_test" : "";

        subscriber.channel("pubsub_shipment" + testSuffix).handler(payload -> {
            // Errors thrown while processing a notification payload
            try {
                if (!StringUtil.isEmpty(payload)) {
                    log.info("Received pubsub_shipment: " + payload);
                    updateInMemoryCollectionsForShipments(new JsonArray().add(payload));
                }
            } catch (Exception e) {
                log.error("Unexpected Error in pubsub for shipments", e);
                MailUtil.sendTechnicalErrorMail("Shipment PubSub failed", "Unexpected Error in pubsub for shipments:", e);
            }
        }).exceptionHandler(event -> {
            // Errors reported on the channel stream itself
            log.error("Error in pubsub for shipments", event);
            MailUtil.sendTechnicalErrorMail("Shipment PubSub failed", "Error in pubsub for shipments:", event);
        });
    }
});

So how can I prevent this error from happening? And how can I set up a try/catch or an exception handler so that I can send out error mails or reinitialise the subscriber when this error occurs? As you can see in the error log, no log entries from my application are visible, only ones from the internal pgclient classes.

s-gal commented 5 months ago

I can imagine this error is very hard to reproduce. Would it at least be possible, in the next vertx version, to surround the place where the NullPointerException happens with a try/catch and print a warning, so that the whole process does not crash? As I said, when this error occurs neither my catch block nor the exception handler is reached, so I am unable to restart the subscriber or to receive a mail saying that the error occurred. Right now I have to check manually every day whether the subscriber is still running or has crashed again.
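Until a fix is available, one possible stopgap for the manual daily check is an application-side watchdog that notices when the subscriber has gone silent. The following is only a minimal sketch under assumptions: `SubscriberWatchdog` is a hypothetical helper and not part of vertx-pg-client, and it presumes something sends a regular heartbeat notification on the channel so that silence really does indicate a dead subscriber.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of vertx-pg-client): reports a subscriber as
 * stale when no notification has been seen for longer than a configured window.
 */
final class SubscriberWatchdog {
    private final long maxSilenceMillis;
    private final Runnable onStale;
    private final AtomicLong lastSeenMillis;

    SubscriberWatchdog(long maxSilenceMillis, long nowMillis, Runnable onStale) {
        this.maxSilenceMillis = maxSilenceMillis;
        this.onStale = onStale;
        this.lastSeenMillis = new AtomicLong(nowMillis);
    }

    /** Call from the channel handler whenever any notification arrives. */
    void touch(long nowMillis) {
        lastSeenMillis.set(nowMillis);
    }

    /**
     * Call periodically. Runs the onStale callback and returns true when the
     * silence window has been exceeded; otherwise returns false.
     */
    boolean check(long nowMillis) {
        if (nowMillis - lastSeenMillis.get() > maxSilenceMillis) {
            onStale.run();
            // Start a fresh window so a restarted subscriber gets time to reconnect.
            lastSeenMillis.set(nowMillis);
            return true;
        }
        return false;
    }
}
```

In this sketch, `touch(System.currentTimeMillis())` would be called inside the channel handler, `check(...)` from a periodic timer such as `vertx.setPeriodic`, and the `onStale` callback would send the error mail and recreate the subscriber. The heartbeat itself could be a periodic `NOTIFY` issued on the same channel; that part is an assumption about the setup, not something described in this report.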

tsegismont commented 2 months ago

Fixed by c4881b439864139ca308ab101619b1658ee39c37