eclipse-vertx / vertx-sql-client

High performance reactive SQL Client written in Java
Apache License 2.0
888 stars 199 forks source link

"Needed to have 6 in buffer but only had 0" #899

Open gavinking opened 3 years ago

gavinking commented 3 years ago

In the HR test suite I had to disable a test, as seen here:

https://github.com/hibernate/hibernate-reactive/blob/2ff95a963d18c863853f6258490d4ace341eb00c/hibernate-reactive-core/src/test/java/org/hibernate/reactive/OrderedEmbeddableCollectionTest.java#L32

because of an intermittent error that only occurred on DB2 when running the whole test suite (and not the single test).

From the error message it looks like this is something y'all know about already.

ERROR errors [vert.x-eventloop-thread-0] failed to execute statement [select orderedemb0_.id as id1_0_0_, orderedemb0_.title as title2_0_0_ from EAuthor orderedemb0_ where orderedemb0_.id=?]
ERROR errors [vert.x-eventloop-thread-0] could not load an entity: [org.hibernate.reactive.OrderedEmbeddableCollectionTest$Author#1]
java.util.concurrent.CompletionException: java.lang.IllegalStateException: Needed to have 6 in buffer but only had 0. In JDBC we would normally block here but need to find a non-blocking solution
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632) ~[?:?]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) ~[?:?]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) ~[?:?]
    at org.hibernate.reactive.pool.impl.Handlers.lambda$toCompletionStage$0(Handlers.java:29) ~[main/:?]
    at io.vertx.sqlclient.impl.SqlResultHandler.fail(SqlResultHandler.java:103) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.sqlclient.impl.SqlResultHandler.handle(SqlResultHandler.java:89) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.sqlclient.impl.SqlResultHandler.handle(SqlResultHandler.java:33) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.sqlclient.impl.TransactionImpl.lambda$null$3(TransactionImpl.java:181) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.sqlclient.impl.TransactionImpl.lambda$rollback$2(TransactionImpl.java:140) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.sqlclient.impl.SocketConnectionBase.handleMessage(SocketConnectionBase.java:241) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.sqlclient.impl.SocketConnectionBase.lambda$init$0(SocketConnectionBase.java:88) ~[vertx-sql-client-3.9.5.jar:3.9.5]
    at io.vertx.core.net.impl.NetSocketImpl.lambda$new$2(NetSocketImpl.java:101) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.streams.impl.InboundBuffer.handleEvent(InboundBuffer.java:237) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.streams.impl.InboundBuffer.write(InboundBuffer.java:127) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.net.impl.NetSocketImpl.handleMessage(NetSocketImpl.java:357) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.impl.ContextImpl.executeTask(ContextImpl.java:366) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.impl.EventLoopContext.execute(EventLoopContext.java:43) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:229) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:163) ~[vertx-core-3.9.5.jar:3.9.5]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) ~[netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) ~[netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) ~[netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) ~[netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.vertx.db2client.impl.codec.DB2Encoder.lambda$write$0(DB2Encoder.java:72) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.codec.DB2Decoder.decodePayload(DB2Decoder.java:97) [vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.codec.DB2Decoder.decode(DB2Decoder.java:53) [vertx-db2-client-3.9.5.jar:3.9.5]
    at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:501) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:440) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:276) [netty-codec-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.49.Final.jar:4.1.49.Final]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.49.Final.jar:4.1.49.Final]
    at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.IllegalStateException: Needed to have 6 in buffer but only had 0. In JDBC we would normally block here but need to find a non-blocking solution
    at io.vertx.db2client.impl.drda.DRDAResponse.ensureALayerDataInBuffer(DRDAResponse.java:1462) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.drda.DRDAResponse.readDssHeader(DRDAResponse.java:878) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.drda.DRDAResponse.startSameIdChainParse(DRDAResponse.java:52) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.drda.DRDAQueryResponse.readDescribeInput(DRDAQueryResponse.java:81) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.drda.DRDAQueryResponse.readPrepareDescribeInputOutput(DRDAQueryResponse.java:53) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.codec.PrepareStatementCodec.decodePayload(PrepareStatementCodec.java:76) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    at io.vertx.db2client.impl.codec.DB2Decoder.decodePayload(DB2Decoder.java:80) ~[vertx-db2-client-3.9.5.jar:3.9.5]
    ... 21 more
vietj commented 3 years ago

it's an issue I was not aware of and that is specific to DB2, I believe we need to engage with somebody familiar.

mswatosh commented 3 years ago

I'll start taking a look at this.

vietj commented 3 years ago

thank you Mark

On Tue, Feb 23, 2021 at 4:43 PM Mark Swatosh notifications@github.com wrote:

I'll start taking a look at this.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/eclipse-vertx/vertx-sql-client/issues/899#issuecomment-784295171, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABXDCXSCPOKWFCDDKSTZUTTAPEL7ANCNFSM4YAP2FUQ .

vietj commented 3 years ago

@mswatosh any progress on this ?

vietj commented 3 years ago

@mswatosh have you had the chance to investigate this one ?

mswatosh commented 3 years ago

It looks like the issue here is sometimes we haven't received the full response over the network, so this method (ensureALayerDataInBuffer) needs to be updated to 'wait' for enough of the response to continue, but without blocking the thread.

DavideD commented 2 years ago

FYI: This error reappeared after these changes: https://github.com/eclipse-vertx/vertx-sql-client/pull/1204

mswatosh commented 2 years ago

I tried this test locally and I'm able to see the issue. It doesn't appear to be intermittent, it occurs consistently for me, even if I reduce the test case to:

test(
    context,
    getMutinySessionFactory()
        .withTransaction( (session, transaction) -> session.persistAll( author ) )
        .chain( () -> getMutinySessionFactory()
            .withTransaction( (session, transaction) -> session.find( Author.class, author.id )

What's interesting is from this point if I remove either of the .withTransaction statements the test passes.

I'm also seeing a couple of additional exceptions:

Cause 2: java.lang.IllegalStateException: Invalid correlator ID. Got 2 expected 1
    at io.vertx.db2client.impl.drda.DRDAResponse.readDssHeader(DRDAResponse.java:946)
    at io.vertx.db2client.impl.drda.DRDAResponse.startSameIdChainParse(DRDAResponse.java:52)
    at io.vertx.db2client.impl.drda.DRDAQueryResponse.readExecuteImmediate(DRDAQueryResponse.java:230)
    at io.vertx.db2client.impl.codec.SimpleQueryCommandCodec.decodeUpdate(SimpleQueryCommandCodec.java:57)
    at io.vertx.db2client.impl.codec.QueryCommandBaseCodec.decodePayload(QueryCommandBaseCodec.java:71)
    at io.vertx.db2client.impl.codec.DB2Decoder.decodePayload(DB2Decoder.java:79)
    at io.vertx.db2client.impl.codec.DB2Decoder.decode(DB2Decoder.java:52)

as well as

Caused by: java.lang.IllegalStateException: Found unknown codepoint: 0x2411 / 9233
        at io.vertx.db2client.impl.drda.DRDAResponse.throwUnknownCodepoint(DRDAResponse.java:847)
        at io.vertx.db2client.impl.drda.DRDAQueryResponse.parseExecuteImmediateError(DRDAQueryResponse.java:1217)
        at io.vertx.db2client.impl.drda.DRDAQueryResponse.parseEXCSQLIMMreply(DRDAQueryResponse.java:1174)
        at io.vertx.db2client.impl.drda.DRDAQueryResponse.readExecuteImmediate(DRDAQueryResponse.java:231)
        at io.vertx.db2client.impl.codec.SimpleQueryCommandCodec.decodeUpdate(SimpleQueryCommandCodec.java:57)
        at io.vertx.db2client.impl.codec.QueryCommandBaseCodec.decodePayload(QueryCommandBaseCodec.java:71)
        at io.vertx.db2client.impl.codec.DB2Decoder.decodePayload(DB2Decoder.java:79)

The strange thing about the last exception is 0x2411 is a SQLDARD command, which is being parsed as a response to an EXCSQLIMM command. If I look at the wireshark trace, all of the EXCSQLIMM are receiving ENDUOWNM like they should.

So I think the issue here might be the response packets are getting mixed up and ending up at the wrong Db2Decoder.

DavideD commented 2 years ago

What's interesting is from this point if I remove either of the .withTransaction statements the test passes.

Interesting indeed. The second transaction is actually optional but now I don't want to change it if it makes db2 fail :)

DavideD commented 2 years ago

We can refactor the tests. Anyway, let us know if we can help on the Hibernate Reactive front

DavideD commented 1 year ago

FYI: In Hibernate Reactive, more tests are failing because of this error after the upgrade to ORM 6.2.