googleapis / java-spanner

Apache License 2.0
54 stars 111 forks source link

spanner.connection.it.ITTransactionRetryTest: testNestedAbortWithConcurrentInsert failed #1984

Closed flaky-bot[bot] closed 1 year ago

flaky-bot[bot] commented 2 years ago

This test failed!

To configure my behavior, see the Flaky Bot documentation.

If I'm commenting on this issue too often, add the flakybot: quiet label and I will stop commenting.


commit: 6abb01652b23112a2eafb87a3819f32c3b36452e buildURL: Build Status, Sponge status: failed

Test output
com.google.cloud.spanner.SpannerException: DEADLINE_EXCEEDED: com.google.api.gax.rpc.DeadlineExceededException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 8.579274198s. [closed=[], open=[[remote_addr=spanner.googleapis.com/74.125.197.95:443]]]
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:291)
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:297)
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:61)
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:181)
    at com.google.cloud.spanner.spi.v1.GapicSpannerRpc.get(GapicSpannerRpc.java:1784)
    at com.google.cloud.spanner.spi.v1.GapicSpannerRpc.executeQuery(GapicSpannerRpc.java:1563)
    at com.google.cloud.spanner.TransactionRunnerImpl$TransactionContextImpl.internalExecuteUpdate(TransactionRunnerImpl.java:710)
    at com.google.cloud.spanner.TransactionRunnerImpl$TransactionContextImpl.executeUpdate(TransactionRunnerImpl.java:694)
    at com.google.cloud.spanner.SessionPool$SessionPoolTransactionContext.executeUpdate(SessionPool.java:730)
    at com.google.cloud.spanner.connection.ReadWriteTransaction.lambda$internalExecuteUpdateAsync$3(ReadWriteTransaction.java:436)
    at com.google.cloud.spanner.connection.ReadWriteTransaction.runWithRetry(ReadWriteTransaction.java:673)
    at com.google.cloud.spanner.connection.ReadWriteTransaction.lambda$internalExecuteUpdateAsync$4(ReadWriteTransaction.java:421)
    at io.grpc.Context$2.call(Context.java:583)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
    Suppressed: com.google.cloud.spanner.connection.AbstractBaseUnitOfWork$SpannerAsyncExecutionException: Execution failed for statement: INSERT INTO TEST (ID, NAME) VALUES (3, 'test 3')
        at com.google.cloud.spanner.connection.AbstractBaseUnitOfWork.executeStatementAsync(AbstractBaseUnitOfWork.java:242)
        at com.google.cloud.spanner.connection.ReadWriteTransaction.internalExecuteUpdateAsync(ReadWriteTransaction.java:417)
        at com.google.cloud.spanner.connection.ReadWriteTransaction.executeUpdateAsync(ReadWriteTransaction.java:404)
        at com.google.cloud.spanner.connection.ConnectionImpl.internalExecuteUpdateAsync(ConnectionImpl.java:1171)
        at com.google.cloud.spanner.connection.ConnectionImpl.executeUpdate(ConnectionImpl.java:977)
        at com.google.cloud.spanner.connection.it.ITTransactionRetryTest.testNestedAbortWithConcurrentInsert(ITTransactionRetryTest.java:981)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at org.junit.runners.Suite.runChild(Suite.java:128)
        at org.junit.runners.Suite.runChild(Suite.java:27)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
        at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
        at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeLazy(JUnitCoreWrapper.java:119)
        at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:87)
        at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
        at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:158)
        at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:456)
        at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:169)
        at org.apache.maven.surefire.booter.ForkedBooter.run(ForkedBooter.java:595)
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:581)
Caused by: java.util.concurrent.ExecutionException: com.google.api.gax.rpc.DeadlineExceededException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 8.579274198s. [closed=[], open=[[remote_addr=spanner.googleapis.com/74.125.197.95:443]]]
    at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:588)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:567)
    at com.google.cloud.spanner.spi.v1.GapicSpannerRpc.get(GapicSpannerRpc.java:1778)
    ... 14 more
Caused by: com.google.api.gax.rpc.DeadlineExceededException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 8.579274198s. [closed=[], open=[[remote_addr=spanner.googleapis.com/74.125.197.95:443]]]
    at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:94)
    at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:41)
    at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:86)
    at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:66)
    at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
    at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:67)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1270)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1038)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:808)
    at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:572)
    at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:542)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:535)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:562)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:743)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:722)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    ... 3 more
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 8.579274198s. [closed=[], open=[[remote_addr=spanner.googleapis.com/74.125.197.95:443]]]
    at io.grpc.Status.asRuntimeException(Status.java:535)
    ... 25 more
flaky-bot[bot] commented 2 years ago

Looks like this issue is flaky. :worried:

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (6abb01652b23112a2eafb87a3819f32c3b36452e), this test passed in one build (Build Status, Sponge) and failed in another build (Build Status, Sponge).

ansh0l commented 2 years ago

cc @harshachinta, please take a look

arpan14 commented 1 year ago

Investigated the test class ITTransactionRetryTest and I don't see any reasons why this issue can happen. It looks like a transient issue on the RPC response time. The query used in this case is a simple insert with an integer primary key and string columns. There are no obvious reasons on if the response delay could be due to row hotspots.

Resolving this issue for now. If this comes up more frequently we will require to investigate row hotspots as a possible root cause.