GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

[Bug]: Spanner staging tests are breaking for ImportPipelineIT and ExportPipelineIT #1807

Open Polber opened 3 weeks ago

Polber commented 3 weeks ago

Related Template(s)

Cloud_Spanner_to_GCS_Avro, GCS_Avro_to_Cloud_Spanner

Template Version

HEAD

What happened?

The ImportPipelineIT and ExportPipelineIT tests have been failing for Spanner Staging cases. There were a series of errors like the following:

Caused by: java.lang.IllegalStateException: Column 1 is not of correct type: expected BOOL but was STRING
    com.google.common.base.Preconditions.checkState(Preconditions.java:857)
    com.google.cloud.spanner.AbstractStructReader.checkNonNullOfType(AbstractStructReader.java:616)
    com.google.cloud.spanner.AbstractStructReader.getBoolean(AbstractStructReader.java:155)
    com.google.cloud.teleport.spanner.ddl.InformationSchemaScanner.listPlacements(InformationSchemaScanner.java:1450)
    com.google.cloud.teleport.spanner.ddl.InformationSchemaScanner.scan(InformationSchemaScanner.java:105)
    com.google.cloud.teleport.spanner.ReadInformationSchema$ReadInformationSchemaFn.processElement(ReadInformationSchema.java:96)

These errors seem to have been resolved as of #1792; however, the log output below shows the current failure.

Since this appears to occur only on the staging Spanner tests, it may be due to an incompatible change on the staging Spanner instance. In that case, disabling these tests would be a temporary fix until a patch is released or the ITs are updated.

Relevant log output

org.apache.beam.it.gcp.spanner.SpannerResourceManagerException: Failed to execute statement.
    at org.apache.beam.it.gcp.spanner.SpannerResourceManager.executeDdlStatements(SpannerResourceManager.java:307)
    at org.apache.beam.it.gcp.spanner.SpannerResourceManager.executeDdlStatement(SpannerResourceManager.java:283)
    at com.google.cloud.teleport.spanner.ExportPipelineIT.testSpannerToGCSAvroBase(ExportPipelineIT.java:210)
    at com.google.cloud.teleport.spanner.ExportPipelineIT.testSpannerToGCSAvroStaging(ExportPipelineIT.java:168)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:410)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.ExecutionException: com.google.cloud.spanner.SpannerException: INVALID_ARGUMENT: Operation with name "projects/cloud-teleport-testing/instances/teleport/databases/testspa_20240819_204705_ruvcft/operations/r58f2a880_153b_470d_a95a_66c0b992ae2a" failed with status = GrpcStatusCode{transportCode=INVALID_ARGUMENT} and message = Token key MyTokens in search index testSpannerToGCSAvroStaging_SearchIndex does not support order.
    at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:594)
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:573)
    at com.google.common.util.concurrent.FluentFuture$TrustedFuture.get(FluentFuture.java:91)
    at com.google.common.util.concurrent.ForwardingFuture.get(ForwardingFuture.java:67)
    at com.google.api.gax.longrunning.OperationFutureImpl.get(OperationFutureImpl.java:125)
    at org.apache.beam.it.gcp.spanner.SpannerResourceManager.executeDdlStatements(SpannerResourceManager.java:304)
    ... 26 more
Caused by: com.google.cloud.spanner.SpannerException: INVALID_ARGUMENT: Operation with name "projects/cloud-teleport-testing/instances/teleport/databases/testspa_20240819_204705_ruvcft/operations/r58f2a880_153b_470d_a95a_66c0b992ae2a" failed with status = GrpcStatusCode{transportCode=INVALID_ARGUMENT} and message = Token key MyTokens in search index testSpannerToGCSAvroStaging_SearchIndex does not support order.
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerExceptionPreformatted(SpannerExceptionFactory.java:291)
    at com.google.cloud.spanner.SpannerExceptionFactory.fromApiException(SpannerExceptionFactory.java:311)
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:174)
    at com.google.cloud.spanner.SpannerExceptionFactory.newSpannerException(SpannerExceptionFactory.java:110)
    at com.google.cloud.spanner.DatabaseAdminClientImpl.lambda$updateDatabaseDdl$11(DatabaseAdminClientImpl.java:470)
    at com.google.api.core.ApiFutures$ApiFunctionToGuavaFunction.apply(ApiFutures.java:396)
    at com.google.common.util.concurrent.AbstractCatchingFuture$CatchingFuture.doFallback(AbstractCatchingFuture.java:237)
    at com.google.common.util.concurrent.AbstractCatchingFuture$CatchingFuture.doFallback(AbstractCatchingFuture.java:225)
    at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:135)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:809)
    at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:129)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
    at com.google.api.gax.retrying.BasicRetryingFuture.handleAttempt(BasicRetryingFuture.java:203)
    at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.handle(CallbackChainRetryingFuture.java:135)
    at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.run(CallbackChainRetryingFuture.java:115)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
    at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.setResult(AbstractTransformFuture.java:259)
    at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:171)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
    at com.google.api.gax.retrying.BasicRetryingFuture.handleAttempt(BasicRetryingFuture.java:203)
    at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.handle(CallbackChainRetryingFuture.java:135)
    at com.google.api.gax.retrying.CallbackChainRetryingFuture$AttemptCompletionListener.run(CallbackChainRetryingFuture.java:115)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
    at com.google.api.core.AbstractApiFuture$InternalSettableFuture.set(AbstractApiFuture.java:87)
    at com.google.api.core.AbstractApiFuture.set(AbstractApiFuture.java:70)
    at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onSuccess(GrpcExceptionCallable.java:88)
    at com.google.api.core.ApiFutures$1.onSuccess(ApiFutures.java:89)
    at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1137)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1298)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:1059)
    at com.google.common.util.concurrent.AbstractFuture.set(AbstractFuture.java:784)
    at io.grpc.stub.ClientCalls$GrpcFuture.set(ClientCalls.java:563)
    at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:536)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.api.gax.grpc.ChannelPool$ReleasingClientCall$1.onClose(ChannelPool.java:570)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor$1$1.onClose(SpannerErrorInterceptor.java:100)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at io.grpc.census.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:814)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at io.grpc.census.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:494)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:574)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:72)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:742)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:723)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
    ... 3 more

liferoad commented 3 weeks ago

https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/1808 disables this for now.

sgorse123 commented 3 weeks ago

The original issue should be fixed in https://github.com/GoogleCloudPlatform/DataflowTemplates/pull/1792

I believe the issue was due to the rollout of a change that introduced new information schema views for Postgres dialect databases in our staging environment, but with a different type for the is_default column.

Should be fixed by the following logic: https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/e58646a1d11dfe5036b0719c5add1bd7f1d8a459/v1/src/main/java/com/google/cloud/teleport/spanner/ddl/InformationSchemaScanner.java#L1472-L1484
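
For readers who do not want to follow the link, here is a minimal sketch of the general idea: accept the flag whether it arrives as a BOOL or as a STRING, instead of calling getBoolean unconditionally. This is not the code from the linked lines. The class name IsDefaultFlag, the parse helper, and the "YES"/"TRUE" string spellings are all assumptions made for illustration; the thread only says that the column type differs.

```java
import java.util.Locale;

/** Hypothetical helper; not part of DataflowTemplates or the Spanner client library. */
public final class IsDefaultFlag {
  private IsDefaultFlag() {}

  /**
   * Interprets a raw is_default value that may arrive as a Boolean or a String.
   * The "YES"/"TRUE" spellings are assumed for illustration only.
   */
  public static boolean parse(Object raw) {
    if (raw instanceof Boolean) {
      // Column exposed as BOOL.
      return (Boolean) raw;
    }
    if (raw instanceof String) {
      // Column exposed as STRING: normalize before comparing.
      String normalized = ((String) raw).trim().toUpperCase(Locale.ROOT);
      return normalized.equals("YES") || normalized.equals("TRUE");
    }
    // Fail loudly on anything else (including null) rather than guessing.
    throw new IllegalStateException("Unexpected is_default value: " + raw);
  }

  public static void main(String[] args) {
    System.out.println(parse(Boolean.TRUE));
    System.out.println(parse("YES"));
    System.out.println(parse("no"));
  }
}
```

In a real scanner the branch would more likely key off the database dialect or the column type reported by the result set; the Object-based signature here only keeps the sketch free of external dependencies.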

The second issue seems to be related to token indexes.

liferoad commented 3 weeks ago

This breaks our tests, so I moved it to P1.