apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Bug]: Fail to retrieve rowcount for first arrow chunk: null. #24374

Open · parakhjain opened 1 year ago

parakhjain commented 1 year ago

What happened?

While running a batch write with SnowflakeIO on the DirectRunner, the pipeline fails with the error "Fail to retrieve rowcount for first arrow chunk: null." The full traceback is below, followed by a simplified sketch of the write.

```
Traceback (most recent call last):
  File "/home/parakh_jain/dataflow/snowflake_test.py", line 93, in <module>
    run()
  File "/home/parakh_jain/dataflow/snowflake_test.py", line 90, in run
    run_read()
  File "/home/parakh_jain/dataflow/snowflake_test.py", line 86, in run_read
    result = p.run()
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/pipeline.py", line 574, in run
    return self.runner.run_pipeline(self, self._options)
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/direct/direct_runner.py", line 131, in run_pipeline
    return runner.run_pipeline(pipeline, options)
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 199, in run_pipeline
    self._latest_run_result = self.run_via_runner_api(
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 212, in run_via_runner_api
    return self.run_stages(stage_context, stages)
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 442, in run_stages
    bundle_results = self._execute_bundle(
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 770, in _execute_bundle
    self._run_bundle(
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 999, in _run_bundle
    result, splits = bundle_manager.process_bundle(
  File "/home/parakh_jain/dataflow/env_s/lib/python3.9/site-packages/apache_beam/runners/portability/fn_api_runner/fn_runner.py", line 1348, in process_bundle
    raise RuntimeError(result.error)
RuntimeError: org.apache.beam.sdk.util.UserCodeException: net.snowflake.client.jdbc.SnowflakeSQLLoggedException: JDBC driver internal error: Fail to retrieve rowcount for first arrow chunk: null.
    at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:39)
    at org.apache.beam.sdk.io.snowflake.SnowflakeIO$CopyToTableFn$DoFnInvoker.invokeProcessElement(Unknown Source)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForParDo(FnApiDoFnRunner.java:801)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1791)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.access$3000(FnApiDoFnRunner.java:145)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$NonWindowObservingProcessBundleContext.outputWithTimestamp(FnApiDoFnRunner.java:2360)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$ProcessBundleContextBase.output(FnApiDoFnRunner.java:2530)
    at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:78)
    at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:142)
    at org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForParDo(FnApiDoFnRunner.java:801)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1791)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.access$3000(FnApiDoFnRunner.java:145)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$WindowObservingProcessBundleContext.outputWithTimestamp(FnApiDoFnRunner.java:2243)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$ProcessBundleContextBase.output(FnApiDoFnRunner.java:2530)
    at org.apache.beam.sdk.transforms.Reify$ReifyView$1.process(Reify.java:60)
    at org.apache.beam.sdk.transforms.Reify$ReifyView$1$DoFnInvoker.invokeProcessElement(Unknown Source)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForWindowObservingParDo(FnApiDoFnRunner.java:814)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1791)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.access$3000(FnApiDoFnRunner.java:145)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$WindowObservingProcessBundleContext.outputWithTimestamp(FnApiDoFnRunner.java:2243)
    at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.outputWithTimestamp(DoFnOutputReceivers.java:87)
    at org.apache.beam.sdk.io.Read$BoundedSourceAsSDFWrapperFn.processElement(Read.java:317)
    at org.apache.beam.sdk.io.Read$BoundedSourceAsSDFWrapperFn$DoFnInvoker.invokeProcessElement(Unknown Source)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForWindowObservingSizedElementAndRestriction(FnApiDoFnRunner.java:1091)
    at org.apache.beam.fn.harness.FnApiDoFnRunner.access$1500(FnApiDoFnRunner.java:145)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$4.accept(FnApiDoFnRunner.java:658)
    at org.apache.beam.fn.harness.FnApiDoFnRunner$4.accept(FnApiDoFnRunner.java:653)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:313)
    at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:245)
    at org.apache.beam.fn.harness.BeamFnDataReadRunner.forwardElementToConsumer(BeamFnDataReadRunner.java:213)
    at org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver2.multiplexElements(BeamFnDataInboundObserver2.java:158)
    at org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver2.awaitCompletion(BeamFnDataInboundObserver2.java:123)
    at org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:546)
    at org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:151)
    at org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:116)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at org.apache.beam.sdk.util.UnboundedScheduledExecutorService$ScheduledFutureTask.run(UnboundedScheduledExecutorService.java:162)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: net.snowflake.client.jdbc.SnowflakeSQLLoggedException: JDBC driver internal error: Fail to retrieve row count for first arrow chunk: null.
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.setFirstChunkRowCountForArrow(SnowflakeResultSetSerializableV1.java:981)
    at net.snowflake.client.jdbc.SnowflakeResultSetSerializableV1.create(SnowflakeResultSetSerializableV1.java:516)
    at net.snowflake.client.core.SFResultSetFactory.getResultSet(SFResultSetFactory.java:28)
    at net.snowflake.client.core.SFStatement.executeQueryInternal(SFStatement.java:255)
    at net.snowflake.client.core.SFStatement.executeQuery(SFStatement.java:171)
    at net.snowflake.client.core.SFStatement.execute(SFStatement.java:754)
    at net.snowflake.client.jdbc.SnowflakeStatementV1.executeQueryInternal(SnowflakeStatementV1.java:245)
    at net.snowflake.client.jdbc.SnowflakePreparedStatementV1.executeQuery(SnowflakePreparedStatementV1.java:117)
    at org.apache.beam.sdk.io.snowflake.services.SnowflakeBatchServiceImpl.runStatement(SnowflakeBatchServiceImpl.java:286)
    at org.apache.beam.sdk.io.snowflake.services.SnowflakeBatchServiceImpl.runConnectionWithStatement(SnowflakeBatchServiceImpl.java:276)
    at org.apache.beam.sdk.io.snowflake.services.SnowflakeBatchServiceImpl.createTableIfNotExists(SnowflakeBatchServiceImpl.java:231)
    at org.apache.beam.sdk.io.snowflake.services.SnowflakeBatchServiceImpl.prepareTableAccordingCreateDisposition(SnowflakeBatchServiceImpl.java:204)
    at org.apache.beam.sdk.io.snowflake.services.SnowflakeBatchServiceImpl.copyToTable(SnowflakeBatchServiceImpl.java:135)
    at org.apache.beam.sdk.io.snowflake.services.SnowflakeBatchServiceImpl.write(SnowflakeBatchServiceImpl.java:48)
    at org.apache.beam.sdk.io.snowflake.SnowflakeIO$CopyToTableFn.processElement(SnowflakeIO.java:1262)
```
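For reference, the write is essentially the minimal batch pipeline sketched below. This is a simplified sketch, not the original snowflake_test.py: every connection value is a placeholder, and it assumes the cross-language WriteToSnowflake transform from apache_beam.io.snowflake, whose keyword names may differ slightly between Beam releases.

```python
# Simplified sketch of the failing batch write (placeholder connection values).
# Assumes the cross-language WriteToSnowflake transform from apache_beam.io.snowflake;
# exact keyword names and accepted values may vary between Beam versions.
import apache_beam as beam
from apache_beam.io.snowflake import WriteToSnowflake
from apache_beam.options.pipeline_options import PipelineOptions


def user_data_mapper(row):
    # SnowflakeIO expects each element mapped to a list of column values.
    return [str(row["id"]), row["name"]]


def run():
    options = PipelineOptions()  # DirectRunner by default
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Create" >> beam.Create([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
            | "WriteToSnowflake" >> WriteToSnowflake(
                server_name="<account>.snowflakecomputing.com",
                username="<username>",
                password="<password>",
                schema="<schema>",
                database="<database>",
                staging_bucket_name="gs://<staging-bucket>/",
                storage_integration_name="<storage_integration>",
                create_disposition="CREATE_IF_NEEDED",
                write_disposition="APPEND",
                # JSON table schema (approximate shape; required when the table
                # may need to be created by the connector).
                table_schema='{"schema":[{"dataType":{"type":"integer"},"name":"id","nullable":true},'
                             '{"dataType":{"type":"text"},"name":"name","nullable":true}]}',
                user_data_mapper=user_data_mapper,
                table="<table>",
            )
        )


if __name__ == "__main__":
    run()
```

Judging by the Caused by chain, the failure happens before any data is copied: prepareTableAccordingCreateDisposition / createTableIfNotExists runs a query through the Snowflake JDBC driver, and that query dies while the driver reads the first Arrow result chunk.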

Issue Priority

Priority: 1

Issue Component

Component: io-java-snowflake

kennknowles commented 1 year ago

I'm not sure who is best placed to comment on SnowflakeIO or its cross-language wrapper, but I will ping @chamikaramj in case this is some generalized cross-language issue. Otherwise, we can leave this here for the Snowflake owners.

turb commented 2 months ago

I'm currently writing the Scio wrapper for this (in Scala, over the Java library) and ran into the same error.