GoogleCloudDataproc / spark-bigquery-connector

BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.
Apache License 2.0

Indirect write to existing datetime column not possible #1232

Closed: cheare closed this issue 1 month ago

cheare commented 1 month ago

Hi,

I need to write a Spark TimestampNTZ column to a BigQuery table. The DATETIME type would be ideal, but I'm getting the following error: "pyspark.errors.exceptions.captured.IllegalArgumentException: Data type not expected: timestamp_ntz".

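A minimal repro sketch of the failing write (the bucket and table names are placeholders, not from my actual job):

```python
from datetime import datetime
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, TimestampNTZType

spark = SparkSession.builder.appName("ntz-repro").getOrCreate()

# TimestampNTZ (Spark 3.4+) is the natural match for a BigQuery DATETIME column
schema = StructType([StructField("event_dt", TimestampNTZType(), True)])
df = spark.createDataFrame([(datetime(2024, 5, 1, 12, 0),)], schema)

# Fails with "Data type not expected: timestamp_ntz" on
# spark-bigquery-with-dependencies_2.13-0.36.2
(df.write.format("bigquery")
   .option("writeMethod", "indirect")
   .option("temporaryGcsBucket", "my-temp-bucket")  # placeholder bucket
   .save("my_project.my_dataset.my_table"))         # placeholder table
```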

When I cast the column to StringType, the target column type becomes STRING (even for an existing, empty table). With a non-empty target table, the write fails with: "com.google.cloud.bigquery.connector.common.BigQueryConnectorException$InvalidSchemaException: Destination table's schema is not compatible with dataframe's schema".

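Continuing the sketch above, the StringType cast I tried:

```python
from pyspark.sql.functions import col

# Workaround attempt: cast the TimestampNTZ column to StringType.
# With an empty target table the column is then created as STRING;
# with a non-empty table whose column is DATETIME, the write raises
# InvalidSchemaException because the schemas no longer match.
# df is the DataFrame from the repro above.
df_str = df.withColumn("event_dt", col("event_dt").cast("string"))
```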

The preferred method for me is the indirect write, but this issue occurs with the direct write as well.

Spark version: 3.4.3
Connector: com.google.cloud.spark_spark-bigquery-with-dependencies_2.13-0.36.2.jar

isha97 commented 1 month ago

Hi @cheare ,

Please use the dsv2 connector for TimestampNTZ support: com.google.cloud.spark:spark-3.4-bigquery:0.36.2
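A minimal sketch of the suggested fix, assuming the dsv2 artifact is pulled via spark.jars.packages (bucket and table names are placeholders):

```python
from datetime import datetime
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, TimestampNTZType

# Pull the dsv2 artifact instead of spark-bigquery-with-dependencies;
# it supports the TimestampNTZ type.
spark = (SparkSession.builder
         .appName("ntz-write")
         .config("spark.jars.packages",
                 "com.google.cloud.spark:spark-3.4-bigquery:0.36.2")
         .getOrCreate())

schema = StructType([StructField("event_dt", TimestampNTZType(), True)])
df = spark.createDataFrame([(datetime(2024, 5, 1, 12, 0),)], schema)

# The same indirect write now succeeds; TimestampNTZ maps to DATETIME.
(df.write.format("bigquery")
   .option("writeMethod", "indirect")
   .option("temporaryGcsBucket", "my-temp-bucket")  # placeholder bucket
   .save("my_project.my_dataset.my_table"))         # placeholder table
```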

cheare commented 1 month ago

Thank you @isha97! With com.google.cloud.spark:spark-3.4-bigquery:0.36.2 everything works fine :)