microsoft / sql-spark-connector

Apache Spark Connector for SQL Server and Azure SQL
Apache License 2.0

differing column nullable configurations when writing data that was immediately read #231

Open james-camacho-ab opened 1 year ago

james-camacho-ab commented 1 year ago

I'm attempting to test this library by reading data from our Azure SQL DB into a DataFrame and immediately writing it back to the DB. The read works fine, but the write (using overwrite mode, which our use cases require) fails with:

java.sql.SQLException: Spark Dataframe and SQL Server table have differing column nullable configurations at column index 0
DF col WholesalerNumber nullable config is true
Table col WholesalerNumber nullable config is false

How can we resolve this?

Some additional code for context:

# Read the table into a DataFrame, then immediately write it back
df = spark.read.format("com.microsoft.sqlserver.jdbc.spark").options(**options).load()
df.write.format("com.microsoft.sqlserver.jdbc.spark").mode("overwrite").options(**options).save()

where options is a dictionary containing the Key Vault secrets for the DB connection.
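One workaround I've been considering is rebuilding the DataFrame with the nullability flags forced to match the table before writing. A rough sketch (the blanket non-nullable assumption is just for illustration; in practice this should mirror the actual table definition per column):

from pyspark.sql.types import StructField, StructType

# Copy the existing schema but mark every field non-nullable
# to match the target table's column definitions.
target_schema = StructType(
    [StructField(f.name, f.dataType, nullable=False) for f in df.schema.fields]
)
df_fixed = spark.createDataFrame(df.rdd, target_schema)

df_fixed.write.format("com.microsoft.sqlserver.jdbc.spark").mode("overwrite").options(**options).save()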

Running this within a Databricks environment with DBR 12.2 LTS (Spark 3.3.2, Scala 2.12). The connector is installed directly on the cluster via Maven: com.microsoft.azure:spark-mssql-connector_2.12:1.3.0-BETA.
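I also see a schemaCheckEnabled option in the connector's README that is supposed to disable the strict DataFrame/table schema check. Assuming it behaves as documented, something like this might sidestep the nullability comparison entirely:

# schemaCheckEnabled=false skips the strict schema match (per the README);
# whether that is safe for our data is a separate question.
df.write.format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("schemaCheckEnabled", "false") \
    .options(**options) \
    .save()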