microsoft / sql-spark-connector

Apache Spark Connector for SQL Server and Azure SQL
Apache License 2.0

Error while writing - com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. #238

Closed · saurabh81192 closed this 1 year ago

saurabh81192 commented 1 year ago

Below is the code I am using. Am I missing something here? The same code runs fine if I use jdbc as the write format.

df.write \
  .format("com.microsoft.sqlserver.jdbc.spark") \
  .mode("overwrite") \
  .option("truncate", "true") \
  .option("url", url) \
  .option("dbtable", Tablenamewithschema) \
  .option("user", user) \
  .option("password", password) \
  .option("reliabilityLevel", "BEST_EFFORT") \
  .option("tableLock", "true") \
  .option("isolationLevel", "READ_COMMITTED") \
  .option("batchsize", "100000") \
  .option("schemaCheckEnabled", "false") \
  .save()
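
For comparison, here is roughly what the working jdbc-format write looks like. This is a minimal sketch, not the poster's exact code: the thread does not show that version, so the option list and the driver option are assumptions; it reuses the url, user, password, and Tablenamewithschema variables from above.

# Sketch of the built-in JDBC write path that reportedly succeeds.
# The driver class and the exact set of options are assumptions.
df.write \
  .format("jdbc") \
  .mode("overwrite") \
  .option("truncate", "true") \
  .option("url", url) \
  .option("dbtable", Tablenamewithschema) \
  .option("user", user) \
  .option("password", password) \
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
  .option("batchsize", "100000") \
  .save()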

With the connector format, however, I get the error below:

   Error while Loading the data into a table -
    : org.apache.spark.SparkException: Job aborted due to stage failure: Task 28 in stage 2.0 failed 4 times, most recent failure: Lost task 28.3 in stage 2.0 (TID 54):
    com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.

I created a new cluster with the configuration below; the only library installed on it is the connector, from the Maven coordinate com.microsoft.azure:spark-mssql-connector_2.12:1.2.0.

8 worker nodes, each with 14 GB of memory and 4 cores. Databricks Runtime version: 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12).

saurabh81192 commented 1 year ago

The issue is caused by the datetime data type. SQL Server's datetime type only allows three fractional-second digits, while a Spark DataFrame timestamp may carry up to six (microsecond precision). Try using datetime2 for the SQL column instead of datetime.
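
If changing the column type is not immediately possible, another workaround is to truncate the DataFrame's timestamps to millisecond precision before writing, so they fit datetime's three fractional digits. A minimal sketch, assuming Spark 3.1+ (where the unix_millis and timestamp_millis SQL functions are available) and a hypothetical timestamp column named event_ts:

from pyspark.sql import functions as F

# Truncate the timestamp to whole milliseconds: convert to epoch
# milliseconds and back. "event_ts" is a hypothetical column name;
# substitute your own.
df_ms = df.withColumn(
    "event_ts",
    F.expr("timestamp_millis(unix_millis(event_ts))"),
)
# df_ms can then be written with the same connector options as above.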