microsoft / sql-spark-connector

Apache Spark Connector for SQL Server and Azure SQL
Apache License 2.0

Error - while writing to Sql DB #204

Closed akhilkoesis closed 1 year ago

akhilkoesis commented 2 years ago

Hi,

Getting the following error while writing a DataFrame to SQL Server:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 27 in stage 3.0 failed 4 times, most recent failure: Lost task 27.3 in stage 3.0 (TID 202) (10.40.64.38 executor 0): com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.

Code -

url = server_name + ":1433;" + "databaseName=" + database_name + ";"

df.write \
  .format("com.microsoft.sqlserver.jdbc.spark") \
  .mode("overwrite") \
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
  .option("url", url) \
  .option("dbtable", table_name) \
  .option("user", username) \
  .option("password", password) \
  .save()

moredatapls commented 2 years ago

It might be that your client is unable to connect if your server is under high load. You could consider increasing your login timeout:

url = server_name + ":1433;" + "databaseName=" + database_name + ";" + "loginTimeout=60;"

(maybe even use a timeout greater than 60)

But without knowing your exact setup and without a full reproducible example, it's hard to know what is happening in your case.
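For reference, the resulting connection string with the extra property would look like this (a minimal sketch; the server and database names are hypothetical, since the real values come from Key Vault):

```python
# Hypothetical values -- in the real setup these come from Azure Key Vault
server_name = "jdbc:sqlserver://myserver.database.windows.net"
database_name = "mydb"

# Same concatenation as in the original code, plus loginTimeout (in seconds)
url = server_name + ":1433;" + "databaseName=" + database_name + ";" + "loginTimeout=60;"
print(url)
```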

akhilkoesis commented 2 years ago

Hi, loginTimeout didn't work.

Below are the details: I'm trying to connect to a Hyperscale Standard-series (Gen5), 4 vCores Azure SQL DB. The above code is written in Azure Databricks on a 10.4 LTS Spark cluster (also tried on a 9.1 LTS cluster). All the parameters like server, user, etc. come from Azure Key Vault.

I'm able to read the data with the Spark connector, but not able to write. It now throws the error below (within 2 minutes).

The TCP/IP connection to the host [REDACTED]databaseName=[REDACTED], port 1433 has failed. Error: "[REDACTED]databaseName=[REDACTED]. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall."

Let me know for more details.


moredatapls commented 2 years ago

The TCP/IP connection to the host [REDACTED]databaseName=[REDACTED], port 1433 has failed. Error: "[REDACTED]databaseName=[REDACTED]. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall."

@akhilkoesis are you running your SQL Server in Azure? If so, did you check whether the firewall allows incoming connections from your host that runs the code? If you're running the code locally: can you connect to the SQL Server instance manually, e.g. with SSMS?
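If checking manually with SSMS is awkward from the cluster, a quick TCP reachability probe run on the Spark driver node can rule out a network block (a minimal sketch, independent of the connector; the host name below is hypothetical):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    # Attempt a plain TCP connect, similar to what the JDBC driver does
    # before raising the "TCP/IP connection to the host ... has failed" error
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from the same cluster that executes the write:
# print(can_reach("myserver.database.windows.net", 1433))
```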

akhilkoesis commented 2 years ago

Yes, it is an Azure SQL DB. Yes, the firewall is allowing other JDBC/ODBC connectors; also, I'm able to connect with the same functional account via SSMS.

moredatapls commented 2 years ago

Have you tried connecting to a different database? Or a different server entirely? Or maybe another user?

Also, one thing I only noticed now: you don't need to specify the "driver" argument: .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver"). Maybe try it again without it.

copzyz123 commented 2 years ago

Having the same issue here. Not sure whether it is a bug in this driver, but when I switch back to the built-in JDBC driver (removing the driver option) everything looks normal again.

moredatapls commented 2 years ago

@copzyz123 @akhilkoesis do you maybe have a reproducible example, including some test data that you're uploading? Otherwise it's hard to figure out what exactly is going wrong.

rimelis commented 2 years ago

I can confirm the presence of this error. It's thrown after 1-2 seconds, so there are no timeouts. The built-in (and slower) JDBC driver works normally.

copzyz123 commented 2 years ago

So I am using Synapse to run this, and I have also installed the latest 1.2 jar package. The code looks like this: [screenshot]

Then I use a small dataframe to test it, like this: [screenshot]

Finally I run the code and try to append 100 rows to SQL, and then I get this: [screenshot]

But if I choose to use the default JDBC connector: [screenshot]

It can successfully load the data and push it to SQL: [screenshot]

@moredatapls please let me know if that helps

moredatapls commented 2 years ago

@copzyz123 do you maybe have the complete code, including some sample data? And could you also provide the details on how you set up SQL Server?

Regarding the reproducible example: try to reduce the code to the bare minimum that still produces this error. Remove all code on your side that might be involved in this issue and is not needed to reproduce the problem.

[screenshot quoted from above]

Regarding the picture you shared above: could you try to write the options like so instead of passing them with **options? Could you share which options you are using exactly?

df.write \
  .format("com.microsoft.sqlserver.jdbc.spark") \
  .mode("overwrite") \
  .option("url", url) \
  .option("dbtable", table_name) \
  .option("user", username) \
  .option("password", password) \
  .save()

matteofaz commented 1 year ago

Hi, I am having the same issue. I'm able to read from the database and write some small (dummy) dataframes, but when I try to write a bigger dataframe it gives the same error as mentioned above: "connection closed".

DBR 10.4, connector version: 1.2.0 (latest)

Have you guys found something that can help?

matteofaz commented 1 year ago

@copzyz123 @moredatapls @rimelis @akhilkoesis it turned out I am having the issue while inserting timestamps into SQL.

Let me show: this is a working query (writing to the DB): [screenshot]

If I replace current_timestamp with source.field_name (which is itself a timestamp column): [screenshot]

[screenshot]

then the query crashes with the following error: [screenshot]

How could this be possible? I am comparing the timestamp formats but can't see any differences.

Are you guys also using timestamp columns? If yes, can you try the same?

(THE SAME CODE WITH THE OLD JDBC CONNECTOR WORKS!!)

luxu1-ms commented 1 year ago

@matteofaz The issue is because of the datetime datatype. The SQL Server datetime data type only allows 3 fractional-second digits, while a Spark dataframe might have more. Please try to use datetime2 for the SQL column instead of datetime. We do have a fix which has not yet been taken by Spark: https://github.com/apache/spark/pull/32655
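When the SQL column type cannot be changed to datetime2, another possible workaround is to truncate the fractional seconds to millisecond precision before writing. The idea can be sketched in plain Python (the function name is hypothetical; in a Spark dataframe a similar effect could presumably be achieved with date_trunc("millisecond", col) before the write):

```python
from datetime import datetime

def truncate_to_millis(ts: datetime) -> datetime:
    # SQL Server's datetime type keeps at most 3 fractional-second digits,
    # so drop the sub-millisecond part that Spark timestamps may carry
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)

# A timestamp with full microsecond precision, as Spark may produce
ts = datetime(2022, 6, 1, 12, 30, 45, 123456)
print(truncate_to_millis(ts))  # 2022-06-01 12:30:45.123000
```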

matteofaz commented 1 year ago

resolved, thanks!!