microsoft / sql-spark-connector

Apache Spark Connector for SQL Server and Azure SQL
Apache License 2.0

java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver #234

Open BertrandBrelier opened 1 year ago

BertrandBrelier commented 1 year ago

To whom it may concern,

Using the version spark-mssql-connector-1.3.0 with Spark 3.3.2 and Python 3.7.16, I added the spark-mssql-connector-1.3.0.jar to my /usr/lib/spark/jars/ directory.

I also added the following options to the Spark configuration:

    ('spark.driver.extraClassPath', '/usr/lib/spark/jars/spark-mssql-connector-1.3.0.jar'),
    ('spark.executor.extraClassPath', '/usr/lib/spark/jars/spark-mssql-connector-1.3.0.jar'),
    ("spark.jars", '/usr/lib/spark/jars/spark-mssql-connector-1.3.0.jar')

However, reading data from an MSSQL database fails:

    df = spark.read \
        .format("jdbc") \
        .option("url", "jdbc:sqlserver://XXXXXXX:1433;databaseName=XXXXX;") \
        .option("user", username) \
        .option("password", password) \
        .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
        .option("dbtable", "XXXXX") \
        .load()

Py4JJavaError: An error occurred while calling o391.load. : java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
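A `ClassNotFoundException` for `com.microsoft.sqlserver.jdbc.SQLServerDriver` means that none of the jars on the classpath provided that class. One possible explanation, not confirmed in this thread, is that the connector jar does not itself contain the JDBC driver. A minimal sketch for checking this with the Python standard library follows; the helper name `jar_contains_class` is hypothetical and is not part of Spark or the connector.

```python
import zipfile

# Fully qualified class name taken from the error message above.
DRIVER_CLASS = "com.microsoft.sqlserver.jdbc.SQLServerDriver"


def jar_contains_class(jar_path, class_name=DRIVER_CLASS):
    """Return True if the jar at jar_path contains the compiled class.

    Hypothetical helper: a jar is a zip archive, so the class
    a.b.C is stored under the entry name a/b/C.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Calling `jar_contains_class("/usr/lib/spark/jars/spark-mssql-connector-1.3.0.jar")` on the machine in question would show whether the driver class ships inside that jar. If it returns `False`, the class would have to come from another jar on the classpath.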

I also tried the same method provided in the example:

    jdbcDF = spark.read \
        .format("com.microsoft.sqlserver.jdbc.spark") \
        .option("url", url) \
        .option("dbtable", table_name) \
        .option("user", username) \
        .option("password", password) \
        .load()

Py4JJavaError: An error occurred while calling o383.load. : java.sql.SQLException: No suitable driver

Could you please help me set up spark-mssql-connector-1.3.0 with Spark 3.3.2?

Thank you for your help,

BertrandBrelier commented 1 year ago

Hello everybody,

I was able to load a table from a Microsoft SQL Server database into PySpark (Spark 3.3.2) using this code:

```python
import findspark
findspark.init()

import pyspark
from pyspark.sql.session import SparkSession

# Let Spark resolve the connector package from Maven at startup
spark = SparkSession.builder.appName("FirstRun") \
    .config("spark.jars.packages", "com.microsoft.azure:spark-mssql-connector_2.12:1.2.0") \
    .getOrCreate()

server_name = "jdbc:sqlserver://NAMEOFSQLSERVER:1433"
database_name = "NAMEOFDATABASE"
url = server_name + ";" + "databaseName=" + database_name + ";"
table_name = "dbo.NAMEOFTABLE"

# username and password are assumed to be defined elsewhere
jdbcDF = spark.read \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .option("url", url) \
    .option("dbtable", table_name) \
    .option("user", username) \
    .option("password", password) \
    .load()
```
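The URL in the code above is assembled by string concatenation. As a small illustration, the same string can be produced by a helper; `build_sqlserver_url` is a hypothetical name introduced here, and the default port 1433 is simply the one used in this thread.

```python
def build_sqlserver_url(host, database, port=1433):
    """Assemble a SQL Server JDBC URL in the form used in this thread.

    Hypothetical helper: it only formats the string and does not
    validate the host, the port, or the database name.
    """
    return f"jdbc:sqlserver://{host}:{port};databaseName={database};"


# Same placeholders as in the working example above
url = build_sqlserver_url("NAMEOFSQLSERVER", "NAMEOFDATABASE")
```

The resulting `url` can be passed to `.option("url", url)` exactly as in the working example.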

Thank you,