microsoft / sql-spark-connector

Apache Spark Connector for SQL Server and Azure SQL
Apache License 2.0

User class threw exception: java.lang.NoSuchMethodError: org.apache.spark.sql.jdbc.JdbcDialect.createConnectionFactory #229

Open 2MD opened 1 year ago

2MD commented 1 year ago

Hello! I am using Spark 3.2.1 with spark-mssql-connector version 1.3.0-BETA.

I have an MsSqlSparkConnector class with this code:

```scala
val mapOptions = Map(
  "user"               -> config.username,
  "password"           -> config.password,
  "driver"             -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  "url"                -> "jdbc:sqlserver://**:1000;database=",
  "truncate"           -> "true",
  "batchsize"          -> "10000",
  "reliabilityLevel"   -> "BEST_EFFORT",
  "dbtable"            -> "[stg].[sm_bonusmachine_user_type_raw]",
  "tableLock"          -> "true",
  "schemaCheckEnabled" -> "false"
)
```

```scala
df.write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .options(mapOptions)
  .mode(SaveMode.Overwrite)
  .save()
```

I get this error:

```
23/06/28 08:27:48 ERROR ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: org.apache.spark.sql.jdbc.JdbcDialect.createConnectionFactory(Lorg/apache/spark/sql/execution/datasources/jdbc/JDBCOptions;)Lscala/Function1;
java.lang.NoSuchMethodError: org.apache.spark.sql.jdbc.JdbcDialect.createConnectionFactory(Lorg/apache/spark/sql/execution/datasources/jdbc/JDBCOptions;)Lscala/Function1;
    at com.microsoft.sqlserver.jdbc.spark.utils.JdbcUtils$.createConnection(JdbcUtils.scala:16)
    at com.microsoft.sqlserver.jdbc.spark.DefaultSource.createRelation(DefaultSource.scala:56)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:110)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:110)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:106)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:106)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:93)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:91)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:128)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:848)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:382)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:355)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:247)
    at ru.samokat.sdp.common.batch.connector.jdbc.JdbcWriter.writeInternal(JdbcWriter.scala:59)
    at ru.samokat.sdp.common.batch.connector.jdbc.JdbcWriter.writeInternal$(JdbcWriter.scala:57)
    at ru.samokat.sdp.common.batch.connector.jdbc.MsSqlSparkConnector.writeInternal(MsSqlSparkConnector.scala:13)
    at ru.samokat.sdp.common.batch.connector.jdbc.MsSqlSparkConnector.write(MsSqlSparkConnector.scala:32)
    at ru.samokat.sdp.common.batch.connector.jdbc.CommonReaderWriter.overwrite(CommonReaderWriter.scala:16)
    at ru.samokat.sdp.common.batch.connector.jdbc.CommonReaderWriter.overwrite$(CommonReaderWriter.scala:15)
    at ru.samokat.sdp.common.batch.connector.jdbc.MsSqlSparkConnector.overwrite(MsSqlSparkConnector.scala:13)
    at ru.samokat.sdp.rawods.sqlserverdwh.raw.common.dynamic.CommonRun.run(CommonRun.scala:138)
    at ru.samokat.sdp.rawods.sqlserverdwh.raw.common.dynamic.CommonRun.main(CommonRun.scala:24)
    at ru.samokat.sdp.rawods.sqlserverdwh.raw.common.dynamic.PostgresRawApp.main(PostgresRawApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
```

I tried:

```scala
val dialect = JdbcDialects.get("jdbc:sqlserver://*:1000;database=")
dialect.createConnectionFactory
```

but this method doesn't exist in the dialect: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/jdbc/MsSqlServerDialect.scala
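A quick way to see which `createConnectionFactory` overloads are actually on the classpath is to list them via reflection and compare against the signature in the `NoSuchMethodError`. A minimal sketch (the URL is a placeholder, not a real connection string):

```scala
import org.apache.spark.sql.jdbc.JdbcDialects

// Print every createConnectionFactory overload the resolved dialect exposes
// at runtime; an empty result means the Spark version on the classpath does
// not have the signature the connector was compiled against.
val dialect = JdbcDialects.get("jdbc:sqlserver://host:1433;databaseName=db")
dialect.getClass.getMethods
  .filter(_.getName == "createConnectionFactory")
  .foreach(m => println(m.toGenericString))
```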

Malcatrazz commented 1 year ago

Hi there,

I also ran into this problem. The issue is `"truncate" -> "true"`: the driver does not seem to support truncate or append. If you take that option out, the code should work, as shown in the sketch below. Since you are doing an overwrite (`.mode(SaveMode.Overwrite)`) you should be fine anyway: Overwrite drops and recreates the table, so truncate doesn't add anything.
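In other words, the options map from the first post would become something like this (a sketch; the placeholder URL and config values are kept from the original):

```scala
// Same options as before, with the unsupported "truncate" entry removed.
val mapOptions = Map(
  "user"               -> config.username,
  "password"           -> config.password,
  "driver"             -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
  "url"                -> "jdbc:sqlserver://**:1000;database=",
  "batchsize"          -> "10000",
  "reliabilityLevel"   -> "BEST_EFFORT",
  "dbtable"            -> "[stg].[sm_bonusmachine_user_type_raw]",
  "tableLock"          -> "true",
  "schemaCheckEnabled" -> "false"
)
```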

Cheers, Michel

2MD commented 1 year ago

Hi, that only helps with the `"reliabilityLevel" -> "BEST_EFFORT"` strategy. If I use `"reliabilityLevel" -> "NO_DUPLICATES"` instead, I get this error:

```
23/07/03 09:19:13 INFO SingleInstanceConnector: Overwriting without truncate for table '[stg].[sm_bonusmachine_user_type_raw]'
23/07/03 09:19:13 INFO ReliableSingleInstanceStrategy: write : reliable write to single instance called
23/07/03 09:19:13 INFO CodeGenerator: Code generated in 147.031509 ms
23/07/03 09:19:13 ERROR ReliableSingleInstanceStrategy: cleanupStagingTables: Exception while dropping table [##application_167042629452569929[stg].[sm_bonusmachine_user_type_raw]_0] :Incorrect syntax near '_0'.
23/07/03 09:19:13 ERROR BulkCopyUtils: execute update failed with error Incorrect syntax near '_0'.
23/07/03 09:19:13 ERROR ReliableSingleInstanceStrategy: createStagingTables: Exception while creating table [##application_167042629452569929[stg].[sm_bonusmachine_user_type_raw]_0] : Incorrect syntax near '_0'.
23/07/03 09:19:13 ERROR ApplicationMaster: User class threw exception: com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '_0'.
com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '_0'.
```
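Judging from the table name in that log, the NO_DUPLICATES path seems to build its global-temp staging table name by concatenating an application-id prefix, the raw `dbtable` value, and a partition index, so a bracket-quoted `dbtable` yields an invalid identifier. A hypothetical reconstruction (not the connector's actual code, just what the log suggests):

```scala
// Hypothetical: naive concatenation that would produce the name in the log.
val appId   = "application_167042629452569929"
val dbtable = "[stg].[sm_bonusmachine_user_type_raw]"
val staging = s"##$appId$dbtable" + "_0"
// staging == "##application_167042629452569929[stg].[sm_bonusmachine_user_type_raw]_0"
// which is not a valid SQL Server identifier, hence "Incorrect syntax near '_0'".
```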

2MD commented 1 year ago

I think it is similar to https://github.com/microsoft/sql-spark-connector/issues/194

alexey-abrosin commented 1 year ago

@2MD I had the same error as you. To avoid the `Incorrect syntax near '_0'.` error, you need to unwrap the table name in the `dbtable` option:
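Presumably something like this (a sketch based on the table name from the first post):

```scala
// instead of the bracket-quoted name:
"dbtable" -> "[stg].[sm_bonusmachine_user_type_raw]"
// pass the plain schema-qualified name:
"dbtable" -> "stg.sm_bonusmachine_user_type_raw"
```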

2MD commented 1 year ago

We can have a table name like "1_c_name"; that won't be valid without the [].