Closed: moredatapls closed this pull request 1 year ago
If anyone finds this PR and wants support for Spark 3.3.0: head over to https://github.com/solytic/sql-spark-connector/releases/tag/v1.4.0 and use the build that we created at Solytic, since Microsoft does not seem to be very active here.
Even after the fix, I am facing the issue below with Spark 3.3 and Scala 2.12.15. I included the dependent libraries in build.sbt and am running on a Databricks Runtime 11.3 LTS cluster.
java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(Ljava/sql/ResultSet;Lorg/apache/spark/sql/jdbc/JdbcDialect;Z)Lorg/apache/spark/sql/types/StructType;
    at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.matchSchemas(BulkCopyUtils.scala:306)
    at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.getColMetaData(BulkCopyUtils.scala:267)
    at com.microsoft.sqlserver.jdbc.spark.Connector.write(Connector.scala:79)
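For context, this NoSuchMethodError typically means the connector jar was compiled against the Spark 3.2 API but is running on Spark 3.3: if I read the Spark sources right, Spark 3.3.0 added a fourth isTimestampNTZ parameter to JdbcUtils.getSchema, so the three-argument method in the descriptor above no longer exists at runtime. A quick sketch to check which overloads a given cluster actually provides (this pokes at Spark-internal names, which can change between releases):

import org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils

// List the getSchema overloads available on this cluster's Spark runtime.
// On Spark 3.2 the main variant takes (ResultSet, JdbcDialect, Boolean);
// on Spark 3.3 it gained an extra Boolean (isTimestampNTZ), which is why a
// jar compiled against 3.2 fails with NoSuchMethodError on 3.3.
JdbcUtils.getClass.getDeclaredMethods
  .filter(_.getName == "getSchema")
  .foreach(println)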
Provided in build.sbt:

name := "spark-mssql-connector"

organization := "com.microsoft.sqlserver.jdbc.spark"

version := "1.0.0"

scalaVersion := "2.12.15"

val sparkVersion = "3.3.0"

javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint")

initialize := {
  val _ = initialize.value
  val javaVersion = sys.props("java.specification.version")
  if (javaVersion != "1.8")
    sys.error("Java 1.8 is required for this project. Found " + javaVersion + " instead")
}

scalacOptions := Seq("-deprecation", "-unchecked", "-Dscalac.patmat.analysisBudget=1024", "-Xfuture")

libraryDependencies ++= Seq(
  "com.microsoft.sqlserver" % "mssql-jdbc" % "8.4.1.jre8",
  "org.apache.spark" %% "spark-parent" % "3.3.0" % "provided",
  "org.scala-lang.modules" %% "scala-parser-combinators" % "1.1.2",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-yarn" % sparkVersion,
  "org.scala-lang" % "scala-library" % "2.12.11" % "test",
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-catalyst" % sparkVersion % "provided",
  "org.scalactic" %% "scalactic" % "3.2.6" % "test",
  "org.scalatest" %% "scalatest" % "3.2.6" % "test",
  "com.novocode" % "junit-interface" % "0.11"
)

scalacOptions := Seq("-unchecked", "-deprecation", "evicted")

assemblyJarName in assembly := s"${name.value}_${scalaBinaryVersion.value}-${sparkVersion}_${version.value}.jar"

// Exclude scala-library from this fat jar. The scala library is already there in the spark package.
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _ => MergeStrategy.first
}
Am I missing anything?
@arihar268 the project only contains a pom.xml, not a build.sbt. Where did you find this file?
If you want to upgrade the Spark and Scala versions, you need to do it there. See also the changed file in the PR: https://github.com/microsoft/sql-spark-connector/pull/197/files#diff-9c5fb3d1b7e3b0f54bc5c4182965c4fe1f9023d449017cece3005d3f90e8e4d8
@moredatapls thanks for the PR. Please consider splitting these as two PRs: one for the JDBC connection change,
Also, did you run the regression tests on Spark 3.3.0 with this? If so, could you please add the test results here? Thanks again for the work.
@luxu-ms as an FYI.
@moredatapls, I am using the same code that is in the PR. I have also built the jar with pom.xml and am facing the same issue in Databricks with Runtime 11.3 LTS.
I can see the same error that @arihar268 is seeing when using the build provided over at Solytic:
java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(Ljava/sql/ResultSet;Lorg/apache/spark/sql/jdbc/JdbcDialect;Z)Lorg/apache/spark/sql/types/StructType;
It only happens when using truncate as the write mode (a minimal repro sketch follows below).
Could this be because StructType is no longer imported in src/main/scala/com/microsoft/sqlserver/jdbc/spark/utils/BulkCopyUtils.scala?
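For anyone trying to reproduce this, here is a minimal sketch of a write that exercises the failing path (url, table, and credentials are placeholders, not values from this thread):

// df is any existing DataFrame. With SaveMode.Overwrite plus truncate=true
// the connector keeps the existing table and compares schemas via
// BulkCopyUtils.matchSchemas, which (per the stack trace above) is where
// JdbcUtils.getSchema is reached.
df.write
  .format("com.microsoft.sqlserver.jdbc.spark")
  .mode("overwrite")
  .option("url", "jdbc:sqlserver://<server>:1433;databaseName=<db>")
  .option("dbtable", "dbo.MyTable")
  .option("truncate", "true")
  .option("user", "<user>")
  .option("password", "<password>")
  .save()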
Is there any update on this? It is also blocking me and my colleagues from moving to DBR 11.3.
It's been 7 months since Spark 3.3 was released. @shivsood @luxu1-ms Any update on getting this PR merged?
Please! This is mandatory for DBR 11.3 and Unity Catalog! Thank you for your work!
Any update on this?
@luxu1-ms @shivsood please check again
@moredatapls Sorry, I closed this PR by accident. Thank you so much for the contribution and the updates. I reviewed this PR and left one comment. Please let me know if you have any opinions; then this PR will be good to go! We plan to have a new release very soon.
Please ensure the tests run and attach the results to the PR.
@shivsood what do you mean by this? Doesn't one of you need to trigger the tests in CI? I can't do it.
Merging. Tests pass per @luxu1-ms.
And when do we expect to see a final version on Maven? https://mvnrepository.com/artifact/com.microsoft.azure/spark-mssql-connector_2.12
But I just get an error that it does not exist in Databricks. Currently I only see a 1.3.0-BETA version, and only that BETA version works from Maven so far.
Is there any update on the final release of 1.3? I am not even considering using a beta version for production.
Replaces JdbcUtils.getConnectionFactory with its own implementation. Fixes #191
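As a rough illustration of that change, a connector-owned factory can build the Connection directly with DriverManager instead of going through Spark's internal helper, whose signature has shifted between releases. This is a sketch under assumed parameters, not the PR's exact code:

import java.sql.{Connection, DriverManager}
import java.util.Properties

// Sketch: create the JDBC connection ourselves so the connector no longer
// depends on Spark's internal JdbcUtils factory. The url/user/password
// parameters here are illustrative.
object OwnConnectionFactory {
  def createConnection(url: String, user: String, password: String): Connection = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    DriverManager.getConnection(url, props)
  }
}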