Azure / spark-cdm-connector

MIT License

Databricks Spark 3.x "doesn't work" / java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport #92

Closed pavanreddy88 closed 2 years ago

pavanreddy88 commented 2 years ago

Hi @asksparkcdm@microsoft.com, I am from LinkedIn and we are having a compatibility issue with spark-cdm-connector. For context, I have CDM data in ADLS that I'm trying to read into Databricks 9.1 LTS (Apache Spark 3.1.2, Scala 2.12). I have installed com.microsoft.azure:spark-cdm-connector:0.19.1 and org.neo4j:neo4j-connector-apache-spark_2.12:4.1.2_for_spark_3. I have also tried multiple other versions of neo4j-connector and spark-cdm-connector, and it still throws this error: java.lang.NoClassDefFoundError: org/apache/spark/sql/sources/v2/ReadSupport.

Can you please tell me which versions of spark-cdm-connector and neo4j-connector are compatible with Databricks 9.1 LTS (Apache Spark 3.1.2, Scala 2.12), out of the versions available below?

spark-cdm-connector versions available in Databricks: [screenshot: CDM-connector]

neo4j-connector versions available in Databricks: [screenshot: Neo4j]

Error context: [screenshot: Screen Shot 2022-05-12 at 4 23 41 PM]

Pavan

bit007 commented 2 years ago

I am also facing a similar issue while using 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12).

kcheeeung commented 2 years ago

Jar version 0.19.1 is built for Spark 2.4, but you are using Spark 3, which is why you get the NoClassDefFoundError. The class ReadSupport exists only in Spark 2.4.x.

| CDM Version | Spark Version |
| ----------- | ------------- |
| 0.x.x       | 2.4.x         |
| 1.x.x       | 3.1.x         |
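As a rough illustration of how the table above could be checked before installing a jar, here is a minimal Python sketch. The helper name `connector_line_for` and the dictionary are hypothetical and are not part of the connector; the mapping only encodes the two rows listed in this thread, so any other Spark version is reported as not listed.

```python
import re

# Mapping copied from the compatibility table in this thread.
# Keys are (major, minor) Spark versions; values are connector release lines.
CONNECTOR_LINE_BY_SPARK = {
    (2, 4): "0.x.x",
    (3, 1): "1.x.x",
}


def connector_line_for(spark_version):
    """Return the connector release line for a Spark version string.

    Returns None when the table in this thread does not list that Spark version.
    """
    match = re.match(r"(\d+)\.(\d+)", spark_version)
    if match is None:
        raise ValueError(f"unrecognized Spark version: {spark_version!r}")
    key = (int(match.group(1)), int(match.group(2)))
    return CONNECTOR_LINE_BY_SPARK.get(key)
```

With this sketch, `connector_line_for("3.1.2")` falls under the 1.x.x line, while a 0.19.1 jar belongs to the 0.x.x line for Spark 2.4.x. A version such as "3.2.1" is not in the table, so the helper returns None rather than guessing.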

If you want to use Databricks, you need to build the jar yourself. Even then, only app registration works; credential passthrough does not. We haven't heard of any workaround for credential passthrough on Databricks, so the code is open sourced for contributions.
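For readers unsure what reading with app registration looks like, the following is a hedged sketch of the reader options. The helper `cdm_read_options` is hypothetical. The option names (`storage`, `manifestPath`, `entity`, `appId`, `appKey`, `tenantId`) and the format name `com.microsoft.cdm` are taken from my recollection of the connector's README and should be verified against the version you build. All argument values are placeholders.

```python
def cdm_read_options(storage, manifest_path, entity, app_id, app_key, tenant_id):
    """Build reader options for app-registration (service principal) auth.

    Option names are assumed from the connector README; verify them
    against the jar you actually build.
    """
    return {
        "storage": storage,             # ADLS account host name (placeholder)
        "manifestPath": manifest_path,  # container and path to the manifest file
        "entity": entity,               # entity name inside the manifest
        "appId": app_id,                # app registration (client) ID
        "appKey": app_key,              # app registration secret
        "tenantId": tenant_id,          # Azure AD tenant ID
    }


# Intended use inside a Spark session (not executed here):
#   opts = cdm_read_options(...)
#   df = spark.read.format("com.microsoft.cdm").options(**opts).load()
```

Keeping the options in one dictionary makes it easy to load the secret from a secret scope instead of hard-coding it in a notebook.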

kecheung commented 1 year ago

Added to pinned issues https://github.com/Azure/spark-cdm-connector/issues/118