Azure / spark-cdm-connector

CDM Databricks ERROR java.lang.RuntimeException: Manifest doesn't exist: model.json #73

Open Yeheyesl opened 3 years ago

Yeheyesl commented 3 years ago

```python
readDf = (spark.read.format("com.microsoft.cdm")
    .option("storage", storageAccountName)
    .option("container", container)
    .option("entity", "account")
    .option("manifest", "/model.json")
    .option("appId", appID)
    .option("appKey", appKey)
    .option("tenantId", tenantID)
    .load())

readDf.select("*").show()
```
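For reference, one way to rule out a bad path before blaming the connector is to confirm that model.json actually exists at the container root. The snippet below is a minimal sketch for a Databricks notebook, assuming ADLS Gen2 with direct abfss access already configured for the same service principal; storageAccountName and container are the variables from the read above:

```python
# Sketch: verify model.json is present at the container root.
# Assumes ADLS Gen2 and that abfss access is already configured.
root = f"abfss://{container}@{storageAccountName}.dfs.core.windows.net/"
files = dbutils.fs.ls(root)  # dbutils is available in Databricks notebooks
print([f.name for f in files if f.name.lower().endswith(".json")])
```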

CDM jars: spark_cdm_connector_assembly_0_9.jar and spark_cdm_assembly_0_3_2.jar. Databricks Runtime: 6.4 (includes Apache Spark 2.4.5, Scala 2.11).

I am getting the error below: ERROR java.lang.RuntimeException: Manifest doesn't exist: model.json

srichetar commented 3 years ago

You need to include this jar: https://mvnrepository.com/artifact/com.microsoft.azure/spark-cdm-connector/0.19.1. It seems you are not including the correct jars; spark_cdm_connector_assembly_0_9.jar and spark_cdm_assembly_0_3_2.jar are not the right ones.
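If you want to confirm which connector the driver actually loaded after attaching the library, a minimal sketch is below. It assumes the connector registers a com.microsoft.cdm.DefaultSource class, which is how Spark resolves a format string like "com.microsoft.cdm" to a data source:

```python
# Sketch: check that the connector class resolves on the driver classpath.
# com.microsoft.cdm.DefaultSource is an assumption here; Spark maps the
# format string "com.microsoft.cdm" to a DefaultSource class in that package.
try:
    spark._jvm.java.lang.Class.forName("com.microsoft.cdm.DefaultSource")
    print("spark-cdm-connector is on the driver classpath")
except Exception as e:
    print("connector class not found:", e)
```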

Yeheyesl commented 3 years ago

Thank you. I have included the correct jar, and I am now getting a different error.

CDM jar: spark_cdm_connector_0_19_1.jar. Databricks Runtime: 6.4 (includes Apache Spark 2.4.5, Scala 2.11).

```
SparkException: Job aborted due to stage failure: Task 0 in stage 6.0 failed 4 times, most recent failure: Lost task 0.3 in stage 6.0 (TID 27, 10.139.64.6, executor 1): java.io.InvalidClassException: com.microsoft.cdm.read.CDMInputPartition; local class incompatible: stream classdesc serialVersionUID = -4743836668915229848, local class serialVersionUID = -7896724514767731238
	at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2160)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
	at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
	at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:505)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
```
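An InvalidClassException with mismatched serialVersionUID values typically means the driver and the executors are deserializing the same class from two different jar versions, for example if the old 0.9/0.3.2 assemblies are still attached to the cluster alongside 0.19.1. Detaching the old jars and restarting the cluster is the usual remedy. One way to spot duplicates is to list connector jars on the driver classpath (sketch; assumes a Linux ":" classpath separator, which is what Databricks clusters use):

```python
# Sketch: look for multiple cdm connector jars on the driver classpath.
# Reads the standard JVM system property via py4j.
cp = spark._jvm.java.lang.System.getProperty("java.class.path")
for entry in cp.split(":"):
    if "cdm" in entry.lower():
        print(entry)
```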

srichetar commented 2 years ago

Are you still getting this error? Please confirm.