apache / datafusion-comet

Apache DataFusion Comet Spark Accelerator
https://datafusion.apache.org/comet
Apache License 2.0

Running Spark Shell with Comet throws Exception #872

Open radhikabajaj123 opened 2 months ago

radhikabajaj123 commented 2 months ago

Hello,

I am trying to run the spark shell with comet enabled following the configurations specified at https://datafusion.apache.org/comet/user-guide/installation.html#installing-datafusion-comet, after creating a local build.

I was previously able to launch the Spark shell successfully after cloning the Comet project; however, it now gives this exception:

    at org.apache.spark.SparkContext.addLocalJarFile$1(SparkContext.scala:1968)
    at org.apache.spark.SparkContext.addJar(SparkContext.scala:2023)
    at org.apache.spark.SparkContext.$anonfun$new$12(SparkContext.scala:507)
    at org.apache.spark.SparkContext.$anonfun$new$12$adapted(SparkContext.scala:507)
    at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
    at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:507)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2740)
    at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:1026)
    at scala.Option.getOrElse(Option.scala:201)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1020)
    at org.apache.spark.repl.Main$.createSparkSession(Main.scala:114)
    at $line3.$read$$iw.<init>(<console>:5)
    at $line3.$read.<init>(<console>:4)
    at $line3.$read$.<clinit>(<console>:1)
    at $line3.$eval$.$print$lzycompute(<synthetic>:6)
    at $line3.$eval$.$print(<synthetic>:5)
    at $line3.$eval.$print(<synthetic>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:670)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
    at scala.tools.nsc.interpreter.IMain.$anonfun$doInterpret$1(IMain.scala:506)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:43)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:505)
    at scala.tools.nsc.interpreter.IMain.$anonfun$doInterpret$3(IMain.scala:519)
    at scala.tools.nsc.interpreter.IMain.doInterpret(IMain.scala:519)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:503)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:501)
    at scala.tools.nsc.interpreter.IMain.$anonfun$quietRun$1(IMain.scala:216)
    at scala.tools.nsc.interpreter.shell.ReplReporterImpl.withoutPrintingResults(Reporter.scala:64)
    at scala.tools.nsc.interpreter.IMain.quietRun(IMain.scala:216)
    at scala.tools.nsc.interpreter.shell.ILoop.$anonfun$interpretPreamble$1(ILoop.scala:924)
    at scala.collection.immutable.List.foreach(List.scala:333)
    at scala.tools.nsc.interpreter.shell.ILoop.interpretPreamble(ILoop.scala:924)
    at scala.tools.nsc.interpreter.shell.ILoop.$anonfun$run$3(ILoop.scala:963)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.tools.nsc.interpreter.shell.ILoop.echoOff(ILoop.scala:90)
    at scala.tools.nsc.interpreter.shell.ILoop.$anonfun$run$2(ILoop.scala:963)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.tools.nsc.interpreter.IMain.withSuppressedSettings(IMain.scala:1420)
    at scala.tools.nsc.interpreter.shell.ILoop.$anonfun$run$1(ILoop.scala:954)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
    at scala.tools.nsc.interpreter.shell.ReplReporterImpl.withoutPrintingResults(Reporter.scala:64)
    at scala.tools.nsc.interpreter.shell.ILoop.run(ILoop.scala:954)
    at org.apache.spark.repl.Main$.doMain(Main.scala:84)
    at org.apache.spark.repl.Main$.main(Main.scala:59)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1020)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1111)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1120)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

    24/08/26 17:08:39 ERROR SparkContext: Error initializing SparkContext.
    java.lang.ClassNotFoundException: org.apache.spark.CometPlugin
    at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:75)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
    at org.apache.spark.util.Utils$.$anonfun$loadExtensions$1(Utils.scala:2946)
    at scala.collection.StrictOptimizedIterableOps.flatMap(StrictOptimizedIterableOps.scala:118)
    at scala.collection.StrictOptimizedIterableOps.flatMap$(StrictOptimizedIterableOps.scala:105)
    at scala.collection.immutable.ArraySeq.flatMap(ArraySeq.scala:35)
    at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2944)
    at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:207)
    at org.apache.spark.internal.plugin.PluginContainer$.apply(PluginContainer.scala:193)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:565)
    at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2740)
    (remaining frames identical to the stack trace above)

    24/08/26 17:08:39 INFO SparkContext: SparkContext is stopping with exitCode 0.
    24/08/26 17:08:39 INFO SparkUI: Stopped Spark web UI at http://n-chafqtlvmeh6ad1j7j7f3.workdaysuv.com:4040
    24/08/26 17:08:39 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
    24/08/26 17:08:39 INFO MemoryStore: MemoryStore cleared
    24/08/26 17:08:39 INFO BlockManager: BlockManager stopped
    24/08/26 17:08:39 INFO BlockManagerMaster: BlockManagerMaster stopped
    24/08/26 17:08:39 WARN MetricsSystem: Stopping a MetricsSystem that is not running
    24/08/26 17:08:39 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
    24/08/26 17:08:39 INFO SparkContext: Successfully stopped SparkContext
    24/08/26 17:08:39 ERROR Main: Failed to initialize Spark session.
    java.lang.ClassNotFoundException: org.apache.spark.CometPlugin
    (stack trace identical to the one above)

nitin-kalyankar25 commented 2 months ago

@radhikabajaj123 : I recently got this setup working successfully. Could you list the steps you followed so we can narrow down the issue?

radhikabajaj123 commented 2 months ago

Hi @nitin-kalyankar25 ,

I followed the steps on this page: https://datafusion.apache.org/comet/user-guide/installation.html#building-from-source (the same steps worked previously, but they no longer do).

  1. git clone https://github.com/apache/datafusion-comet.git
  2. Installed rustup as listed on this page: https://datafusion.apache.org/comet/contributor-guide/development.html#:~:text=Install%20Rust%20toolchain.%20The%20easiest%20way%20is%20to%20use%20rustup.
  3. Created a build: make release PROFILES="-Pspark-3.4 -Pscala-2.13"
  4. Specified path to $COMET_JAR and $SPARK_HOME
  5. Attempted to launch the shell and received the above error:

    $SPARK_HOME/bin/spark-shell \
    --jars $COMET_JAR \
    --conf spark.driver.extraClassPath=$COMET_JAR \
    --conf spark.executor.extraClassPath=$COMET_JAR \
    --conf spark.plugins=org.apache.spark.CometPlugin \
    --conf spark.comet.enabled=true \
    --conf spark.comet.exec.enabled=true \
    --conf spark.comet.explainFallback.enabled=true \
    --conf spark.driver.memory=1g \
    --conf spark.executor.memory=1g
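
For reference, a minimal sanity check (a sketch; the paths are whatever was exported in step 4) to confirm both variables resolve before launching:

    # Both should print non-empty paths that exist on the local filesystem
    echo "SPARK_HOME=$SPARK_HOME"
    echo "COMET_JAR=$COMET_JAR"
    # COMET_JAR must point at a concrete JAR file, not an unexpanded glob
    ls -l "$COMET_JAR"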
nitin-kalyankar25 commented 2 months ago

@radhikabajaj123 : As your error indicates (java.lang.ClassNotFoundException: org.apache.spark.CometPlugin while initializing the Spark session), the CometPlugin class is not being found. This typically happens when the JAR containing the class is either missing or not properly referenced.
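
For reference, one quick way to confirm whether the plugin class is actually inside the JAR being passed (a sketch, assuming $COMET_JAR is set as in the report above):

    # List the JAR contents and search for the Comet plugin class;
    # no output here means the JAR does not contain org.apache.spark.CometPlugin
    jar tf "$COMET_JAR" | grep CometPlugin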

  1. Check the JAR path: ensure that you are passing the correct path to the JAR file; the JAR name should match *-SNAPSHOT.jar (the jar tf check above can confirm the class is present).
  2. Clean existing builds: delete any existing JAR that may have been only partially created.
  3. Build with profiles: rebuild the project with

    make release PROFILES="-Pspark-3.4 -Pscala-2.13"

    The command below worked fine for me:

    export COMET_JAR=<*-SNAPSHOT.jar>
    ./spark-shell \
    --jars $COMET_JAR \
    --conf spark.driver.extraClassPath=$COMET_JAR \
    --conf spark.executor.extraClassPath=$COMET_JAR \
    --conf spark.plugins=org.apache.spark.CometPlugin \
    --conf spark.comet.enabled=true \
    --conf spark.comet.exec.enabled=true \
    --conf spark.comet.explainFallback.enabled=true \
    --conf spark.comet.exec.shuffle.mode=jvm \
    --conf spark.executor.memory=1g \
    --conf spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager \
    --conf spark.comet.exec.shuffle.enabled=true
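
If it helps to locate the artifact: after make release the Comet JAR is typically produced under spark/target/. The exact file name depends on the Spark/Scala profiles, so treat this as a sketch:

    # Find the freshly built Comet JAR; the glob pattern is an assumption,
    # adjust it to whatever your build actually produced
    ls spark/target/*-SNAPSHOT.jar
    export COMET_JAR=$(ls spark/target/*-SNAPSHOT.jar | head -n 1)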
radhikabajaj123 commented 2 months ago

@nitin-kalyankar25 Deleting the datafusion-comet project and then repeating the earlier steps should take care of cleaning existing builds (step 2), right?

nitin-kalyankar25 commented 2 months ago

@radhikabajaj123 : Not exactly. Try deleting the JAR files from the /spark/target/ directory, where the JAR is created after running the build command.
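
For example (a sketch; the exact file names depend on your build profiles):

    # Remove previously built Comet JARs so the next build starts clean
    rm -f spark/target/*-SNAPSHOT.jar
    # Or, more thoroughly, clean all Maven build output
    # (assumes the repo's Maven wrapper is present; use plain `mvn clean` otherwise)
    ./mvnw clean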

radhikabajaj123 commented 2 months ago

@nitin-kalyankar25 Hmmm, but when I delete the datafusion-comet project and clone it again, it doesn't contain any /spark/target directory.

It's only when I run make release PROFILES="-Pspark-3.4 -Pscala-2.13" again that it rebuilds the JARs and creates the target directory.

The path to the JAR is also correct.

It now gives me this error:

    Error: A JNI error has occurred, please check your installation and try again
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.privateGetMethodRecursive(Class.java:3048)
    at java.lang.Class.getMethod0(Class.java:3018)
    at java.lang.Class.getMethod(Class.java:1784)
    at sun.launcher.LauncherHelper.validateMainClass(LauncherHelper.java:544)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:526)
    Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 7 more
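
As an aside, NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream generally means the Hadoop client classes are missing from the classpath entirely, which is common with the "Hadoop free" Spark downloads. If that applies here, the fix suggested in the Spark documentation is to point Spark at a local Hadoop installation:

    # Requires a local Hadoop install with the `hadoop` CLI on PATH
    export SPARK_DIST_CLASSPATH="$(hadoop classpath)"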

nitin-kalyankar25 commented 2 months ago

@radhikabajaj123 : Which Spark shell and Scala version are you using in your Spark setup?

radhikabajaj123 commented 2 months ago

@nitin-kalyankar25 Spark 3.4.3 and Scala 2.13
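
Worth noting: the pre-built Spark 3.4.3 binaries default to Scala 2.12, with a separate Scala 2.13 download, and a Comet JAR built with -Pscala-2.13 will only work with the latter. A quick check:

    # Prints the Scala version this Spark distribution was built with;
    # it must match the -Pscala-2.13 profile used for the Comet build
    "$SPARK_HOME"/bin/spark-shell --version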

nblagodarnyi commented 2 months ago

I faced the same issue when I tried to set HDFS locations for the extra classpaths, e.g. spark.[driver|executor].extraClassPath=hdfs:///foo/bar/comet.jar. As far as I understand, this setting only supports local files, and spark-submit silently ignores errors in it (such as missing JARs, or a comma-separated list where a semicolon-separated one is expected). Also note that this setting overrides the corresponding entries in conf/spark-defaults.conf. This is a possible cause of the FSDataInputStream issue.
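
A hedged workaround consistent with this observation (paths illustrative, reusing the hdfs:///foo/bar/comet.jar example above): copy the JAR out of HDFS and reference only local paths in the classpath settings:

    # extraClassPath entries must be local filesystem paths, so fetch the JAR first
    hdfs dfs -get hdfs:///foo/bar/comet.jar /tmp/comet.jar

    $SPARK_HOME/bin/spark-shell \
      --jars /tmp/comet.jar \
      --conf spark.driver.extraClassPath=/tmp/comet.jar \
      --conf spark.executor.extraClassPath=/tmp/comet.jar \
      --conf spark.plugins=org.apache.spark.CometPlugin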