databricks / LearningSparkV2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
https://learning.oreilly.com/library/view/learning-spark-2nd/9781492050032/
Apache License 2.0

Failed to load main class #64

Closed · guzelcihad closed this issue 3 years ago

guzelcihad commented 3 years ago

I am getting an error and didn't find a solution. I use IntelliJ, sbt 1.4.7, Scala 2.12.10, and Spark 3.0.

I couldn't submit any job locally. An example class I've been working on:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

object Aggregate extends App {

  val spark = SparkSession
    .builder()
    .appName("AuthorAges")
    .getOrCreate()

  val dataDF = spark.createDataFrame(Seq(("Brooke", 20), ("Brooke", 25),
    ("Denny", 31), ("Jules", 30), ("TD", 35))).toDF("name", "age")

  val avgDF = dataDF.groupBy("name").agg(avg("age"))
  avgDF.show()
}
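
(Side note, not part of the original post: the snippet above relies on spark-submit to supply the master URL. If the same object is run straight from IntelliJ or with sbt run, the builder needs one set explicitly; a minimal sketch:)

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("AuthorAges")
  .master("local[*]") // only needed when the app is not launched through spark-submit
  .getOrCreate()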

I use the command: $SPARK_HOME/bin/spark-submit --class main.scala.Aggregate /home/ubuntu/IdeaProjects/Deneme/target/Deneme-1.0-SNAPSHOT.jar

I have tried creating the project with both sbt and Maven, but I get the same error in both cases.

My build.sbt file:

// name of the project
name := "main/scala"
// version of our package
version := "1.0"
// version of Scala
scalaVersion := "2.12.10"
// Spark library dependencies
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.0.0",
  "org.apache.spark" %% "spark-sql"  % "3.0.0"
)
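
(For context, not part of the original post: the sbt name setting only controls the project/artifact name, not the fully qualified class names inside the jar; those come from package declarations in the source files. A slash in the name is legal but unusual. A minimal sketch with a plain, illustrative project name:)

// build.sbt -- a sketch; "deneme" is just an illustrative project name
name := "deneme"
version := "1.0"
scalaVersion := "2.12.10"

// Records the main class for `sbt run` and the jar's manifest (with no package
// declaration in the source, the fully qualified name is just "Aggregate").
Compile / mainClass := Some("Aggregate")

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.0.0",
  "org.apache.spark" %% "spark-sql"  % "3.0.0"
)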

The error:

21/02/05 15:48:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error: Failed to load class main.scala.Aggregate.
log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Can anyone help me?

guzelcihad commented 3 years ago

Specifying the main class like this, --class Aggregate, solved my problem.
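
(A note for anyone landing here: the value passed to --class has to be the fully qualified name declared by the package statement in the source file, not the directory path under src/main/scala. The snippet in the question has no package declaration, so the object lives in the default package and plain --class Aggregate is correct. A minimal sketch of the packaged variant, using an illustrative package name:)

// src/main/scala/myapp/Aggregate.scala -- "myapp" is just an illustrative package name
package myapp

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

object Aggregate extends App {
  val spark = SparkSession.builder().appName("AuthorAges").getOrCreate()
  // ... same aggregation as in the question ...
}

// With the package declaration, the fully qualified name is myapp.Aggregate:
//   $SPARK_HOME/bin/spark-submit --class myapp.Aggregate <path-to-jar>
// Without it (as in the question), it is just Aggregate:
//   $SPARK_HOME/bin/spark-submit --class Aggregate <path-to-jar>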

TitoLulu commented 2 years ago

Hey @guzelcihad, I have run into the same error, and despite specifying the main class as you suggested, it didn't work.

GeetMonster commented 2 years ago

I am having the same issue; I tried that as well and it did not work.

My command is spark-submit --master yarn --num-executors 1 --executor-cores 1 --conf "spark.driver.extraJavaOptions=-Diop.version=4.1.0.0" --class com.package.MainClass Myjar.jar devmanaged dhs-userreview snapshot UPDATE_VERSION_VARIABLE 1.0-17 ""
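
(Not from the thread, but a check that often narrows this down: the name given to --class must match a .class entry that is actually inside the jar. For example, assuming the jar and class names from the command above:

jar tf Myjar.jar | grep MainClass

should print an entry like com/package/MainClass.class; if it is missing or sits under a different path, the --class value or the package declaration in the source needs to change to match.)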

guzelcihad commented 2 years ago

> Hey @guzelcihad, I have run into the same error, and despite specifying the main class as you suggested, it didn't work.

@TitoLulu It's been a long time since I wrote this code. I can't help you right now :(