aalkilani / spark-kafka-cassandra-applying-lambda-architecture


Facing a heap issue in BatchJob.scala script #34

Open nolwenbrossonapps opened 4 years ago

nolwenbrossonapps commented 4 years ago

Hello,

I am running into a heap memory issue in this part of the course. The only workaround I found was downgrading to Spark 1.5, but that introduced other errors...

JDK: 1.8
Scala: 2.11.7
Spark: 1.6

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/04/30 17:06:54 INFO SparkContext: Running Spark version 1.6.0
20/04/30 17:06:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/04/30 17:06:55 INFO SecurityManager: Changing view acls to: Nolwen.Brosson
20/04/30 17:06:55 INFO SecurityManager: Changing modify acls to: Nolwen.Brosson
20/04/30 17:06:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Nolwen.Brosson); users with modify permissions: Set(Nolwen.Brosson)
20/04/30 17:06:56 INFO Utils: Successfully started service 'sparkDriver' on port 57698.
20/04/30 17:06:57 INFO Slf4jLogger: Slf4jLogger started
20/04/30 17:06:57 INFO Remoting: Starting remoting
20/04/30 17:06:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.1.5.175:57711]
20/04/30 17:06:57 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 57711.
20/04/30 17:06:57 INFO SparkEnv: Registering MapOutputTracker
20/04/30 17:06:57 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 4.718592E8. Please use a larger heap size.
    at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
    at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
    at batch.BatchJob$.main(BatchJob.scala:23)
    at batch.BatchJob.main(BatchJob.scala)
20/04/30 17:06:57 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.IllegalArgumentException: System memory 259522560 must be at least 4.718592E8. Please use a larger heap size.
    at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
    at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
    at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
    at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
    at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
    at batch.BatchJob$.main(BatchJob.scala:23)
    at batch.BatchJob.main(BatchJob.scala)
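
From the stack trace, the check happens in UnifiedMemoryManager.getMaxMemory: Spark 1.6 requires the driver JVM's maximum heap to be at least 4.718592E8 bytes (450 MiB), while the JVM here reports 259522560 bytes (about 247 MiB, i.e. a default heap around -Xmx256m). One known workaround is to override the value Spark reads through the spark.testing.memory property. A minimal sketch, assuming BatchJob.scala builds a local-mode SparkConf (the app name and master below are assumptions, not the course code):

import org.apache.spark.{SparkConf, SparkContext}

object BatchJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("Lambda Batch Job") // assumed name, not from the course code
      .setMaster("local[*]")          // assumed: run in local mode from the IDE
      // Spark 1.6's UnifiedMemoryManager reads this property instead of
      // Runtime.getRuntime.maxMemory when it is set, so the 471859200-byte
      // minimum-heap check passes even if the IDE starts a small JVM.
      .set("spark.testing.memory", "2147480000") // ~2 GiB, illustrative value

    val sc = new SparkContext(conf)
    // ... batch logic from the course goes here ...
    sc.stop()
  }
}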

I already tried different configurations, for example setting JAVA_OPTS=-Xms128m -Xmx512m in my system environment variables.

In IntelliJ: [screenshot of the run configuration omitted]

But nothing works... Any idea what else I could try?

Can you confirm that this code is not meant to be run inside the virtual machine we set up at the beginning of the course? Also, can you confirm that a manual installation of Spark on my computer is not necessary at all?

nolwenbrossonapps commented 4 years ago

I finally found a solution:

Add -Xms2g -Xmx4g to the VM options directly in the IntelliJ Scala Console run configuration.

That's the only thing that worked for me. It may be helpful for other people!
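
For what it's worth, JAVA_OPTS in the system environment is read by some launch scripts but not by IntelliJ run configurations, which is presumably why that earlier attempt had no effect. The heap has to be raised in the run configuration's VM options, via the spark.testing.memory workaround sketched above, or with --driver-memory when the job is launched through spark-submit rather than the IDE (setting spark.driver.memory in code has no effect in local mode, since the driver JVM is already running by then).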