aws-samples / aws-emr-advisor

EMR Advisor uses Spark event logs to generate insights and cost/runtime recommendations across different deployment options for Amazon EMR
MIT No Attribution

java.util.NoSuchElementException: key not found: TotalGCTime #5

Open liangjun-jiang opened 1 week ago

liangjun-jiang commented 1 week ago

I do wish this advisor worked as advertised. I am seeing the following error when submitting the advisor Spark app to EMR on EC2 to analyze the event log of a SparkPi application (spark-examples_2.12-3.1.1.jar).

Here is the relevant part of the environment (JVM and Spark versions):

"JVM Information": {
        "Java Home": "/usr/lib/jvm/java-11-amazon-corretto.aarch64",
        "Java Version": "11.0.24 (Amazon.com Inc.)",
        "Scala Version": "version 2.12.15"
    },

"spark_version": "3.2.2"

4/11/15 17:59:49 ERROR ReplayListenerBus: Listener EmrSparkListener threw an exception
java.util.NoSuchElementException: key not found: TotalGCTime
    at scala.collection.MapLike.default(MapLike.scala:236)
    at scala.collection.MapLike.default$(MapLike.scala:235)
    at scala.collection.AbstractMap.default(Map.scala:65)
    at scala.collection.MapLike.apply(MapLike.scala:144)
    at scala.collection.MapLike.apply$(MapLike.scala:143)
    at scala.collection.AbstractMap.apply(Map.scala:65)
    at org.apache.spark.executor.ExecutorMetrics.getMetricValue(ExecutorMetrics.scala:41)
    at com.amazonaws.emr.spark.models.metrics.AggExecutorMetrics.update(AggExecutorMetrics.scala:51)
    at com.amazonaws.emr.spark.models.timespan.ExecutorTimeSpan.updateMetrics(ExecutorTimeSpan.scala:25)
    at com.amazonaws.emr.spark.EmrSparkListener.onTaskEnd(EmrSparkListener.scala:230)
    at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:45)
    at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
    at org.apache.spark.scheduler.ReplayListenerBus.doPostEvent(ReplayListenerBus.scala:35)
    at org.apache.spark.scheduler.ReplayListenerBus.doPostEvent(ReplayListenerBus.scala:35)
    at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:117)
    at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:101)
    at org.apache.spark.scheduler.ReplayListenerBus.postToAll(ReplayListenerBus.scala:35)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:89)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:60)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at com.amazonaws.emr.spark.EmrSparkLogParser.replayFile(EmrSparkLogParser.scala:91)
    at com.amazonaws.emr.spark.EmrSparkLogParser.process(EmrSparkLogParser.scala:58)
    at com.amazonaws.emr.SparkLogsAnalyzer$.delayedEndpoint$com$amazonaws$emr$SparkLogsAnalyzer$1(SparkLogsAnalyzer.scala:52)
    at com.amazonaws.emr.SparkLogsAnalyzer$delayedInit$body.apply(SparkLogsAnalyzer.scala:11)
    at scala.Function0.apply$mcV$sp(Function0.scala:39)
    at scala.Function0.apply$mcV$sp$(Function0.scala:39)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1$adapted(App.scala:80)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at scala.App.main(App.scala:80)
    at scala.App.main$(App.scala:78)
    at com.amazonaws.emr.SparkLogsAnalyzer$.main(SparkLogsAnalyzer.scala:11)
    at com.amazonaws.emr.SparkLogsAnalyzer.main(SparkLogsAnalyzer.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:740)
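The trace shows the failure originating in ExecutorMetrics.getMetricValue, which looks the metric name up in a map and throws when the key is absent. As a rough illustration only (not the advisor's actual code, and SafeMetrics/safeMetric are hypothetical names), a listener could guard that lookup so event logs that do not carry a TotalGCTime metric do not abort the replay:

    import org.apache.spark.executor.ExecutorMetrics

    object SafeMetrics {
      // Hypothetical helper, not part of aws-emr-advisor: read an executor
      // metric defensively and fall back to 0 when the metric name is not
      // recognized (e.g. TotalGCTime with event logs from older Spark builds).
      def safeMetric(metrics: ExecutorMetrics, name: String): Long =
        try metrics.getMetricValue(name)
        catch { case _: NoSuchElementException => 0L }
    }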

ripani-aws commented 1 day ago

Hi @liangjun-jiang, I just tested a sample SparkPi application on an EMR 7.3.0 cluster and I don't see any problem with report generation or event log parsing. Do you use any custom configuration in your EMR cluster? If yes, can you share the configurations you're passing and provide more information on the release you're using, so the problem can be reproduced? Also, can you confirm that the event logs analyzed were not truncated? Typically, a complete event log terminates with a final event named SparkListenerApplicationEnd. A quick way to check for truncation is sketched below.
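A minimal sketch of that check, assuming an uncompressed, line-delimited JSON event log on the local filesystem (CheckEventLog is a hypothetical utility, not part of the advisor):

    import scala.io.Source

    object CheckEventLog extends App {
      // Hypothetical check, not part of aws-emr-advisor: a complete Spark
      // event log should end with a SparkListenerApplicationEnd event.
      val source = Source.fromFile(args(0))
      try {
        // Keep only the last line of the log.
        val lastLine = source.getLines().foldLeft("")((_, line) => line)
        if (lastLine.contains("SparkListenerApplicationEnd"))
          println("event log appears complete")
        else
          println(s"event log may be truncated; last event: $lastLine")
      } finally source.close()
    }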