BMacster opened this issue 7 years ago
The above stack trace was from a SparkPi execution. Below is a stack trace from the Oozie-scheduled word count example run.
[error] o.a.s.s.ReplayListenerBus - Exception parsing Spark event log: application_1502904041499_0360_1 org.json4s.package$MappingException: Did not find value which can be converted into boolean at org.json4s.reflect.package$.fail(package.scala:96) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.Extraction$.convert(Extraction.scala:554) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.Extraction$.extract(Extraction.scala:331) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.Extraction$.extract(Extraction.scala:42) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.apache.spark.util.JsonProtocol$.storageLevelFromJson(JsonProtocol.scala:826) ~[org.apache.spark.spark-core_2.10-1.4.0.jar:1.4.0] [error] o.a.s.s.ReplayListenerBus - Malformed line #37: {"Event":"SparkListenerJobStart","Job ID":0,"Submission Time":1503946845694,"Stage Infos":[{"Stage ID":0,"Stage Attempt ID":0,"Stage Name":"reduceByKey at wc-war-and-peace.py:18","Number of Tasks":20,"RDD Info":[{"RDD ID":3,"Name":"PairwiseRDD","Callsite":"reduceByKey at wc-war-and-peace.py:18","Parent IDs":[2],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":1,"Name":"hdfs:///war-and-peace.txt","Scope":"{\"id\":\"0\",\"name\":\"textFile\"}","Callsite":"textFile at NativeMethodAccessorImpl.java:0","Parent IDs":[0],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":2,"Name":"PythonRDD","Callsite":"reduceByKey at wc-war-and-peace.py:18","Parent IDs":[1],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":0,"Name":"hdfs:///war-and-peace.txt","Scope":"{\"id\":\"0\",\"name\":\"textFile\"}","Callsite":"textFile at NativeMethodAccessorImpl.java:0","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.rdd.RDD.<init>(RDD.scala:104)\norg.apache.spark.api.python.PairwiseRDD.<init>(PythonRDD.scala:386)\nsun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)\nsun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)\nsun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)\njava.lang.reflect.Constructor.newInstance(Constructor.java:423)\npy4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)\npy4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\npy4j.Gateway.invoke(Gateway.java:236)\npy4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)\npy4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)\npy4j.GatewayConnection.run(GatewayConnection.java:214)\njava.lang.Thread.run(Thread.java:745)","Accumulables":[]},{"Stage ID":1,"Stage Attempt ID":0,"Stage Name":"saveAsTextFile at NativeMethodAccessorImpl.java:0","Number of Tasks":20,"RDD Info":[{"RDD 
ID":8,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"4\",\"name\":\"saveAsTextFile\"}","Callsite":"saveAsTextFile at NativeMethodAccessorImpl.java:0","Parent IDs":[7],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":6,"Name":"PythonRDD","Callsite":"RDD at PythonRDD.scala:48","Parent IDs":[5],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":4,"Name":"ShuffledRDD","Scope":"{\"id\":\"1\",\"name\":\"partitionBy\"}","Callsite":"partitionBy at NativeMethodAccessorImpl.java:0","Parent IDs":[3],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":5,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"2\",\"name\":\"mapPartitions\"}","Callsite":"mapPartitions at PythonRDD.scala:422","Parent IDs":[4],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":7,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"3\",\"name\":\"map\"}","Callsite":"map at NativeMethodAccessorImpl.java:0","Parent IDs":[6],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":20,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[0],"Details":"org.apache.spark.api.java.AbstractJavaRDDLike.saveAsTextFile(JavaRDDLike.scala:45)\nsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\nsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\nsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.lang.reflect.Method.invoke(Method.java:498)\npy4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)\npy4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)\npy4j.Gateway.invoke(Gateway.java:280)\npy4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)\npy4j.commands.CallCommand.execute(CallCommand.java:79)\npy4j.GatewayConnection.run(GatewayConnection.java:214)\njava.lang.Thread.run(Thread.java:745)","Accumulables":[]}],"Stage IDs":[0,1],"Properties":{"spark.rdd.scope.noOverride":"true","spark.rdd.scope":"{\"id\":\"4\",\"name\":\"saveAsTextFile\"}"}}
Looks like a call to storageLevelFromJson, which does:

val useExternalBlockStore = (json \ "Use ExternalBlockStore").toSome.getOrElse(json \ "Use Tachyon").extract[Boolean]

The "Storage Level" objects in the event log above contain neither "Use ExternalBlockStore" nor "Use Tachyon" (only Use Disk, Use Memory, Deserialized, and Replication), so extract[Boolean] has nothing to convert. This appears to be an older version of org.apache.spark.util.JsonProtocol in use by org.apache.spark.scheduler (the stack trace points at spark-core_2.10-1.4.0).
Not sure where to go from here...
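To see the failure outside the scheduler, here is a minimal, self-contained sketch (my own illustration, not code from this project or from Spark) assuming json4s 3.x with the jackson module on the classpath. The "Storage Level" value is copied from the event logs pasted in this issue; because it has neither "Use ExternalBlockStore" nor "Use Tachyon", the extraction hits JNothing and json4s raises the exact "Did not find value which can be converted into boolean" MappingException from the traces, while a lookup that defaults the missing field to false goes through:

```scala
// Minimal sketch, assuming json4s 3.x (json4s-core + json4s-jackson) on the classpath.
// Not code from this project or from Spark; just a standalone reproduction of the
// extraction that storageLevelFromJson performs on a "Storage Level" object.
import org.json4s._
import org.json4s.jackson.JsonMethods.parse

object StorageLevelExtractionSketch {
  implicit val formats: Formats = DefaultFormats

  def main(args: Array[String]): Unit = {
    // "Storage Level" exactly as it appears in the event logs pasted above:
    // no "Use ExternalBlockStore" key and no "Use Tachyon" key.
    val storageLevel = parse(
      """{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1}""")

    // What the quoted extraction boils down to (toOption behaves like the quoted
    // toSome when the field is simply absent): both lookups return JNothing, and
    // extract[Boolean] on JNothing throws
    // "Did not find value which can be converted into boolean".
    try {
      (storageLevel \ "Use ExternalBlockStore").toOption
        .getOrElse(storageLevel \ "Use Tachyon")
        .extract[Boolean]
    } catch {
      case e: MappingException => println(s"Fails as in the log: ${e.getMessage}")
    }

    // A tolerant read that treats the missing field as false (illustration only,
    // not a patch to JsonProtocol).
    val useExternalBlockStore = (storageLevel \ "Use ExternalBlockStore").toOption
      .orElse((storageLevel \ "Use Tachyon").toOption)
      .map(_.extract[Boolean])
      .getOrElse(false)
    println(s"Tolerant read: useExternalBlockStore = $useExternalBlockStore")
  }
}
```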
So, did you solve that? I'm hitting the same issue.
Seeing similar issues building against Spark 2.2 / Hadoop 2.6.0 (with hadoop_version=2.3.0 and spark_version=1.4.0 in compile.conf):
2018-02-24 12:01:48,577 - [ERROR] - from org.apache.spark.scheduler.ReplayListenerBus in ForkJoinPool-1-worker-9
Exception parsing Spark event log: application_1511410366505_3757
org.json4s.package$MappingException: Did not find value which can be converted into boolean
at org.json4s.reflect.package$.fail(package.scala:96) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
at org.json4s.Extraction$.convert(Extraction.scala:554) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
at org.json4s.Extraction$.extract(Extraction.scala:331) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
at org.json4s.Extraction$.extract(Extraction.scala:42) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10]
at org.apache.spark.util.JsonProtocol$.storageLevelFromJson(JsonProtocol.scala:826) ~[org.apache.spark.spark-core_2.10-1.4.0.jar:1.4.0]
at org.apache.spark.util.JsonProtocol$.rddInfoFromJson(JsonProtocol.scala:804) ~[org.apache.spark.spark-core_2.10-1.4.0.jar:1.4.0]
at org.apache.spark.util.JsonProtocol$$anonfun$51.apply(JsonProtocol.scala:608) ~[org.apache.spark.spark-core_2.10-1.4.0.jar:1.4.0]
at org.apache.spark.util.JsonProtocol$$anonfun$51.apply(JsonProtocol.scala:608) ~[org.apache.spark.spark-core_2.10-1.4.0.jar:1.4.0]
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) ~[org.scala-lang.scala-library-2.10.4.jar:na]
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) ~[org.scala-lang.scala-library-2.10.4.jar:na]
at scala.collection.immutable.List.foreach(List.scala:318) ~[org.scala-lang.scala-library-2.10.4.jar:na]
...
Malformed line #5:{
"Event":"SparkListenerJobStart",
"Job ID":0,
"Submission Time":1519226641692,
"Stage Infos":[
{
"Stage ID":0,
"Stage Attempt ID":0,
"Stage Name":"insertInto at StageToDM.scala:80",
"Number of Tasks":44,
"RDD Info":[
{
"RDD ID":2,
"Name":"MapPartitionsRDD",
"Scope":"{\"id\":\"0\",\"name\":\"ExecutedCommand\"}",
"Callsite":"insertInto at StageToDM.scala:80",
"Parent IDs":[
1
],
"Storage Level":{
"Use Disk":false,
"Use Memory":false,
"Deserialized":false,
"Replication":1
},
"Number of Partitions":44,
"Number of Cached Partitions":0,
"Memory Size":0,
"Disk Size":0
},
{
"RDD ID":1,
"Name":"MapPartitionsRDD",
"Scope":"{\"id\":\"0\",\"name\":\"ExecutedCommand\"}",
"Callsite":"insertInto at StageToDM.scala:80",
"Parent IDs":[
0
],
"Storage Level":{
"Use Disk":false,
"Use Memory":false,
"Deserialized":false,
"Replication":1
},
"Number of Partitions":44,
"Number of Cached Partitions":0,
"Memory Size":0,
"Disk Size":0
},
{
"RDD ID":0,
"Name":"ParallelCollectionRDD",
"Scope":"{\"id\":\"0\",\"name\":\"ExecutedCommand\"}",
"Callsite":"insertInto at StageToDM.scala:80",
"Parent IDs":[
],
"Storage Level":{
"Use Disk":false,
"Use Memory":false,
"Deserialized":false,
"Replication":1
},
"Number of Partitions":44,
"Number of Cached Partitions":0,
"Memory Size":0,
"Disk Size":0
}
],
"Parent IDs":[
],
"Details":"org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:269)\ncom.calliduscloud.thunderbridge.analytics.spark.StageToDM$.main(StageToDM.scala:80)\ncom.calliduscloud.thunderbridge.analytics.spark.StageToDM.main(StageToDM.scala)\nsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\nsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\nsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.lang.reflect.Method.invoke(Method.java:498)\norg.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)\norg.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)\norg.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)\norg.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)\norg.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)",
"Accumulables":[
]
}
],
"Stage IDs":[
0
],
"Properties":{
"spark.rdd.scope.noOverride":"true",
"spark.rdd.scope":"{\"id\":\"0\",\"name\":\"ExecutedCommand\"}"
}
}
For reference, the query being evaluated (from the SparkListenerSQLExecutionStart event, which is also flagged as malformed):
Malformed line #5: {"Event":"org.apache.spark.sql.execution.ui.SparkListenerSQLExecutionStart","executionId":0,"description":"insertInto at StageToDM.scala:80","details":"org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:269)
com.calliduscloud.thunderbridge.analytics.spark.StageToDM$.main(StageToDM.scala:80)
com.calliduscloud.thunderbridge.analytics.spark.StageToDM.main(StageToDM.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)","physicalPlanDescription":"== Parsed Logical Plan ==
Project [cast((CAST(row_number() OVER (ORDER BY tenantid ASC NULLS FIRST, salestransaction_sk ASC NULLS FIRST, payeeSeq ASC NULLS FIRST, positionSeq ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS BIGINT) + max_sk)#479L as bigint) AS patransaction_sk#1086L, pa_sk#388L, cast(salesTransactionSeq#262L as bigint) AS salestransactionseq#1087L, cast(salesOrderSeq#263L as bigint) AS salesorderseq#1088L, cast(2533274790396085#478L as bigint) AS periodseq#1089L, cast(payeeSeq#392L as bigint) AS payeeseq#1090L, cast(positionSeq#393L as bigint) AS positionseq#1091L, cast(orderId#265 as string) AS orderid#1092, cast(lineNumber#266L as bigint) AS linenumber#1093L, cast(unitNameForLineNumber#267 as string) AS unitnameforlinenumber#1094, cast(unitClassForLineNumber#268 as string) AS unitclassforlinenumber#1095, cast(subLineNumber#269L as bigint) AS sublinenumber#1096L, cast(unitNameForSubLineNumber#270 as string) AS unitnameforsublinenumber#1097, cast(unitClassForSubLineNumber#271 as string) AS unitclassforsublinenumber#1098, cast(eventTypeId#272 as string) AS eventtypeid#1099, cast(originTypeId#273 as string) AS origintypeid#1100, cast(compensationDate#274 as timestamp) AS compensationdate#1101, cast(compensationDate_sk#275L as bigint) AS compensationdate_sk#1102L, cast(billToaddressseq#276L as bigint) AS billtoaddressseq#1103L, cast(billToCustid#277 as string) AS billtocustid#1104, cast(shipToAddressseq#278L as bigint) AS shiptoaddressseq#1105L, cast(shipToCustid#279 as string) AS shiptocustid#1106, cast(otherToAddressseq#280L as bigint) AS othertoaddressseq#1107L, cast(otherToCustid#281 as string) AS othertocustid#1108, ... 105 more fields]
+- Project [(CAST(row_number() OVER (ORDER BY tenantid ASC NULLS FIRST, salestransaction_sk ASC NULLS FIRST, payeeSeq ASC NULLS FIRST, positionSeq ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS BIGINT) + max_sk)#479L, pa_sk#388L, salesTransactionSeq#262L, salesOrderSeq#263L, 2533274790396085#478L, payeeSeq#392L, positionSeq#393L, orderId#265, lineNumber#266L, unitNameForLineNumber#267, unitClassForLineNumber#268, subLineNumber#269L, unitNameForSubLineNumber#270, unitClassForSubLineNumber#271, eventTypeId#272, originTypeId#273, compensationDate#274, compensationDate_sk#275L, billToaddressseq#276L, billToCustid#277, shipToAddressseq#278L, shipToCustid#279, otherToAddressseq#280L, otherToCustid#281, ... 105 more fields]
+- Project [pa_sk#388L, salesTransactionSeq#262L, salesOrderSeq#263L, 2533274790396085#478L, payeeSeq#392L, positionSeq#393L, orderId#265, lineNumber#266L, unitNameForLineNumber#267, unitClassForLineNumber#268, subLineNumber#269L, unitNameForSubLineNumber#270, unitClassForSubLineNumber#271, eventTypeId#272, originTypeId#273, compensationDate#274, compensationDate_sk#275L, billToaddressseq#276L, billToCustid#277, shipToAddressseq#278L, shipToCustid#279, otherToAddressseq#280L, otherToCustid#281, isRunnable#282L, ... 108 more fields]
+- Window [row_number() windowspecdefinition(tenantid#383 ASC NULLS FIRST, salestransaction_sk#261L ASC NULLS FIRST, payeeSeq#392L ASC NULLS FIRST, positionSeq#393L ASC NULLS FIRST, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS _we0#480], [tenantid#383 ASC NULLS FIRST, salestransaction_sk#261L ASC NULLS FIRST, payeeSeq#392L ASC NULLS FIRST, positionSeq#393L ASC NULLS FIRST]
+- Project [pa_sk#388L, salesTransactionSeq#262L, salesOrderSeq#263L, 2533274790396085 AS 2533274790396085#478L, payeeSeq#392L, positionSeq#393L, orderId#265, lineNumber#266L, unitNameForLineNumber#267, unitClassForLineNumber#268, subLineNumber#269L, unitNameForSubLineNumber#270, unitClassForSubLineNumber#271, eventTypeId#272, originTypeId#273, compensationDate#274, compensationDate_sk#275L, billToaddressseq#276L, billToCustid#277, shipToAddressseq#278L, shipToCustid#279, otherToAddressseq#280L, otherToCustid#281, isRunnable#282L, ... 106 more fields]
+- Join Inner, (((salestransactionseq#262L = salestransactionseq#395L) && (pc_tenant#384 = DEV2)) && ((pc_processingunit#385L = 38280596832649318) && (pc_period#386L = 2533274790396085)))
:- Join Cross
: :- SubqueryAlias txn
: : +- SubqueryAlias factsalestransactioncube
: : +- Relation[salestransaction_sk#261L,salestransactionseq#262L,salesorderseq#263L,periodseq#264L,orderid#265,linenumber#266L,unitnameforlinenumber#267,unitclassforlinenumber#268,sublinenumber#269L,unitnameforsublinenumber#270,unitclassforsublinenumber#271,eventtypeid#272,origintypeid#273,compensationdate#274,compensationdate_sk#275L,billtoaddressseq#276L,billtocustid#277,shiptoaddressseq#278L,shiptocustid#279,othertoaddressseq#280L,othertocustid#281,isrunnable#282L,accountingdate#283,productid#284,... 102 more fields] parquet
: +- SubqueryAlias sk
: +- Aggregate [if (isnull(max(patransaction_sk#1L))) cast(0 as bigint) else max(patransaction_sk#1L) AS max_sk#0L]
: +- Filter (pc_tenant#127 = DEV2)
: +- SubqueryAlias factpatransactioncube
: +- Relation[patransaction_sk#1L,pa_sk#2L,salestransactionseq#3L,salesorderseq#4L,periodseq#5L,payeeseq#6L,positionseq#7L,orderid#8,linenumber#9L,unitnameforlinenumber#10,unitclassforlinenumber#11,sublinenumber#12L,unitnameforsublinenumber#13,unitclassforsublinenumber#14,eventtypeid#15,origintypeid#16,compensationdate#17,compensationdate_sk#18L,billtoaddressseq#19L,billtocustid#20,shiptoaddressseq#21L,shiptocustid#22,othertoaddressseq#23L,othertocustid#24,... 105 more fields] parquet
+- SubqueryAlias cr_dist
+- Distinct
+- Project [salesTransactionSeq#395L, positionSeq#393L, payeeSeq#392L, pa_sk#388L, positiongroup#472]
+- Filter (((pc_tenant#474 = DEV2) && (pc_processingunit#475L = 38280596832649318)) && (pc_period#476L = 2533274790396085))
+- SubqueryAlias factcreditcube
+- Relation[credit_sk#387L,pa_sk#388L,salestransaction_sk#389L,releasedate_sk#390L,creditseq#391L,payeeseq#392L,positionseq#393L,salesorderseq#394L,salestransactionseq#395L,orderid#396,orderlevelcredit#397,directcredit#398,periodseq#399L,credittypeseq#400L,credittypeid#401,name#402,pipelinerunseq#403L,origintypeid#404,compensationdate#405,pipelinerundate#406,businessunitmap#407L,preadjustedvalue#408,unitnameforpreadjustedvalue#409,unitclassforpreadjustedvalue#410,... 66 more fields] parquet
== Analyzed Logical Plan ==
patransaction_sk: bigint, pa_sk: bigint, salestransactionseq: bigint, salesorderseq: bigint, periodseq: bigint, payeeseq: bigint, positionseq: bigint, orderid: string, linenumber: bigint, unitnameforlinenumber: string, unitclassforlinenumber: string, sublinenumber: bigint, unitnameforsublinenumber: string, unitclassforsublinenumber: string, eventtypeid: string, origintypeid: string, compensationdate: timestamp, compensationdate_sk: bigint, billtoaddressseq: bigint, billtocustid: string, shiptoaddressseq: bigint, shiptocustid: string, othertoaddressseq: bigint, othertocustid: string, ... 105 more fields
Project [cast((CAST(row_number() OVER (ORDER BY tenantid ASC NULLS FIRST, salestransaction_sk ASC NULLS FIRST, payeeSeq ASC NULLS FIRST, positionSeq ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS BIGINT) + max_sk)#479L as bigint) AS patransaction_sk#1086L, pa_sk#388L, cast(salesTransactionSeq#262L as bigint) AS salestransactionseq#1087L, cast(salesOrderSeq#263L as bigint) AS salesorderseq#1088L, cast(2533274790396085#478L as bigint) AS periodseq#1089L, cast(payeeSeq#392L as bigint) AS payeeseq#1090L, cast(positionSeq#393L as bigint) AS positionseq#1091L, cast(orderId#265 as string) AS orderid#1092, cast(lineNumber#266L as bigint) AS linenumber#1093L, cast(unitNameForLineNumber#267 as string) AS unitnameforlinenumber#1094, cast(unitClassForLineNumber#268 as string) AS unitclassforlinenumber#1095, cast(subLineNumber#269L as bigint) AS sublinenumber#1096L, cast(unitNameForSubLineNumber#270 as string) AS unitnameforsublinenumber#1097, cast(unitClassForSubLineNumber#271 as string) AS unitclassforsublinenumber#1098, cast(eventTypeId#272 as string) AS eventtypeid#1099, cast(originTypeId#273 as string) AS origintypeid#1100, cast(compensationDate#274 as timestamp) AS compensationdate#1101, cast(compensationDate_sk#275L as bigint) AS compensationdate_sk#1102L, cast(billToaddressseq#276L as bigint) AS billtoaddressseq#1103L, cast(billToCustid#277 as string) AS billtocustid#1104, cast(shipToAddressseq#278L as bigint) AS shiptoaddressseq#1105L, cast(shipToCustid#279 as string) AS shiptocustid#1106, cast(otherToAddressseq#280L as bigint) AS othertoaddressseq#1107L, cast(otherToCustid#281 as string) AS othertocustid#1108, ... 105 more fields]
+- Project [(CAST(row_number() OVER (ORDER BY tenantid ASC NULLS FIRST, salestransaction_sk ASC NULLS FIRST, payeeSeq ASC NULLS FIRST, positionSeq ASC NULLS FIRST ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS BIGINT) + max_sk)#479L, pa_sk#388L, salesTransactionSeq#262L, salesOrderSeq#263L, 2533274790396085#478L, payeeSeq#392L, positionSeq#393L, orderId#265, lineNumber#266L, unitNameForLineNumber#267, unitClassForLineNumber#268, subLineNumber#269L, unitNameForSubLineNumber#270, unitClassForSubLineNumber#271, eventTypeId#272, originTypeId#273, compensationDate#274, compensationDate_sk#275L, billToaddressseq#276L, billToCustid#277, shipToAddressseq#278L, shipToCustid#279, otherToAddressseq#280L, otherToCustid#281, ... 105 more fields]
+- Project [pa_sk#388L, salesTransactionSeq#262L, salesOrderSeq#263L, 2533274790396085#478L, payeeSeq#392L, positionSeq#393L, orderId#265, lineNumber#266L, unitNameForLineNumber#267, unitClassForLineNumber#268, subLineNumber#269L, unitNameForSubLineNumber#270, unitClassForSubLineNumber#271, eventTypeId#272, originTypeId#273, compensationDate#274, compensationDate_sk#275L, billToaddressseq#276L, billToCustid#277, shipToAddressseq#278L, shipToCustid#279
I also hit the same issue; were you able to resolve it?
Has anyone fixed this issue?
Two recent compiles, one using sbt.version 0.13.9 and the other 0.13.2 in build.properties. Both were compiled on CentOS 7 with Activator 1.3.12; the cluster is an Ambari HDP 2.6.1 stack. Both have the values below in compile.conf:
hadoop_version=2.6.1
spark_version=2.1.1
play_opts="-Dsbt.repository.config=app-conf/resolver.conf"
These are two separate compilations on two separate boxes that are members of the same Ambari cluster (HDP stack 2.6.1), and both hosts fire the error below. Note that I am only executing the included Spark test jobs (the war-and-peace word count and SparkPi).
[error] o.a.s.s.ReplayListenerBus - Exception parsing Spark event log: application_1502904041499_0358 org.json4s.package$MappingException: Did not find value which can be converted into boolean at org.json4s.reflect.package$.fail(package.scala:96) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.Extraction$.convert(Extraction.scala:554) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.Extraction$.extract(Extraction.scala:331) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.Extraction$.extract(Extraction.scala:42) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:21) ~[org.json4s.json4s-core_2.10-3.2.10.jar:3.2.10] at org.apache.spark.util.JsonProtocol$.storageLevelFromJson(JsonProtocol.scala:826) ~[org.apache.spark.spark-core_2.10-1.4.0.jar:1.4.0] [error] o.a.s.s.ReplayListenerBus - Malformed line #21: {"Event":"SparkListenerJobStart","Job ID":0,"Submission Time":1503944198210,"Stage Infos":[{"Stage ID":0,"Stage Attempt ID":0,"Stage Name":"reduce at SparkPi.scala:38","Number of Tasks":100000,"RDD Info":[{"RDD ID":1,"Name":"MapPartitionsRDD","Scope":"{\"id\":\"1\",\"name\":\"map\"}","Callsite":"map at SparkPi.scala:34","Parent IDs":[0],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":100000,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0},{"RDD ID":0,"Name":"ParallelCollectionRDD","Scope":"{\"id\":\"0\",\"name\":\"parallelize\"}","Callsite":"parallelize at SparkPi.scala:34","Parent IDs":[],"Storage Level":{"Use Disk":false,"Use Memory":false,"Deserialized":false,"Replication":1},"Number of Partitions":100000,"Number of Cached Partitions":0,"Memory Size":0,"Disk Size":0}],"Parent IDs":[],"Details":"org.apache.spark.rdd.RDD.reduce(RDD.scala:1008)\norg.apache.spark.examples.SparkPi$.main(SparkPi.scala:38)\norg.apache.spark.examples.SparkPi.main(SparkPi.scala)\nsun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\nsun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\nsun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\njava.lang.reflect.Method.invoke(Method.java:498)\norg.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:750)\norg.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)\norg.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)\norg.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)\norg.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)","Accumulables":[]}],"Stage IDs":[0],"Properties":{"spark.rdd.scope.noOverride":"true","spark.rdd.scope":"{\"id\":\"2\",\"name\":\"reduce\"}"}}
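Pulling the common thread out of the traces above: in every case the replay is done by the JsonProtocol in spark-core_2.10-1.4.0, while the event logs were written by a Spark 2.x cluster (2.1.1 on HDP 2.6.1 here, Spark 2.2 in the earlier comment), and the "Storage Level" entries in those logs carry only Use Disk, Use Memory, Deserialized, and Replication. A minimal sketch of the compile.conf keys quoted in this thread, aligned with the cluster versions, is below; the values are just the ones already posted above, and whether the packaged spark-core jar actually tracks spark_version (rather than staying at 1.4.0, as the stack traces suggest) is the part that still needs verifying.

```
# compile.conf (sketch only; keys and values are the ones quoted in this thread)
hadoop_version=2.6.1
spark_version=2.1.1
play_opts="-Dsbt.repository.config=app-conf/resolver.conf"
```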