tresata / spark-sorted

Secondary sort and streaming reduce for Apache Spark
Apache License 2.0

Getting a strange exception... #8

Closed du291 closed 8 years ago

du291 commented 8 years ago

Code:

```scala
val result = rdd
  .groupSort(Ordering.by{ ...some logic... })
  .scanLeftByKey(... initial value ...){ ... some logic ... }
  .flatMap{ ... }
  .filter{ ... }
```
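(Aside for readers: the per-key semantics of `groupSort` followed by `scanLeftByKey` can be sketched in plain Scala without Spark. The helper name `scanLeftByKeySketch` below is illustrative and not part of the spark-sorted API; it collects everything locally, so it only mirrors the logical result, not the distributed execution.)

```scala
// Plain-Scala sketch (no Spark) of what groupSort + scanLeftByKey compute
// per key: values are sorted within each key (secondary sort), then scanned
// left with an initial value. Illustrative only, not the spark-sorted API.
def scanLeftByKeySketch[K, V, B](
    pairs: Seq[(K, V)],
    init: B
)(ord: Ordering[V])(op: (B, V) => B): Seq[(K, B)] =
  pairs
    .groupBy(_._1)                            // group values per key
    .toSeq
    .flatMap { case (k, kvs) =>
      val sorted = kvs.map(_._2).sorted(ord)  // secondary sort within the key
      sorted.scanLeft(init)(op).map(b => (k, b))
    }
```

For example, running sums per key: `scanLeftByKeySketch(Seq(("a", 2), ("a", 1)), 0)(Ordering.Int)(_ + _)` yields `("a", 0), ("a", 1), ("a", 3)` — note that, like Scala's `scanLeft`, this sketch emits the initial value as well.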

Exception:

```
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.rdd.ShuffledRDD.<init>(Lorg/apache/spark/rdd/RDD;Lorg/apache/spark/Partitioner;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;Lscala/reflect/ClassTag;)V
	at com.tresata.spark.sorted.PairRDDFunctions.groupSort(PairRDDFunctions.scala:29)
	at com.tresata.spark.sorted.PairRDDFunctions.groupSort(PairRDDFunctions.scala:48)
	at spikes.MyJob$.main(MyJob.scala:15)
	at spikes.MyJob.main(MyJob.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:674)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```

Spark version 1.5.2. Any ideas?

koertkuipers commented 8 years ago

what version of spark-sorted?

du291 commented 8 years ago

Thanks for the quick answer. I use 0.6.0 from maven central w/ scala 2.10.

koertkuipers commented 8 years ago

looks like spark changed the ShuffledRDD constructor (added implicit class tags) between versions 1.5.x and 1.6.x, which is causing this issue. spark-sorted 0.6.0 is compiled against spark 1.6.1:

```
diff spark-1.5.2/core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala spark-1.6.1/core/src/main/scala/org/apache/spark/rdd/ShuffledRDD.scala
> import scala.reflect.ClassTag
>
40c42
< class ShuffledRDD[K, V, C](
---
> class ShuffledRDD[K: ClassTag, V: ClassTag, C: ClassTag](
84a87,92
```
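For anyone hitting this later, a note on why the change breaks linkage: a context bound such as `K: ClassTag` desugars to an extra implicit constructor parameter, so it changes the JVM-level constructor descriptor. A minimal illustration (class names here are made up, not Spark's):

```scala
import scala.reflect.ClassTag

// Spark 1.5.x style: no class tags in the constructor
class OldStyle[K](val x: K)
// descriptor roughly: <init>(Ljava/lang/Object;)V

// Spark 1.6.x style: the context bound adds an implicit ClassTag parameter
class NewStyle[K: ClassTag](val x: K)
// descriptor roughly: <init>(Ljava/lang/Object;Lscala/reflect/ClassTag;)V

// bytecode compiled against NewStyle's constructor fails to link against
// OldStyle with java.lang.NoSuchMethodError, matching the trace above
```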

if you have the ability to publish in-house artifacts you can create a version of spark-sorted compiled against spark 1.5.x (i expect no issues here), or alternatively upgrade to spark 1.6.x yourself.
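if it helps anyone doing the in-house build, a hypothetical `build.sbt` sketch for compiling against 1.5.x (module name and versions are illustrative, not the actual spark-sorted build definition):

```scala
// hypothetical build.sbt sketch; adjust to the real spark-sorted build
name := "spark-sorted"
version := "0.6.0-spark1.5"  // custom in-house version marker
scalaVersion := "2.10.6"

// compile against the Spark version actually deployed on the cluster,
// marked "provided" so it is not bundled into the artifact
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" % "provided"
```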

du291 commented 8 years ago

Got it. Will use 0.5.0, which seems to work, until I get a chance to upgrade Spark or recompile.

koertkuipers commented 8 years ago

ok good luck