FRosner / spawncamping-dds

Data-Driven Spark allows quick data exploration based on Apache Spark.

Support Spark 1.5.x #264

Closed ssimeonov closed 8 years ago

ssimeonov commented 8 years ago

On Spark 1.5.2, the show(golf) example from the README fails with:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.package$.Row()Lorg/apache/spark/sql/Row$;
    at de.frosner.dds.datasets.package$$anonfun$4.apply(package.scala:75)
    at de.frosner.dds.datasets.package$$anonfun$4.apply(package.scala:73)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)

FRosner commented 8 years ago

Thanks for reporting this issue, @ssimeonov. We are working heavily on #255 right now, so I am not sure when I will be able to bump the Spark version. It is still planned to happen this year, though 👍

Does that work for you or do you need it sooner? The other functions should work most of the time. The example datasets are giving me the most problems when upgrading.

ssimeonov commented 8 years ago

@FRosner I'm blocked on this; I guess it's back to SparkR and ggplot...

FRosner commented 8 years ago

@ssimeonov sorry to hear that. Have you tried working with your existing data instead of the example data set? It might be that only the example dataset functions are not working with 1.5.x, so you might want to give it a try.

However, our plan is ultimately to also offer a front-end integration with SparkR and PySpark. This will be the next step after the integration in Spark Notebooks (which is Scala, still).

fabsta commented 8 years ago

Just adding, I am on Spark 1.5.1 and running show(golf) gives me: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.package$.Row()Lorg/apache/spark/sql/Row

FRosner commented 8 years ago

@fabsta yep, that's the same problem @ssimeonov is facing. It is caused by the changes in the SQL API from 1.3 to 1.4 to 1.5. As soon as I am done with #253 (part of #255), I will bump the Spark version to 1.5.
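
For anyone wondering why this shows up as a NoSuchMethodError at runtime rather than a compile error: the descriptor in the trace, package$.Row()Lorg/apache/spark/sql/Row$;, names a zero-argument method Row that the calling code was compiled against and that the JVM could not find on the class loaded at runtime. The sketch below only illustrates that mechanism with reflection. OldApi, NewApi and LinkageProbe are hypothetical stand-ins made up for this example, not Spark or DDS classes.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a library class as the caller saw it at compile time.
final class OldApi {
    // The zero-argument accessor the caller was compiled against.
    public static String row() { return "row-factory"; }
}

// Hypothetical stand-in for the same class after an upgrade: the accessor is gone.
final class NewApi {
    public static String create() { return "row-factory"; }
}

final class LinkageProbe {
    /** Returns true if the class exposes a public zero-argument method with this name. */
    static boolean hasZeroArgMethod(Class<?> type, String name) {
        try {
            // getMethod throws NoSuchMethodException when no matching public method exists.
            Method m = type.getMethod(name);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("old API has row(): " + hasZeroArgMethod(OldApi.class, "row"));
        System.out.println("new API has row(): " + hasZeroArgMethod(NewApi.class, "row"));
    }
}
```

Code compiled against the first shape and run against the second fails when the call site is first linked, which is consistent with the error only appearing once the example dataset code is actually executed.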

FRosner commented 8 years ago

@ssimeonov I am getting closer to completing the other tasks, so I can start on this one soon. In the meantime, I tried DDS with Spark 1.5.2 yesterday and the core functions seem to work. So you can stay on Spark 1.5.2 as long as you use your own data rather than the example dataset.

That said, it might be worth waiting a little, as we are currently rewriting the project completely to make it more modular. The UI will also be rewritten to make visualization configuration more consistent, and it will be pluggable into different notebook implementations (Spark Notebook, Zeppelin, etc.).

FRosner commented 8 years ago

@ssimeonov closing this one. The next release 4.0.0-alpha will contain support for Spark 1.5.x. Please check the release page tonight :)