HCADatalab / powderkeg

Live-coding the cluster!
Eclipse Public License 1.0

CIDER + Spark2 = ClassNotFoundException: com.sun.javadoc.ConstructorDoc #40

Open jasonjckn opened 7 years ago

jasonjckn commented 7 years ago
(ns powerkeg-test.core
  (:require [cider.nrepl]
            [clojure.main]
            [clojure.tools.nrepl.server]))

(defn main
  [& args]
  (let [port
        5557

        ;; Start an nREPL server with the CIDER middleware so an editor can connect.
        server
        (clojure.tools.nrepl.server/start-server
         :port port
         :bind "0.0.0.0"
         :handler cider.nrepl/cider-nrepl-handler)]

    (println "INIT NREPL server running on port" port)
    ;; Stop the nREPL server when the JVM exits.
    (.addShutdownHook (Runtime/getRuntime) (Thread. #(.close server))))

  ;; Keep the process alive with a terminal REPL.
  (clojure.main/main "-r"))

spark-submit --class powderkeg.repl powerkeg-test-0.1.0-SNAPSHOT-standalone.jar powerkeg-test.core/main

CIDER session

(ns powerkeg-test.test
  (:require [powderkeg.core :as keg]))

(into [] (keg/rdd (range 10)))

error =>

Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: com.sun.javadoc.ConstructorDoc
Serialization trace:
vars (powderkeg.core$barrier_BANG___1610$fn__1674$fn__1675)
fn (clojure.lang.Delay)
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:156)
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:133)
    at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:670)
    at powderkeg.kryo$read__97.invokeStatic(kryo.clj:31)
    at powderkeg.kryo$read__97.invoke(kryo.clj:30)
    at powderkeg.SerializerStub.read(SerializerStub.java:27)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at carbonite.serializer$read_map$fn__48.invoke(serializer.clj:82)
    at carbonite.serializer$read_map.invokeStatic(serializer.clj:76)
    at carbonite.serializer$read_map.invoke(serializer.clj:71)
    at clojure.lang.Var.invoke(Var.java:383)
    at carbonite.ClojureMapSerializer.read(ClojureMapSerializer.java:27)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at carbonite.serializer$read_map$fn__48.invoke(serializer.clj:83)
    at carbonite.serializer$read_map.invokeStatic(serializer.clj:76)
    at carbonite.serializer$read_map.invoke(serializer.clj:71)
    at clojure.lang.Var.invoke(Var.java:383)
    at carbonite.ClojureMapSerializer.read(ClojureMapSerializer.java:27)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at carbonite.serializer$mk_collection_reader$fn__36.invoke(serializer.clj:57)
    at clojure.lang.Var.invoke(Var.java:383)
    at carbonite.ClojureVecSerializer.read(ClojureVecSerializer.java:17)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at carbonite.serializer$read_map$fn__48.invoke(serializer.clj:83)
    at carbonite.serializer$read_map.invokeStatic(serializer.clj:76)
    at carbonite.serializer$read_map.invoke(serializer.clj:71)
    at clojure.lang.Var.invoke(Var.java:383)
    at carbonite.ClojureMapSerializer.read(ClojureMapSerializer.java:27)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at carbonite.serializer$read_map$fn__48.invoke(serializer.clj:83)
    at carbonite.serializer$read_map.invokeStatic(serializer.clj:76)
    at carbonite.serializer$read_map.invoke(serializer.clj:71)
    at clojure.lang.Var.invoke(Var.java:383)
    at carbonite.ClojureMapSerializer.read(ClojureMapSerializer.java:27)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:708)
    at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:551)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:790)
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:244)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$10.apply(TorrentBroadcast.scala:286)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1303)
    at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:287)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:221)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1269)
    ... 27 more
Caused by: java.lang.ClassNotFoundException: com.sun.javadoc.ConstructorDoc
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:154)
    ... 73 more

viesti commented 7 years ago

What version of Spark are you using? Just a hunch, but with Spark 2.x your project needs [com.esotericsoftware/kryo-shaded "4.0.0"] explicitly as a dependency in addition to powderkeg.

jasonjckn commented 7 years ago

I'm using Spark 2.1 from Google Cloud Dataproc, and Kryo 4.0.0.

jasonjckn commented 7 years ago
 :dependencies [[org.clojure/clojure "1.8.0"]
                [cider/cider-nrepl "0.15.0-SNAPSHOT"]
                [hcadatalab/powderkeg "0.5.1"]
                [com.esotericsoftware/kryo-shaded "4.0.0"]  ;; For Spark 2.x support
                [org.apache.spark/spark-core_2.11 "2.1.0"]
                [org.apache.spark/spark-streaming_2.11 "2.1.0"]]

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
Spark context Web UI available at http://10.142.0.7:4040
Spark context available as 'sc' (master = yarn, app id = application_1495147481256_0004).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.1.0
      /_/

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_121)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.version
res0: String = 2.1.0

scala>

cgrand commented 7 years ago

CIDER seems to have a dependency on tools.jar (the missing class is part of it), which is only available in a JDK, not a JRE. So I guess some namespaces should be blacklisted so that they are not sent to the workers.
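
To make the blacklisting idea concrete, here is a minimal sketch of filtering loaded namespaces by name prefix. It is not powderkeg's actual mechanism: the names `tooling-ns-prefixes`, `tooling-ns?` and `shippable-namespaces` are hypothetical, and the prefix list is only an assumption about which namespaces are editor tooling.

```clojure
(require '[clojure.string :as str])

;; Hypothetical blacklist: namespace-name prefixes assumed to be editor tooling
;; that workers never need. Adjust to whatever tooling is actually loaded.
(def tooling-ns-prefixes
  ["cider." "clojure.tools.nrepl."])

(defn tooling-ns?
  "True when the namespace symbol's name starts with a blacklisted prefix."
  [ns-sym]
  (let [n (name ns-sym)]
    (boolean (some #(str/starts-with? n %) tooling-ns-prefixes))))

(defn shippable-namespaces
  "Names (symbols) of all currently loaded namespaces that are not blacklisted."
  []
  (remove tooling-ns? (map ns-name (all-ns))))
```

Where such a filter would have to be applied inside powderkeg is left open here; the sketch only shows how the tooling namespaces could be told apart from application code.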

jasonjckn commented 7 years ago

According to the above, I'm on a JDK: "Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_121)"
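
The banner reports the VM name and Java version, so a more direct check is to ask the JVM whether the class from the stack trace can be loaded at all. The sketch below does that; `class-available?` is a hypothetical helper name, and it only tests the JVM it is evaluated in (for example the driver REPL), not the workers.

```clojure
(defn class-available?
  "True when class-name can be loaded from the calling code's classloader."
  [class-name]
  (try
    (Class/forName class-name)
    true
    (catch ClassNotFoundException _ false)
    (catch NoClassDefFoundError _ false)))

;; The class Kryo failed to find on the worker:
(class-available? "com.sun.javadoc.ConstructorDoc")
```

Evaluating the same form in a plain REPL on a worker node would show whether tools.jar is visible there.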

Maybe Kryo serialization could ignore com.sun.javadoc.ConstructorDoc, or have it registered, although I have limited knowledge here.