HCADatalab / powderkeg

Live-coding the cluster!
Eclipse Public License 1.0

Implementations of Protocol methods are not sent to the nodes #11

Closed myu07 closed 7 years ago

myu07 commented 7 years ago

When running in yarn-client mode, I'm getting an exception when invoking clojure.data.json's write-str function.

;; Assumes the aliases (require '[powderkeg.core :as keg]
;;                              '[clojure.data.json :as json])
(into [] (-> (vector (hash-map :a 5))
             (keg/rdd (map #(json/write-str %)))))

java.lang.Exception: Don't know how to write JSON of class java.lang.Long

I was able to narrow down the root cause: the implementations of the "-write" method of clojure.data.json's JSONWriter protocol do not exist on the nodes.

(extend java.lang.Long JSONWriter {:-write write-plain})

I was able to verify this by executing these few lines of code in the REPL:

(defprotocol NumProtocol
  (-write-number [object]
    "My Protocol"))

(defn- write-long [x]
  (str "Long val: " x))

(extend java.lang.Long NumProtocol {:-write-number write-long})

(into [] (-> (vector 6)
             (keg/rdd (map #(-write-number %)))))

IllegalArgumentException No implementation of method: :-write-number of protocol: #'user/NumProtocol found for class: java.lang.Long

myu07 commented 7 years ago

FYI, I'm still getting this problem with 0.4.3-SNAPSHOT.

cgrand commented 7 years ago

This issue surprised me a lot, as we do use protocols. I've narrowed the problem down: it appears only with boxed scalar types. It's a ser/deser issue, not specific to protocols:

=> (into [] (keg/rdd [Boolean Long Double Float Byte Character] (map identity)))
[boolean long double float byte char]
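A purely local sketch (no cluster and no powderkeg involved) of why swapping a boxed class for its primitive counterpart could produce the exception reported above. It rests on an assumption that the thread does not confirm: that the class a protocol implementation is registered under gets rewritten in the same way as the classes in the REPL output. The names `DemoProtocol`, `MisplacedProtocol` and `write-long*` are hypothetical and exist only for this illustration.

```clojure
;; Local illustration only; nothing here touches Spark or powderkeg.
(defprotocol DemoProtocol
  (-write-demo [object] "Hypothetical protocol extended to the boxed class."))

(defprotocol MisplacedProtocol
  (-write-misplaced [object] "Hypothetical protocol extended to the primitive class."))

(defn write-long* [x]
  (str "Long val: " x))

;; The boxed class and the primitive class are distinct Class objects.
(= Long Long/TYPE) ;=> false

;; Registered under java.lang.Long: dispatch finds the implementation.
(extend java.lang.Long DemoProtocol {:-write-demo write-long*})
(-write-demo 6) ;=> "Long val: 6"

;; Registered under the primitive class `long` (assumption: this mimics what
;; the nodes see after deserialization). Values reaching a protocol function
;; are always boxed, so the lookup for java.lang.Long finds nothing.
(extend Long/TYPE MisplacedProtocol {:-write-misplaced write-long*})

(def misplaced-result
  (try
    (-write-misplaced 6)
    (catch IllegalArgumentException _
      :no-impl)))

misplaced-result ;=> :no-impl
```

If that assumption holds, it would explain why protocols extended to records or other non-scalar types keep working while extensions to `java.lang.Long` and the other boxed types do not.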