Closed clojurians-org closed 7 years ago
You should look into keg/join which happens to use this static method. Moreover when you want the pair RDD to be partitioned, you should use keg/by-key instead of keg/rdd. -- On Clojure http://clj-me.cgrand.net/ Clojure Programming http://clojurebook.com Training, Consulting & Contracting http://lambdanext.eu/
i use the keg/join, but it lead to google Optional exception.
(into [] (keg/join (keg/rdd {:a [1 11] :b 2}) :or "x" (keg/rdd {:a "aa" :c "cc"} ) :or "y"))
org.apache.spark.api.java.Optional cannot be cast to
com.google.common.base.Optional
Which spark version are you using?
Le mer. 22 mars 2017 à 14:38, larluo notifications@github.com a écrit :
i use the keg/join, but it lead to google Optional exception.
(into [] (keg/join (keg/rdd {:a [1 11] :b 2}) :or "x"(keg/rdd {:a "aa" :c "cc"} ) :or "y"))
org.apache.spark.api.java.Optional cannot be cast to com.google.common.base.Optional
— You are receiving this because you commented.
Reply to this email directly, view it on GitHub https://github.com/HCADatalab/powderkeg/issues/24#issuecomment-288400477, or mute the thread https://github.com/notifications/unsubscribe-auth/AAC3sR4sZD5uRIk-J7mxdLFTfjfJ7OX6ks5roSRbgaJpZM4MkzQw .
-- On Clojure http://clj-me.cgrand.net/ Clojure Programming http://clojurebook.com Training, Consulting & Contracting http://lambdanext.eu/
(defproject etl-spark "0.1.0-SNAPSHOT"
:description "FIXME: write description"
:url "http://example.com/FIXME"
:license {:name "Eclipse Public License"
:url "http://www.eclipse.org/legal/epl-v10.html"}
:profiles {:provided {:dependencies [[org.apache.spark/spark-core_2.11 "2.1.0"]
[org.apache.spark/spark-sql_2.11 "2.1.0"]
[org.apache.spark/spark-streaming_2.11 "2.1.0"]]}}
:aot :all
:dependencies [[org.clojure/clojure "1.8.0"]
[hcadatalab/powderkeg "0.5.0"]
[clj-time "0.12.2"]])
I checked the docs and indeed the switch in the Optional is a breaking change that hasn't been accounted for. I will fix next week if no one fixes it before.
Le mer. 22 mars 2017 à 14:43, larluo notifications@github.com a écrit :
(defproject etl-spark "0.1.0-SNAPSHOT" :description "FIXME: write description" :url "http://example.com/FIXME" :license {:name "Eclipse Public License" :url "http://www.eclipse.org/legal/epl-v10.html"} :profiles {:provided {:dependencies [[org.apache.spark/spark-core_2.11 "2.1.0"] [org.apache.spark/spark-sql_2.11 "2.1.0"] [org.apache.spark/spark-streaming_2.11 "2.1.0"]]}} :aot :all :dependencies [[org.clojure/clojure "1.8.0"] [hcadatalab/powderkeg "0.5.0"] [clj-time "0.12.2"]])
— You are receiving this because you commented.
Reply to this email directly, view it on GitHub https://github.com/HCADatalab/powderkeg/issues/24#issuecomment-288401811, or mute the thread https://github.com/notifications/unsubscribe-auth/AAC3sdJG1pdi2oqskHbo-ulMvP8emioLks5roSV8gaJpZM4MkzQw .
-- On Clojure http://clj-me.cgrand.net/ Clojure Programming http://clojurebook.com Training, Consulting & Contracting http://lambdanext.eu/
i try to join two pair rdd together. it failed!
it seems the following rdd just become org.apache.spark.api.java.JavaRDD with Tuple2. it need to be wrapped with JavaPairRDD/fromJavaRDD method to be realized.