HCADatalab / powderkeg

Live-coding the cluster!
Eclipse Public License 1.0

powderkeg REPL hangs when run in yarn-client mode #22

Open clojurians-org opened 7 years ago

clojurians-org commented 7 years ago

I ran the following command. The job completes successfully, but the process then hangs instead of exiting; I have to add a call to System/exit to force it to quit.

spark-submit --master yarn-client --num-executors 4 --class powderkeg.repl target/etl-spark-0.1.0-SNAPSHOT-standalone.jar bolome.agg.d_bolome_user_order_trgx/-main

(defn -main []
  (as-> (keg/rdd (.textFile keg/*sc* "hdfs://192.168.1.3:9000/user/hive/warehouse/agg.db/d_bolome_user_order")
                 #_(.textFile keg/*sc* "/home/spiderdt/work/git/larluo/user/hive/warehouse/agg.db/d_bolome_user_order_test")
                 (map #(clojure.string/split % #"\001" 2))
                 (map (fn [[user-id user-tkvs]]
                        (->> [user-id
                              (->> user-tkvs
                                   (realize-trgx (latest-schema))
                                   (derive-exprs (latest-exprs))
                                   pr-str)]
                             into-array
                             RowFactory/create))))
      $
    (.createDataFrame (->> keg/*sc* .sc (new SparkSession)) $
                      (DataTypes/createStructType (map #(DataTypes/createStructField % DataTypes/StringType false) ["user-id" "user-trgx"])))
    (.write $)
    (.format $ "parquet")
    (.mode $ SaveMode/Overwrite)
    (.save $ "hdfs://192.168.1.3:9000/user/hive/warehouse/agg.db/d_bolome_user_order_trgx"))

  ;; Workaround: without this call the process hangs after the job finishes.
  (System/exit 0))