Closed by ljank 9 years ago
This just means that you can't close over a compiled function. For example:
(require '[simple-time.core :as st])
;; works
(defn format-ts [data]
  (pig/map (fn [x] (st/format x :date)) data))
(format-ts my-data)
;; works
(defn format-ts [data format]
  (pig/map (fn [x] (st/format x format)) data))
(format-ts my-data :date)
;; won't work because f is compiled
(defn format-ts [data f]
  (pig/map f data))
(format-ts my-data (fn [x] (st/format x :date)))
There is a way around this, but it's not officially supported yet:
(defn format-ts [data f]
  (pigpen.map/map* f data))
(format-ts my-data (pigpen.code/trap (fn [x] (st/format x :date))))
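To illustrate why a compiled function can't simply be handed over, here is a rough analogy in plain Clojure. It is not PigPen's implementation; it only assumes that code has to travel as source text, and the names source-form, compiled-fn, rebuilt and printed are made up for the sketch. A quoted form survives a print/read round trip, while a compiled function object does not:

```clojure
;; Plain Clojure only; an analogy, not PigPen's actual mechanism.
(def source-form '(fn [x] (inc x)))   ; code kept as data
(def compiled-fn (fn [x] (inc x)))    ; already-compiled function object

;; A source form can be printed, read back, and evaluated again.
(def rebuilt (eval (read-string (pr-str source-form))))
(rebuilt 41) ;; => 42

;; A compiled function prints as an opaque object tag that the reader rejects.
(def printed (pr-str compiled-fn))
(try
  (read-string printed)
  (catch Exception e
    (println "cannot read back:" (.getMessage e))))
```

This is roughly why the inline (fn ...) passed to pig/map works while a function value passed in as an argument does not.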
Let me know if that's not clear or if you have a specific example of what you're trying to do.
Also, check out pigpen-support@googlegroups.com or https://groups.google.com/forum/#!forum/pigpen-support for future questions.
I still get CompilerException java.lang.RuntimeException: No such namespace: st
in cases that are meant to work :\
Could you send a code sample and stack trace that you get?
Might be worth mentioning: any code that you close over needs to be in a file that will end up in the uberjar that goes to Hadoop. If you're just in a user ns in a REPL, the code I listed won't work.
If that's the case, let me know if that's not clear from the docs & I can update them.
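The "No such namespace: st" message can be reproduced in plain Clojure without PigPen. This is a sketch that assumes the error comes from evaluating the captured form somewhere the st alias was never set up; the namespace scratch.no-alias is hypothetical, and clojure.string stands in for simple-time.core:

```clojure
(require '[clojure.string :as st])   ; the alias `st` exists only in the current ns

(def form '(st/upper-case "abc"))    ; unresolved symbols, kept as data

;; Evaluated where the alias is known, the form works.
(def here (eval form))               ;; => "ABC"

;; Evaluated in a namespace that never required the alias, it fails to compile.
(def elsewhere
  (binding [*ns* (create-ns 'scratch.no-alias)]
    (try
      (eval form)
      (catch Throwable t
        ;; The compiler exception wraps the underlying RuntimeException.
        (.getMessage (or (.getCause t) t))))))

(println elsewhere)
```

Code that lives in a file has an ns form that can be loaded again on the other side, which is presumably what restores the alias there.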
I've noticed that it behaves differently with pig/return than when loading data from files (same for JSON and Avro). This works just fine:
(require '[simple-time.core :as st])
(defn time->ymd
  [data]
  (pig/map (fn [entry]
             (assoc entry
                    :ymd (st/format (st/datetime (:time entry)) :date)))
           data))
(->> (pig/return [{:time 1425254400010} {:time 1425254400019} {:time 1425254400090}])
     (time->ymd)
     (pig/dump))
; [{:ymd "2015-03-02", :time 1425254400010}
; {:ymd "2015-03-02", :time 1425254400019}
; {:ymd "2015-03-02", :time 1425254400090}]
For JSON:
(spit "/tmp/events.json" "{\"time\": 1425254400010}\n{\"time\": 1425254400019}\n{\"time\": 1425254400090}")
(->> (pig/load-json "/tmp/events.json")
     (time->ymd)
     (pig/dump))
CompilerException java.lang.RuntimeException: No such namespace: st
Same error while using Avro.
Yeah, it sounds like you're in a user ns. This complete example works for me:
(ns pigpen-demo.core
  (:require [pigpen.core :as pig]
            [simple-time.core :as st]))
(defn time->ymd
  [data]
  (pig/map (fn [entry]
             (assoc entry
                    :ymd (st/format (st/datetime (:time entry)) :date)))
           data))
(clojure.pprint/pprint
  (->> (pig/return [{:time 1425254400010} {:time 1425254400019} {:time 1425254400090}])
       (time->ymd)
       (pig/dump)))
(spit "/tmp/events.json" "{\"time\": 1425254400010}\n{\"time\": 1425254400019}\n{\"time\": 1425254400090}")
(clojure.pprint/pprint
  (->> (pig/load-json "/tmp/events.json")
       (time->ymd)
       (pig/dump)))
and produces this output:
[{:ymd "2015-03-01", :time 1425254400010}
{:ymd "2015-03-01", :time 1425254400019}
{:ymd "2015-03-01", :time 1425254400090}]
Start reading from /tmp/events.json
Stop reading from /tmp/events.json
[{:ymd "2015-03-01", :time 1425254400010}
{:ymd "2015-03-01", :time 1425254400019}
{:ymd "2015-03-01", :time 1425254400090}]
nil
If you're in a file & still getting that exception, could you run these commands in the REPL and let me know what you get?
(pigpen.code/trap identity)
(ns-name *ns*)
(pigpen.code/ns-exists *1)
You're right, everything works fine when running from a file rather than from the user namespace. Thank you for the lightning-fast and correct diagnosis!
Next time I'll use the mailing list. Sorry!
As stated in the docs:
We have time in millis in our data and would like to format it as YYYY-MM-DD, but that's impossible due to the aforementioned reasons :( Is there any workaround to make functions work? Otherwise this statement looks far-fetched:
Thank you!
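For the millis to YYYY-MM-DD conversion itself, here is a minimal sketch using java.time interop instead of simple-time (a swapped-in technique, not what the thread uses). The name millis->ymd is hypothetical. To use it inside pig/map it would still need to live in a file namespace, as discussed above.

```clojure
(import '(java.time Instant ZoneId ZoneOffset))

(defn millis->ymd
  "Formats epoch milliseconds as YYYY-MM-DD in the given zone (UTC by default)."
  ([ms] (millis->ymd ms ZoneOffset/UTC))
  ([ms ^ZoneId zone]
   ;; LocalDate prints in ISO-8601 form, i.e. YYYY-MM-DD.
   (str (.toLocalDate (.atZone (Instant/ofEpochMilli ms) zone)))))

(millis->ymd 1425254400010)                                   ;; => "2015-03-02"
(millis->ymd 1425254400010 (ZoneId/of "America/Los_Angeles")) ;; => "2015-03-01"
```

The sample timestamp sits just after midnight UTC, so the date depends on the zone. That may be why the two outputs in the thread show 2015-03-02 and 2015-03-01, assuming the two machines had different default time zones.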