Netflix / PigPen

Map-Reduce for Clojure
Apache License 2.0

pigpen.core store functions don't quite work #173

Closed: punit-naik closed this issue 8 years ago

punit-naik commented 8 years ago

I have a relation called data, the dump of which looks like this:

user=> (pig/dump data)
(["2014-12-30" "240465" "18960"] ["2014-12-14" "179355" "14295"] ["2014-12-05" "310706" "4362"] ["2014-12-03" "293209" "14426"] ["2014-11-06" "134263" "10942"] ["2014-09-24" "293763" "18773"] ["2014-09-19" "76412" "6465"] ["2014-08-30" "57813" "18315"] ["2014-08-17" "140325" "15851"] ["2014-08-10" "99408" "10242"] ["2014-07-27" "283646" "7777"] ["2014-07-21" "282190" "6690"] ["2014-07-10" "280308" "6724"] ["2014-04-16" "177294" "5958"] ["2014-04-16" "9630" "9948"] ["2014-04-15" "249966" "7948"] ["2014-03-24" "174913" "7201"] ["2014-03-12" "257354" "4703"] ["2014-01-30" "250252" "45902"] ["2014-01-23" "226169" "7084"] ["2013-12-19" "166220" "5963"] ["2013-12-07" "143208" "6300"] ["2013-11-28" "146795" "11381"] ["2013-08-29" "163403" "17608"] ["2013-08-25" "220460" "5798"] ["2013-07-30" "208614" "2776"] ["2013-07-27" "203026" "5627"] ["2013-07-03" "195602" "19356"])

But when I call (pig/store-tsv "test.tsv" data), it executes but does not create any file in the working directory. The same goes for pig/store-string.

What am I doing wrong?

mbossenbroek commented 8 years ago

This one is a little counter-intuitive. When you call the store commands, pigpen doesn't immediately write the output to a file. The reason for this is that you might want to store many things and have them all run at once.

What's returned by the store commands is a query plan, just like any other operator. You can call pig/dump on the result to execute it. It will both return the results and write them to a file.

We have considered adding a pig/run to force execution (and be explicit that we were looking for store commands to write), but it's been lower priority since pig/dump does the same thing.
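A minimal sketch of that workflow, assuming pigpen.core is required as pig. The small data relation built with pig/return is a hypothetical stand-in for the one in the question, and the exact output location in local mode may vary by PigPen version:

```clojure
(require '[pigpen.core :as pig])

;; Hypothetical stand-in for the `data` relation from the question.
(def data
  (pig/return [["2014-12-30" "240465" "18960"]
               ["2014-12-14" "179355" "14295"]]))

;; Calling store-tsv only builds a query plan; nothing is written yet.
(def store-cmd (pig/store-tsv "test.tsv" data))

;; Running the plan locally with dump executes the store command,
;; which writes the output and returns the results.
(pig/dump store-cmd)
```

Binding the store command to a name first makes it clear that the write only happens once the plan is passed to pig/dump.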

-Matt

punit-naik commented 8 years ago

Thanks a lot @mbossenbroek for clearing my doubts.