Add parquet support to PigPen. See https://github.com/apache/incubator-parquet-mr
Thanks @mping for the idea and a lot of the initial work on this one.
Usage:
Currently this requires a schema to be passed to both the load and store commands - hopefully this can be removed in the future, but for now the Pig Parquet loader requires it.
This also changes how local-mode is done for load and store commands. The new version uses `PigPenLocalLoader` and `PigPenLocalStorage` protocols in `pigpen-core/src/main/clojure/pigpen/local.clj` to implement local versions of load/store commands.

I added a bunch of code to assist in running hadoop and pig classes directly. Hopefully these should be useful when adding new formats.
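
For anyone adding a new format, here is a rough sketch of the protocol-based shape that local mode now takes. It is illustrative only: `LocalLoaderSketch`, its three methods, `TsvLoader`, and `load-all` are hypothetical names made up for this example, and the actual method names and arities of `PigPenLocalLoader` in `pigpen/local.clj` may well differ.

```clojure
(ns example.local-sketch
  (:require [clojure.java.io :as io]
            [clojure.string :as str]))

;; Hypothetical stand-in for a local loader protocol. The real
;; PigPenLocalLoader in pigpen.local may declare different methods.
(defprotocol LocalLoaderSketch
  (open-reader [this] "Open the underlying resource and return a reader.")
  (read-records [this reader] "Return a seq of records read from the reader.")
  (close-reader [this reader] "Release the reader."))

;; Toy format: one record per line, fields separated by tabs.
(defrecord TsvLoader [path]
  LocalLoaderSketch
  (open-reader [_] (io/reader path))
  (read-records [_ reader]
    (map #(str/split % #"\t") (line-seq reader)))
  (close-reader [_ reader]
    (.close ^java.io.Closeable reader)))

(defn load-all
  "Reads every record eagerly so the reader can be closed before returning."
  [loader]
  (let [reader (open-reader loader)]
    (try
      (doall (read-records loader reader))
      (finally
        (close-reader loader reader)))))
```

With something like this, a local load command only needs to call `(load-all (->TsvLoader "input.tsv"))`, and each new format supplies its own record type instead of touching the command itself.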
cc @daveray