uwescience / myria

Myria is a scalable Analytics-as-a-Service platform based on relational algebra.
myria.cs.washington.edu

Write relations to HDFS #729

Open bmyerz opened 9 years ago

bmyerz commented 9 years ago

Motivation: I want a way to get large tables out of Myria.

Depends on moving serialization out to the workers: #705

domoritz commented 9 years ago

Also, Parquet (#482) would be nice for fast reads and writes in a columnar format.

bmyerz commented 9 years ago

Yes, Parquet is a more appropriate storage backend since it remains table-like. The main use case that I want to support with HDFS is bulk transfer or sharing of large datasets. If Parquet satisfies that and has an easy client to interface with, then it fits the bill.

senderista commented 9 years ago

See also Postgres support for ORCFile: https://github.com/gokhankici/orc_fdw