Closed ericbusboom closed 9 years ago
A. The schema is the partitions table and columns. For partition p, p.table and p.table.columns. The table schema is available after running the meta phase.
A. It’s referenced in the documentation for Multicorn. http://multicorn.org/implementing-an-fdw/
A. After building a bundle,
partitions = list(b.partitions)
p0 = partitions[0]
datafile = p0.datafile
system_path = datafile.syspath
p0.datafile will only work if the partition is fetched from the bundle, and specifically, if bundle.wrap_partition() is called on the partition.
A. The typical case will be user creates a VIEW in the warehouse that uses table data, and rather than installing the data in the warehouse as was done pre 1.0, in this case, the view uses the partition data from a FDW. A lot of the time, we’ll create a MATERIALIZED VIEW that references the foreign table, and users will use the materialized view, to allow faster access, and ( I hope ) indexes.
A. Delete everything under ambry/warehouse and put it as a module there.
Q. How can FDW recognize compressed file? Should it read compressed or raw or both? A. For now, just assume they are compressed
Both implementations moved to https://github.com/CivicKnowledge/ambry_sources . See CivicKnowledge/ambry_sources#1 for further progress.
Write a Foreign Data Wrapper for that can read and query PartitionMsgpackDataFile format files. The goal is to be able to attach these files to a warehouse without actually loading them in. It should be possible to write queries with them, and combine them together into views.
There should also be code that will generate the FDW creation DDL (CREATE FOREIGN TABLE), including the proper schema for the table for the specific data file.
Also implement the Qual object to do efficient filtering on the returned values.
Pay particular attention to the lifesystem of the FDW, and the creation of server processes associated with them. The system must be able to handles hundred to thousands of wrappers simultaneously.
Make sure it works with Sqalchemy: http://multicorn.org/sqlalchemy_fdw/. Use the sqalchemy FDW support for creating the FDW in the warehouse.
Create the FDW servers in a schema in the library database, parallel to the ambrylib schema, in a schema named 'partitions'
Similarly, there virtual tables can be developed for Sqlite, using http://apidoc.apsw.googlecode.com/hg/vtable.html