cmu-db / peloton

The Self-Driving Database Management System
http://pelotondb.io
Apache License 2.0

Schema - Layout decoupling #1327

Closed: poojanilangekar closed this 6 years ago

poojanilangekar commented 6 years ago

This PR decouples the logical schema from the physical layout of the data. Previously, each TileGroup contained a vector of schemas, which violates the notion of the storage layer being a purely physical store. There were also cases where the LayoutTuner modified the column_map, which in turn modified the schemas vector via the DataTable. In several other places, the Schema was used to get the layout of a TileGroup when all we needed was the physical mapping from a column_id to a <tile_id, column_offset> pair. With this PR, schema changes such as renaming a column or adding and dropping columns can be made without any calls to the storage layer (i.e., they can now be done entirely in the catalog).

Changes:

Testing:

Disclaimer:

@mengranwo Can you please review the catalog changes?

@pmenon Can you please review the changes to the LLVM engine? Mainly RuntimeFunctions, Inserter, Updater, TestingCodegenUtil and TableScanTranslatorTests.

@pervazea Can you please review the overall changes? I have modified a few APIs in the old engine. The current code doesn't break the existing tests and seems to function as expected, but please let me know if you think there is anything I need to change.

@jarulraj I know you may not have the bandwidth for this, but if possible, could you please review the changes to LayoutTuner and LayoutTunerTest? Nobody else has modified that code, and it would be great if you could take a look.

coveralls commented 6 years ago

Coverage increased (+0.1%) to 77.551% when pulling 38a561484d4d0dbcc150f33cf7db5981164d3386 on poojanilangekar:master into d68ab719fdd9499f23ac5887f907ed07ce4d2644 on cmu-db:master.

mengranwo commented 6 years ago

@poojanilangekar Hi, this PR looks good to me, apart from a few minor changes and some questions that I posted in the review comments. I mainly focused on the catalog logic and its corresponding test cases. I also took a look at layout.h/layout.cpp, and it looks fine.

And thank you for fixing the typo in DEFAULT_SCHEMA_NAME!