COOL-cohort / COOL

the source code of the COOL system
https://www.comp.nus.edu.sg/~dbsystem/cool/
Apache License 2.0

Add extension module with Parquet loading support #2

Closed hugy718 closed 2 years ago

hugy718 commented 2 years ago

This PR makes the original code the core module of COOL and adds two modules: cool-extensions, for supplementary functionality (e.g. the new Parquet loader in this PR), and cool-examples, for sample code and tutorials to be described on the COOL site. Below is a detailed list of changes.

  1. Refactor LocalLoader into DataLoader so that other formats are easier to support; format-specific behavior is governed by the new DataLoaderConfig class. A new extension only needs to extend DataLoaderConfig to provide the appropriate TupleReader and TupleParser.
  2. Add a new TupleParser implementation that accepts a tuple as a JSON string.
  3. Add Parquet loading support through ParquetDataLoaderConfig, together with a new TupleReader implementation that emits tuples as JSON strings from Parquet files.
  4. Move the old data loading example, whose logic lived in the LocalLoader main method, to cool-examples. Once built, the executable jar behaves the same as the LocalLoader main method did before. (This keeps the core module clean.)
  5. Add an example that demonstrates the usage of the new Parquet loader.
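
For reviewers, the extension pattern in item 1 can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the method names (`createTupleReader`, `createTupleParser`, `hasNext`, `next`, `parse`), the `CsvDataLoaderConfig` and `LineTupleReader` classes, and the use of an in-memory string in place of an input file are all assumptions made for the example, and the real COOL signatures may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for the types named in this PR.
interface TupleReader extends AutoCloseable {
    boolean hasNext();
    String next();
    @Override
    void close();
}

interface TupleParser {
    String[] parse(String tuple);
}

// A config only decides which reader/parser pair handles a given format.
abstract class DataLoaderConfig {
    abstract TupleReader createTupleReader(String source);
    abstract TupleParser createTupleParser();
}

// Illustrative format: one comma-separated tuple per line of the source string.
class LineTupleReader implements TupleReader {
    private final Iterator<String> lines;

    LineTupleReader(String source) {
        this.lines = Arrays.asList(source.split("\n")).iterator();
    }

    public boolean hasNext() { return lines.hasNext(); }
    public String next() { return lines.next(); }
    public void close() { /* nothing to release for an in-memory source */ }
}

class CsvDataLoaderConfig extends DataLoaderConfig {
    TupleReader createTupleReader(String source) {
        return new LineTupleReader(source);
    }

    TupleParser createTupleParser() {
        // Keep trailing empty fields so every tuple has the same arity.
        return tuple -> tuple.split(",", -1);
    }
}

// The loader is format-agnostic: it only talks to the config.
class DataLoaderSketch {
    static List<String[]> load(DataLoaderConfig config, String source) {
        TupleParser parser = config.createTupleParser();
        List<String[]> rows = new ArrayList<>();
        try (TupleReader reader = config.createTupleReader(source)) {
            while (reader.hasNext()) {
                rows.add(parser.parse(reader.next()));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String[]> rows = load(new CsvDataLoaderConfig(), "u1,login\nu2,logout");
        System.out.println(rows.size() + " tuples, first user = " + rows.get(0)[0]);
    }
}
```

Under the same assumptions, a Parquet extension would follow the same shape: its config would return a reader that emits each row as a JSON string, paired with the JSON TupleParser from item 2, and the loader itself would not change.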

KimballCai commented 2 years ago

  1. Refactor LocalLoader and the Parquet loader (see the COOL site for more details and examples).
  2. Update the code organization.