COOL-cohort / COOL

the source code of the COOL system
https://www.comp.nus.edu.sg/~dbsystem/cool/
Apache License 2.0

Cohort Exploration #22

Closed KimballCai closed 2 years ago

KimballCai commented 2 years ago

Summary: This PR adds basic support for cohort exploration by displaying the records of a cohort created from a previous cohort selection.

Description: Cohort exploration lets users inspect the records of a cohort of users, either visually or by exporting them to portable formats for further analysis. Currently, a writer displays all records in the terminal; it can easily be extended to save them as a COOL cube or in other data formats (to be supported later).
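To illustrate how a terminal writer could later be extended to other output formats, here is a minimal sketch. The names `RecordWriter` and `DelimitedWriter` are hypothetical and are not taken from this PR; the sketch only assumes that each record can be rendered as a list of string fields.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a pluggable record writer; not COOL's actual API. */
final class CohortWriterSketch {

    /** A sink that receives one cohort record at a time. */
    interface RecordWriter {
        void write(List<String> record);
    }

    /** Writes each record as one delimited line (no quoting or escaping). */
    static final class DelimitedWriter implements RecordWriter {
        private final Appendable out;
        private final String delimiter;

        DelimitedWriter(Appendable out, String delimiter) {
            this.out = out;
            this.delimiter = delimiter;
        }

        @Override
        public void write(List<String> record) {
            try {
                out.append(String.join(delimiter, record)).append('\n');
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> cohort = Arrays.asList(
                Arrays.asList("u1", "2020-01-01", "login"),
                Arrays.asList("u1", "2020-01-02", "purchase"));

        // Terminal output: System.out is an Appendable.
        RecordWriter terminal = new DelimitedWriter(System.out, "\t");
        for (List<String> record : cohort) {
            terminal.write(record);
        }
    }
}
```

Under this assumption, switching the target from the terminal to a file or an in-memory buffer only means passing a different `Appendable`; a cube writer would be another `RecordWriter` implementation.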

Other changes:

Add KeyFieldIterator class: this abstracts the logic of iterating over a UserKey or AppKey field and exposing the start offset and end offset in a cublet. The same logic is used in places other than cohort exploration, and we can replace those with this abstraction later. It also gives us a single place to modify if the UserKey or AppKey field is later supported by encoding methods other than RLE.

Add CoolTupleReader class: the logic of reconstructing all records of a selected cohort of users is contained within this single class, which also gives us an easy way to iterate over a cube for other purposes. We may also add filtering support in the future so that users can fine-tune the iterator's behavior.
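The idea behind the two classes can be sketched as follows. This is an illustration only, not the code in this PR: `KeyRunIterator`, `readCohort`, and the array-based layout are hypothetical stand-ins, and the sketch assumes that the key field is stored as RLE runs (key, run length) and that the end offset is exclusive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; names and layout are assumptions, not COOL's API. */
final class CohortReadSketch {

    /** Walks an RLE-encoded key column: keys[i] repeats lengths[i] times. */
    static final class KeyRunIterator {
        private final String[] keys;
        private final int[] lengths;
        private int run = -1;
        private int start = 0;
        private int end = 0;

        KeyRunIterator(String[] keys, int[] lengths) {
            if (keys.length != lengths.length) {
                throw new IllegalArgumentException("keys and lengths must align");
            }
            this.keys = keys;
            this.lengths = lengths;
        }

        /** Advances to the next run; returns false when no runs remain. */
        boolean next() {
            if (run + 1 >= keys.length) {
                return false;
            }
            run++;
            start = end;
            end = start + lengths[run];
            return true;
        }

        String key() { return keys[run]; }
        int startOffset() { return start; }   // first row of the run
        int endOffset() { return end; }       // one past the last row
    }

    /** Rebuilds "key,action" rows for users in the cohort, skipping others. */
    static List<String> readCohort(String[] keys, int[] lengths,
                                   String[] actions, Set<String> cohort) {
        List<String> rows = new ArrayList<>();
        KeyRunIterator it = new KeyRunIterator(keys, lengths);
        while (it.next()) {
            if (!cohort.contains(it.key())) {
                continue; // the whole run belongs to a user outside the cohort
            }
            for (int row = it.startOffset(); row < it.endOffset(); row++) {
                rows.add(it.key() + "," + actions[row]);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] keys = {"alice", "bob", "carol"};
        int[] lengths = {2, 1, 3};
        String[] actions = {"login", "buy", "login", "login", "view", "buy"};
        Set<String> cohort = new HashSet<>(Arrays.asList("alice", "carol"));

        for (String row : readCohort(keys, lengths, actions, cohort)) {
            System.out.println(row);
        }
    }
}
```

Because the reader only sees keys and offsets through the iterator, a different key encoding would, under these assumptions, require changes to the iterator alone.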