Mechanism for creating and populating tables

There are multiple examples where we want to supply supplemental data to be joined in with data used for analysis:

Mapping data to decode status codes received from services. For example, Pardot visitor_activities has type and type_name, which we decode and then map to event_action.
Creating calendar tables. There isn't a standard way to do this in Redshift and, after much investigation, the best approach really is to join against a table that contains all relevant dates.
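
As a rough illustration of the calendar-table case, the rows could be generated in code before being loaded. This is only a sketch: the function name `calendar_rows` and the column names (`date_day`, `year`, `month`, `day_of_week`) are made up for this example and are not part of any existing model.

```python
from datetime import date, timedelta


def calendar_rows(start, end):
    """Yield one dict per day from start to end, inclusive (hypothetical columns)."""
    current = start
    while current <= end:
        yield {
            "date_day": current.isoformat(),
            "year": current.year,
            "month": current.month,
            "day_of_week": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        }
        current += timedelta(days=1)


# Example: one week of rows, ready to be inserted into a calendar table.
rows = list(calendar_rows(date(2016, 1, 1), date(2016, 1, 7)))
```

Analysis queries could then join against the loaded table on the date column instead of trying to generate dates inside the warehouse.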

To deal with this, we need a procedural way to build and tear down datasets that is fully integrated into the way we deploy models. This should likely look like a Python script that is integrated into runner.py and runs multiple Python files, each responsible for building or tearing down a particular table. Some tables will be best built via code and some will be best loaded from a CSV, so it should be flexible enough to handle multiple methods of data population.
