CCAO Data Infrastructure
This repository stores the code for the CCAO Data Department's ETL
pipelines and data lakehouse. This infrastructure supports the Data Team's
modeling, reporting, and data integrity work.
Quick Links
Repository Structure
- ./dbt contains the models and tests that build our Athena data lakehouse;
dbt mainly acts as a transformation and documentation layer on top of our raw data
- ./docs contains design documents and other supplemental documentation
- ./etl contains ETL scripts that load raw and lightly cleaned
data into the lakehouse as dbt sources
- ./socrata contains column transformations for the CCAO's
Open Data Portal assets