MattTriano / analytics_data_where_house

An analytics engineering sandbox focusing on real estate prices in Cook County, IL
https://docs.analytics-data-where-house.dev/
GNU Affero General Public License v3.0

Refactor dbt model names to match schema names #86

Closed: MattTriano closed this issue 1 year ago

MattTriano commented 1 year ago

This will involve changing staging to data_raw and intermediate to clean throughout task functions, documentation, and dbt model sql files.

MattTriano commented 1 year ago

Started refactoring. I've split intermediate (which contained both the standardized and clean stages) into separate standardized and clean model groups, and I've also added a standardized schema to the data warehouse. I've attached the checklist I used to make sure all references were handled.

I still have to refactor the staging model group to data_raw.

dbt_intermediate_modelgroup_refactor.md

MattTriano commented 1 year ago

Refactored staging to data_raw after work, and attached the checklist I produced via

grep -r -n "staging" --include=*.{py,sql,yml,yaml,md} --exclude-dir={.venv,dev_utils,notes} . | sort > dbt_staging_modelgroup_refactor.md
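
For anyone who wants the same kind of checklist without relying on shell brace expansion, here is a minimal Python sketch of the search above. The function names `find_references` and `write_checklist` are hypothetical (they are not part of this repo), and the markdown checkbox layout is an assumption, since the attached checklist's actual format isn't shown in this thread.

```python
import os
from pathlib import Path

# Defaults mirror the grep flags above; adjust them for another project.
EXTENSIONS = {".py", ".sql", ".yml", ".yaml", ".md"}
EXCLUDED_DIRS = {".venv", "dev_utils", "notes"}


def find_references(root, term, extensions=EXTENSIONS, excluded_dirs=EXCLUDED_DIRS):
    """Return sorted 'path:line_number:line' strings for every line containing term."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in excluded_dirs]
        for name in filenames:
            path = Path(dirpath) / name
            if path.suffix not in extensions:
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            for line_number, line in enumerate(text.splitlines(), start=1):
                if term in line:
                    rel = path.relative_to(root).as_posix()
                    hits.append(f"{rel}:{line_number}:{line.strip()}")
    return sorted(hits)


def write_checklist(root, term, out_path):
    """Write one markdown checkbox per match (assumed layout) and return the count."""
    lines = [f"- [ ] {hit}" for hit in find_references(root, term)]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

Calling `write_checklist(".", "staging", "dbt_staging_modelgroup_refactor.md")` from the repo root would produce one entry per matching line. Unlike grep, this does a plain substring match rather than a regex match, which is equivalent for a literal term like "staging".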

I also checked through the expectation suites and checkpoints in /great_expectations/. They're .json files, which I left out of the grep because dbt's manifest.json file must be a single line, and it references many macros containing the string 'staging', which blows up the grep results. The files in /great_expectations/ already used the database schema names rather than dbt model names, so no modifications were needed.

dbt_staging_modelgroup_refactor.md
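
One way to include .json files in a check like this without hitting the single-line manifest.json problem described above is to count occurrences per file instead of printing matching lines. This is only a sketch under assumptions: `count_term_in_json` is a hypothetical helper, and skipping files named manifest.json by default is an illustrative choice, not something this repo does.

```python
from pathlib import Path


def count_term_in_json(root, term, skip_names=("manifest.json",)):
    """Map each .json file under root (as a relative path) to its count of term."""
    counts = {}
    for path in sorted(Path(root).rglob("*.json")):
        # Skip generated artifacts whose huge single line would swamp the report.
        if path.name in skip_names:
            continue
        n = path.read_text(encoding="utf-8", errors="replace").count(term)
        if n:
            counts[path.relative_to(root).as_posix()] = n
    return counts
```

For example, `count_term_in_json("great_expectations", "staging")` would return an empty dict if no expectation suite or checkpoint mentions the old model group name. Reporting counts rather than lines keeps the output short even when a file has no line breaks.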