Closed erikamov closed 3 weeks ago
Warehouse report 📦
Checks indicate the following action items may be necessary.
calitp_warehouse.mart.ntd.dim_annual_service_agencies
calitp_warehouse.mart.ntd.dim_annual_service_mode_time_periods
calitp_warehouse.mart.ntd.fct_annual_service_modes
Legend (in order of precedence)
Resource type | Indicator | Resolution |
---|---|---|
Large table-materialized model | Orange | Make the model incremental |
Large model without partitioning or clustering | Orange | Add partitioning and/or clustering |
View with more than one child | Yellow | Materialize as a table or incremental |
Incremental | Light green | |
Table | Green | |
View | White | |
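The "in order of precedence" rule in the legend can be read as a first-match-wins lookup. The following is only a sketch of that reading, not the report's actual implementation: the `Model` fields, the `is_large` flag (the legend does not say what counts as "large"), and the `indicator` function are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Model:
    # All fields are hypothetical; the real report's inputs are not shown in this PR.
    materialization: str                    # "table", "view", or "incremental"
    is_large: bool = False                  # assumption: size threshold is decided elsewhere
    has_partition_or_cluster: bool = False
    child_count: int = 0


def indicator(m: Model) -> tuple[str, str]:
    """Return (indicator, resolution) for the first legend row that matches."""
    if m.is_large and m.materialization == "table":
        return "Orange", "Make the model incremental"
    if m.is_large and not m.has_partition_or_cluster:
        return "Orange", "Add partitioning and/or clustering"
    if m.materialization == "view" and m.child_count > 1:
        return "Yellow", "Materialize as a table or incremental"
    if m.materialization == "incremental":
        return "Light green", ""
    if m.materialization == "table":
        return "Green", ""
    # Anything left over (plain views, in this sketch) falls through to White.
    return "White", ""
```

Because the checks run top to bottom, a large table-materialized model is flagged orange before the plain "Table" row can mark it green.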
jovyan@jupyter-tiffanychu90 ~/data-infra/warehouse (3396-ntd-agency) $ poetry run dbt run -s +"models/mart/ntd/dim_annual_service_agencies.sql"
22:03:36 Running with dbt=1.5.1
22:03:39 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.calitp_warehouse.mart.ad_hoc
22:03:40 Found 422 models, 973 tests, 0 snapshots, 0 analyses, 852 macros, 0 operations, 12 seed files, 175 sources, 4 exposures, 0 metrics, 0 groups
22:03:40
22:03:44 Concurrency: 8 threads (target='dev')
22:03:44
22:03:44 1 of 2 START sql view model tiffany_staging.stg_ntd_annual_data__2022__service_by_agency [RUN]
22:03:45 1 of 2 OK created sql view model tiffany_staging.stg_ntd_annual_data__2022__service_by_agency [CREATE VIEW (0 processed) in 1.07s]
22:03:45 2 of 2 START sql table model tiffany_mart_ntd.dim_annual_service_agencies ...... [RUN]
22:03:48 2 of 2 OK created sql table model tiffany_mart_ntd.dim_annual_service_agencies . [CREATE TABLE (2.2k rows, 7.9 MiB processed) in 3.61s]
22:03:48
22:03:48 Finished running 1 view model, 1 table model in 0 hours 0 minutes and 8.41 seconds (8.41s).
22:03:49
22:03:49 Completed successfully
jovyan@jupyter-tiffanychu90 ~/data-infra/warehouse (3396-ntd-agency) $ poetry run dbt run -s +models/mart/ntd/fct_annual_service_modes.sql
22:05:34 Running with dbt=1.5.1
22:05:37 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.calitp_warehouse.mart.ad_hoc
22:05:38 Found 422 models, 973 tests, 0 snapshots, 0 analyses, 852 macros, 0 operations, 12 seed files, 175 sources, 4 exposures, 0 metrics, 0 groups
22:05:38
22:05:42 Concurrency: 8 threads (target='dev')
22:05:42
22:05:42 1 of 2 START sql view model tiffany_staging.stg_ntd_annual_data__2022__service_by_mode [RUN]
22:05:43 1 of 2 OK created sql view model tiffany_staging.stg_ntd_annual_data__2022__service_by_mode [CREATE VIEW (0 processed) in 1.01s]
22:05:43 2 of 2 START sql table model tiffany_mart_ntd.fct_annual_service_modes ......... [RUN]
22:05:46 2 of 2 OK created sql table model tiffany_mart_ntd.fct_annual_service_modes .... [CREATE TABLE (3.7k rows, 16.2 MiB processed) in 3.59s]
22:05:46
22:05:46 Finished running 1 view model, 1 table model in 0 hours 0 minutes and 8.36 seconds (8.36s).
22:05:47
22:05:47 Completed successfully
22:05:47
- Slack thread
I'll take a TODO to write up a GH issue for the dbt docs + macros, and start moving columns that are broadly present across many NTD tables into those macros
- This PR should remain the mart NTD table creation, and a follow-up issue + PR can address docs
One remaining question: why do the columns read in from stg_ntd_annual_data__2022__service_by_agency have the max_ or sum_ prefix?
- created here with a rank...so I think the prefixes get added automatically?
- I think that's fine, and I'll relegate the renaming to mart_ntd models because those can benefit from the docs macros the most. ex: GTFS trips.txt macros used in _mart_gtfs_fcts.yml
The column names with SUM and MAX actually came from the original NTD API data. I just kept them as they came to us, but I agree with you that it is better without them. Thank you for all the changes and information. 🤩
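As a rough illustration of the renaming discussed above, here is a minimal sketch that builds SELECT-list entries dropping a leading sum_ or max_ prefix. It is not the code used in this PR: the `alias_columns` helper and the example column names are hypothetical, and in the actual mart models the aliases would be written in dbt SQL.

```python
import re

# Matches only a leading "sum_" or "max_" prefix.
PREFIX = re.compile(r"^(sum|max)_")


def alias_columns(columns):
    """Build SELECT-list entries that alias away a leading sum_/max_ prefix."""
    select_items = []
    for col in columns:
        clean = PREFIX.sub("", col)
        # Columns without a prefix are passed through unchanged.
        select_items.append(col if clean == col else f"{col} AS {clean}")
    return select_items


# Hypothetical column names, for illustration only.
print(",\n".join(alias_columns(["sum_example_metric", "max_example_value", "example_key"])))
```

One caveat with this approach: if a table carried both a sum_ and a max_ version of the same base name, the aliases would collide, so such columns would need distinct names chosen by hand.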
Description
This PR replaces the existing model dim_annual_ntd_agency_service as requested on ticket #3396 to use the new 2022 external NTD Annual Data.
Following these details sent by @tiffanychu90:
I created these new models, with documentation and basic tests:
dim_annual_service_agencies
dim_annual_service_mode_time_periods
fct_annual_service_modes
Type of change
How has this been tested?
The new models and documentation were tested locally and created on staging.
Post-merge follow-ups
Confirm the correct creation of the models on cal-itp-data-infra.