cal-itp / data-infra

Cal-ITP data infrastructure
https://docs.calitp.org/data-infra
GNU Affero General Public License v3.0

Bring new NTD endpoint sources into the warehouse as staging #3467

Closed charlie-costanzo closed 2 months ago

charlie-costanzo commented 2 months ago

Description

This PR brings in the new external NTD tables created in #3465 as staging tables in the warehouse. The staging models apply basic cleaning and typecasting as needed for downstream analytical use.

Because each extract has to pull the whole files, these staging tables filter for the most recent extract_ts when they are created.
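
For reviewers who want to see the latest-extract filter in isolation, here is a minimal sketch of the pattern, run against an in-memory SQLite database so it can be tried without warehouse access. The table name `external_ntd_example`, its columns `agency` and `upt`, and the sample rows are all hypothetical; only `extract_ts` comes from this PR. The actual dbt models and the warehouse's SQL dialect may differ.

```python
import sqlite3

# Hypothetical stand-in for an external NTD table. Every extract re-pulls the
# whole file, so the same logical rows appear once per extract_ts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE external_ntd_example (agency TEXT, upt TEXT, extract_ts TEXT)"
)
conn.executemany(
    "INSERT INTO external_ntd_example VALUES (?, ?, ?)",
    [
        ("Agency A", "100", "2024-07-01T00:00:00"),
        ("Agency B", "200", "2024-07-01T00:00:00"),
        ("Agency A", "110", "2024-08-01T00:00:00"),
        ("Agency B", "210", "2024-08-01T00:00:00"),
    ],
)

# Staging-style query: keep only rows from the most recent extract and cast
# the string-typed measure to an integer (an example of basic typecasting).
LATEST_EXTRACT_SQL = """
SELECT agency, CAST(upt AS INTEGER) AS upt, extract_ts
FROM external_ntd_example
WHERE extract_ts = (SELECT MAX(extract_ts) FROM external_ntd_example)
ORDER BY agency
"""

latest_rows = conn.execute(LATEST_EXTRACT_SQL).fetchall()
print(latest_rows)
```

Comparing `extract_ts` to `MAX(extract_ts)` works here because the sample timestamps are ISO 8601 strings, which sort chronologically; a native timestamp column would behave the same way.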

This work also includes the required source and staging .yml files as part of the dbt project, and early documentation to be built upon iteratively from here.

This PR seeks to satisfy #3404, part of Epic #3401, building upon recent work found in #3415 and #3465.

Resolves #3404

Type of change

How has this been tested?

Tested locally with dbt.

Post-merge follow-ups

github-actions[bot] commented 2 months ago

Warehouse report 📦

Checks/potential follow-ups

Checks indicate the following action items may be necessary.

New models 🌱

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__breakdowns

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__breakdowns_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__capital_expenses_by_capital_use

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__capital_expenses_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__capital_expenses_for_existing_service

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__capital_expenses_for_expansion_of_service

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__employees_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__employees_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__employees_by_mode_and_employee_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__fuel_and_energy

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__fuel_and_energy_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__funding_sources_by_expense_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__funding_sources_directly_generated

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__funding_sources_federal

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__funding_sources_local

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__funding_sources_state

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__funding_sources_taxes_levied_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__maintenance_facilities

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__maintenance_facilities_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__metrics

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__operating_expenses_by_function

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__operating_expenses_by_function_and_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__operating_expenses_by_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__operating_expenses_by_type_and_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__service_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__service_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__service_by_mode_and_time_period

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__stations_and_facilities_by_agency_and_facility_type

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__stations_by_mode_and_age

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__track_and_roadway_by_agency

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__track_and_roadway_by_mode

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__track_and_roadway_guideway_age_distribution

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__vehicles_age_distribution

calitp_warehouse.staging.ntd_annual_data_tables.2022.stg_ntd_annualdata\_2022__vehicles_type_count_by_agency

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_calendar_year_upt

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_calendar_year_vrm

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_master

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_upt

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_upt_estimates

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_voms

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_vrh

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_vrm

calitp_warehouse.staging.ntd_ridership.stg_ntd_ridershiphistorical\_complete_monthly_ridership_with_adjustments_andestimates\_vrm_estimates

DAG

Legend (in order of precedence)

| Resource type | Indicator | Resolution |
| --- | --- | --- |
| Large table-materialized model | Orange | Make the model incremental |
| Large model without partitioning or clustering | Orange | Add partitioning and/or clustering |
| View with more than one child | Yellow | Materialize as a table or incremental |
| Incremental | Light green | |
| Table | Green | |
| View | White | |