CityOfLosAngeles / aqueduct

A shared pipeline for building ETLs and batch jobs that we run at the City of LA for Data Science Projects. Built on Apache Airflow & Civis Platform
Apache License 2.0

Migrate scheduled ETLs to Civis #273

Open hunterowens opened 4 years ago

hunterowens commented 4 years ago

Currently, we have the following DAGs:

├── homelessness
│   ├── dag_homeless_merge_into_common_schema.py
│   ├── racer_nightly_to_storage.py
│   ├── static_datasets
│   └── sync_rap_data.py
├── myla311
│   ├── dag_311_cd_outlier_detector.py
│   ├── dag_311_outlier_detector.py
│   ├── dag_311_retrieve_update.py
│   └── LA_City_Council_Districts.csv
├── public-health
│   ├── care-311-to-postgres.py
│   ├── Code55.py
│   ├── Code75.py
│   └── covid19
│       ├── get-help-to-esri.py
│       ├── jhu-county-to-esri.py
│       ├── jhu-to-esri.py
│       ├── README.md
│       ├── shelter_timeseries_current.csv
│       ├── sync-bed-availability-data.py
│       ├── sync-covid-testing-data.py
│       └── sync-la-cases-data.py
└── transportation
    ├── bikeshare
    │   └── trips.py
    ├── dash
    │   └── trips.py
    ├── dockless
    │   ├── create_la_boundary.sql
    │   ├── dockless_elt.py
    │   ├── geom_parse.sql
    │   ├── indexes.sql
    │   ├── README.md
    │   ├── scooter-stat.py
    │   └── v_trips.sql
    ├── metro
    │   └── ridership.py
    └── waze
        ├── dag-waze-dataProcessor.py
        └── store_data_file_nologin.py

hunterowens commented 4 years ago

Need to migrate

  1. homelessness/sync_rap_data.py
  2. MyLA311 can be deleted once we merge #254
  3. public-health/care-311-to-postgres.py should just be a view off of #254
  4. public-health/Code55.py and Code75.py: possible to drop. @ian-r-rose to follow up with Oscar.
  5. public-health/covid19: don't migrate
  6. transportation/bikeshare/trips.py: @ian-r-rose to migrate
  7. transportation/dash/trips.py: @ian-r-rose
  8. transportation/dockless: @hunterowens to wait for @thekaveman to finish up library updates, then rebuild DB.
  9. transportation/metro/ridership.py: @ian-r-rose to migrate
  10. transportation/waze is currently thousands of files split across two S3 buckets. Hunter to figure out how to host it in Redshift long term, but fine to eliminate the transfer DAG for now.
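
For tracking, the plan above could be written down as data so it is easy to see what is actually queued for migration. This is only a rough sketch: `PlanItem`, `PLAN`, `by_action`, and the action labels are made-up names for illustration, not anything that exists in aqueduct, and item 7 is marked `unspecified` because the list only names an owner for it.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PlanItem:
    """One entry of the migration list (hypothetical structure)."""
    path: str
    action: str  # illustrative labels, not an official vocabulary
    owner: Optional[str] = None


# Transcribed from the ten list items above; paths follow the DAG tree.
PLAN: List[PlanItem] = [
    PlanItem("homelessness/sync_rap_data.py", "migrate"),
    PlanItem("myla311", "delete-after-#254"),
    PlanItem("public-health/care-311-to-postgres.py", "replace-with-view"),
    PlanItem("public-health/Code55.py, Code75.py", "maybe-drop", "ian-r-rose"),
    PlanItem("public-health/covid19", "skip"),
    PlanItem("transportation/bikeshare/trips.py", "migrate", "ian-r-rose"),
    # The list names an owner but no action for this one.
    PlanItem("transportation/dash/trips.py", "unspecified", "ian-r-rose"),
    PlanItem("transportation/dockless", "blocked", "hunterowens"),
    PlanItem("transportation/metro/ridership.py", "migrate", "ian-r-rose"),
    PlanItem("transportation/waze", "eliminate-transfer-dag", "hunterowens"),
]


def by_action(plan: List[PlanItem]) -> Dict[str, List[str]]:
    """Group item paths by their action label, preserving list order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for item in plan:
        grouped[item.action].append(item.path)
    return dict(grouped)


if __name__ == "__main__":
    for action, paths in by_action(PLAN).items():
        print(f"{action}: {', '.join(paths)}")
```

Under this labeling, only three DAGs are explicitly marked for migration; the rest are deleted, replaced, skipped, or waiting on something else.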