cagov / data-orchestration

Orchestration tooling for the CalData Data Services and Engineering team
MIT License

Add pipeline for CalInnovate feedback form #4

Closed ian-r-rose closed 1 year ago

ian-r-rose commented 1 year ago

We had a bit of a surprise yesterday when we learned that a Syntasa pipeline was being used by CDPH to look at data from the feedback form on covid19.ca.gov, and that it hadn't been running since Dec 8, 2022.

Incident response doc

Short term response

We need to re-create that pipeline within the CalData stack. A minimal version would be to:

  1. Extract data from this JSON endpoint
  2. Load the data into the BQ table of record (we may want to update what table that is exactly)
  3. Capture the above in a DAG running in Airflow (this repo)
  4. Update the Data Studio dashboard to point at the new table (if necessary)
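
As a rough illustration of steps 1 and 2, the extract-and-load logic could be written as plain Python functions that an Airflow task later calls. This is only a sketch under several assumptions: `FEEDBACK_URL` is a hypothetical placeholder (the actual endpoint is not given here), the endpoint is assumed to return a JSON array of objects, and the `load` callable stands in for whatever BigQuery client call ends up writing to the table of record.

```python
import json
from typing import Callable, Dict, Iterable, List
from urllib.request import urlopen

# Hypothetical placeholder: the real endpoint URL is not stated in this issue.
FEEDBACK_URL = "https://example.invalid/feedback.json"


def fetch_json(url: str = FEEDBACK_URL) -> List[Dict]:
    """Download and decode the feedback records.

    Assumes the endpoint returns a JSON array of objects.
    """
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def to_ndjson(records: Iterable[Dict]) -> str:
    """Serialize records as newline-delimited JSON, one object per line.

    BigQuery load jobs accept this format; keys are sorted so output is stable.
    """
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


def run_pipeline(
    extract: Callable[[], List[Dict]],
    load: Callable[[str], None],
) -> int:
    """Run extract then load, returning the number of records handled.

    `load` is a stand-in for the BigQuery write; it is skipped when
    the extract step returns no records.
    """
    records = extract()
    if records:
        load(to_ndjson(records))
    return len(records)
```

Passing `extract` and `load` in as arguments keeps the logic testable without network access or GCP credentials. In the DAG, a task could call `run_pipeline(fetch_json, load)` with a `load` function that wraps the BigQuery client.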

Medium term response