CityofPittsburgh / data-rivers

Apache Airflow and Beam ETL scripts for the City of Pittsburgh's data analysis pipelines

Accommodate Removal of Qalert Run Logging, Sort Recent Blobs #732

Closed jasonfic closed 4 months ago

jasonfic commented 4 months ago

Previously, the Qalert Submitters DAG read the successful run log produced by the mainline qalert_requests DAG to find the most recent batch of requests, then queried the people who submitted them. However, that run log was removed in a recent update. This PR instead uses the google.cloud.storage Python library to list the blobs uploaded to our Qalert GCS bucket under a prefix containing the current run date, and sorts them to find the most recent upload.
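For reviewers unfamiliar with the approach, here is a minimal sketch of the prefix-and-sort lookup. It is not the code in this PR. The function name `latest_blob_name`, the bucket name, and the prefix layout are hypothetical, and ordering by each blob's `updated` timestamp is an assumption about how "most recent" is decided. The only library calls relied on are `Client.list_blobs(bucket_or_name, prefix=...)` and the `Blob.updated` and `Blob.name` properties from google.cloud.storage.

```python
def latest_blob_name(client, bucket_name, prefix):
    """Return the name of the most recently updated blob under `prefix`.

    `client` is expected to behave like google.cloud.storage.Client.
    Returns None when no blob matches the prefix.
    """
    # list_blobs returns an iterator; materialize it so we can test for emptiness.
    blobs = list(client.list_blobs(bucket_name, prefix=prefix))
    if not blobs:
        return None
    # Assumption: "most recent" means the latest `updated` timestamp.
    newest = max(blobs, key=lambda blob: blob.updated)
    return newest.name


# Hypothetical usage against a real bucket (names are placeholders):
#   from google.cloud import storage
#   client = storage.Client()
#   name = latest_blob_name(client, "example-qalert-bucket", "requests/2024-01-02")
```

Passing the client in as an argument keeps the lookup easy to exercise with a stub in tests, without touching GCS.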