ccao-data / data-architecture

Codebase for CCAO data infrastructure construction and management
https://ccao-data.github.io/data-architecture/

Update QC town close export script to prep for automation #626

Closed: jeancochrane closed this 2 weeks ago

jeancochrane commented 1 month ago

This PR makes a few quality-of-life improvements to the QC town close export script to enable automation via any distribution mechanism (VM, OneDrive, or S3). The changes include:

With these changes, we can follow these steps to define an automated process to export QC reports on a regular schedule:

jeancochrane commented 2 weeks ago

@dfsnow I finished a pretty major refactor of the PR, so I think it's worth taking a fresh look at everything. Previously, running the script with no filter exported reports for a set of active towns determined by a schedule. Now the script defaults to exporting reports for all towns, and it uses pandas to speed up the township filtering operation. Testing this locally, an export for all towns takes about 30 minutes to run, and it scans the same amount of data in Athena as a filtered run would, since our tables are not partitioned by township anyway.
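For anyone skimming the thread, here is a minimal sketch of the general idea of filtering by township in pandas rather than in the query. It is not the code from this PR: the `split_by_township` helper, the `township_code` column name, and the sample rows are all hypothetical, and the DataFrame stands in for the result of a single unfiltered Athena query.

```python
import pandas as pd

# Hypothetical stand-in for the result of one unfiltered Athena query.
# Column names and values are illustrative only.
report = pd.DataFrame(
    {
        "township_code": ["70", "70", "71", "72"],
        "pin": ["pin-a", "pin-b", "pin-c", "pin-d"],
    }
)


def split_by_township(df, township_col="township_code", towns=None):
    """Return a dict mapping each township code to its rows.

    If `towns` is given, only those townships are kept; otherwise
    every township present in `df` is returned.
    """
    if towns is not None:
        df = df[df[township_col].isin(towns)]
    # One groupby pass over the in-memory frame instead of one query per town.
    return {
        town: group.reset_index(drop=True)
        for town, group in df.groupby(township_col)
    }


# Default: no filter, so every township gets its own frame.
all_towns = split_by_township(report)

# Optional filter: restrict the export to specific townships.
one_town = split_by_township(report, towns=["71"])
```

Since the tables are not partitioned by township, a single query followed by an in-memory split like this should scan no more data than a per-town query would, which is the tradeoff described above.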