edanalytics / edu_edfi_airflow

Manages extract-load of Ed-Fi data in Airflow

Feature/schedule interval full refresh #29

Closed jayckaiser closed 5 months ago

jayckaiser commented 8 months ago

Feature: Schedule Interval Full Refresh

Description & motivation

This PR adds an optional initialization argument to the Ed-Fi Resource DAGs: schedule_interval_full_refresh. The argument takes a cron expression and registers a new user-defined macro on the DAG, which airflow_util.is_full_refresh references. When the datetime of the DAG run matches the cron expression, the full-refresh parameter defaults to True, regardless of DAG-level configs.

Note: this means an incremental run is not possible at any datetime that matches the cron expression. We should prioritize scheduling full refreshes like these on weekends.

Note: the cron expression match is performed with the croniter library, which has been added as a requirement in setup.py.
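
For reviewers, here is a minimal sketch of the precedence described above. It is not the code in this PR: the names resolve_full_refresh, croniter_match, and configured_full_refresh are hypothetical, and which run datetime gets compared (for example the logical date) is an assumption. The only library call used is croniter.match, which reports whether a datetime satisfies a cron expression.

```python
from datetime import datetime
from typing import Callable, Optional


def croniter_match(cron_expr: str, when: datetime) -> bool:
    """Return True if `when` satisfies `cron_expr`, using croniter."""
    # Imported lazily so the decision logic below can be exercised
    # with a stand-in matcher when croniter is not installed.
    from croniter import croniter
    return croniter.match(cron_expr, when)


def resolve_full_refresh(
    run_datetime: datetime,
    configured_full_refresh: bool,
    schedule_interval_full_refresh: Optional[str] = None,
    matcher: Callable[[str, datetime], bool] = croniter_match,
) -> bool:
    """Hypothetical stand-in for the check in airflow_util.is_full_refresh.

    A cron match forces a full refresh; otherwise the DAG-level
    setting is used unchanged.
    """
    if schedule_interval_full_refresh and matcher(
        schedule_interval_full_refresh, run_datetime
    ):
        return True
    return bool(configured_full_refresh)


if __name__ == "__main__":
    # "0 0 * * 6" is midnight every Saturday; 2024-01-06 is a Saturday.
    saturday_midnight = datetime(2024, 1, 6, 0, 0)
    print(resolve_full_refresh(saturday_midnight, False, "0 0 * * 6"))
```

The sketch makes the first note concrete: once the cron expression matches, passing a False setting has no effect, so the only way to get an incremental run is to run at a datetime that does not match.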

PR Merge Priority:

Changes to existing files: