This PR adds an optional initialization argument to the Ed-Fi Resource DAGs: schedule_interval_full_refresh. Setting it adds a new user-defined macro to the DAG that is referenced in airflow_util.is_full_refresh. When the datetime of the DAG run matches the cron expression passed in this argument, the full-refresh parameter defaults to True, regardless of DAG-level configs.
Note: this means an incremental run is not possible at any datetime that matches the cron expression. We should prefer weekends for full-refresh runs like these.
Note: this uses the croniter library to perform the cron expression match. This requirement has been added to setup.py.
PR Merge Priority:
[x] Low
[ ] Medium
[ ] High
Changes to existing files:
edu_edfi_airflow/dags/dag_util/airflow_util.py: Add a run_matches_cron() helper that compares the DAG run's logical date with a specified cron expression; extend is_full_refresh() to check this macro before falling back to the param value.
edu_edfi_airflow/dags/edfi_resource_dag.py: Add the schedule_interval_full_refresh argument and the is_scheduled_full_refresh user-defined macro (UDM) to EdFiResourceDAG.
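The precedence described in these changes (scheduled match first, param value second) can be sketched without Airflow. Everything below is hypothetical: resolve_full_refresh stands in for the updated is_full_refresh(), and saturday_only stands in for the cron-based macro so the example runs on the standard library alone.

```python
from datetime import datetime
from typing import Callable, Optional


def resolve_full_refresh(
    logical_date: datetime,
    full_refresh_param: bool,
    is_scheduled_full_refresh: Optional[Callable[[datetime], bool]] = None,
) -> bool:
    """Hypothetical stand-in for the updated is_full_refresh() decision."""
    # The scheduled check runs first: a match forces a full refresh.
    if is_scheduled_full_refresh is not None and is_scheduled_full_refresh(logical_date):
        return True
    # Otherwise fall back to the run's full-refresh param.
    return bool(full_refresh_param)


def saturday_only(logical_date: datetime) -> bool:
    """Stand-in for a cron match on Saturdays (weekday() == 5)."""
    return logical_date.weekday() == 5


saturday = datetime(2024, 1, 6)
monday = datetime(2024, 1, 8)

print(resolve_full_refresh(saturday, False, saturday_only))  # True: schedule wins
print(resolve_full_refresh(monday, False, saturday_only))    # False: incremental
print(resolve_full_refresh(monday, True, saturday_only))     # True: param honored
print(resolve_full_refresh(saturday, False))                 # False: no schedule set
```

The first call shows the trade-off noted in the description: once the schedule matches, passing a false full-refresh param does not produce an incremental run.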