This release candidate aligns with versions recently released in ea_airflow_util. This PR introduces the following:
- Add `/keyChanges` ingestion for resource endpoints
- Add new method for `EdFiResourceDAG` endpoint instantiation using `resource_configs` and `descriptor_configs` arguments in init
  - The prior methods `EdFiResourceDAG.{add_resource, add_descriptor, add_resource_deletes}` are deprecated in favor of this more performant approach.
- Refactor `EdFiToS3Operator` taskgroup into three options (determined by `run_type` argument):
  - "default": One `EdFiToS3Operator` task per resource/deletes/keyChanges endpoint
  - "bulk": One `BulkEdFiToS3Operator` task in which all endpoints are looped over in one callable
  - "dynamic": One dynamically-mapped `EdFiToS3Operator` task per resource with deltas to ingest
- Copies from S3 to Snowflake in `EdFiResourceDAG` are now completed in a single bulk task (instead of one per endpoint)
- `EdFiResourceDAG` now inherits from `ea_airflow_util` DAG factory `EACustomDAG`
- Streamline XCom passing between tasks in `EdFiResourceDAG`
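
To make the difference between the three `run_type` options concrete, here is a rough sketch of how many ingestion tasks each one would schedule. It is illustrative only and does not use the package's real code: the function `plan_ingest_tasks`, the task-name strings, and the `has_deltas` mapping are all hypothetical and were made up for this example.

```python
from typing import Dict, List


def plan_ingest_tasks(
    run_type: str,
    endpoints: List[str],
    has_deltas: Dict[str, bool],
) -> List[str]:
    """Return hypothetical task names that a given run_type would schedule.

    This models only the task counts described in this PR; it is not the
    actual EdFiResourceDAG implementation.
    """
    if run_type == "default":
        # One task per endpoint, whether or not it has new data.
        return [f"pull_{endpoint}" for endpoint in endpoints]
    if run_type == "bulk":
        # A single task that loops over every endpoint in one callable.
        return ["pull_all_endpoints"]
    if run_type == "dynamic":
        # One mapped task instance per endpoint that has deltas to ingest.
        return [
            f"pull_{endpoint}[mapped]"
            for endpoint in endpoints
            if has_deltas.get(endpoint, False)
        ]
    raise ValueError(f"Unknown run_type: {run_type!r}")


# Hypothetical endpoints and delta flags, for illustration only.
endpoints = ["students", "schools", "sections"]
has_deltas = {"students": True, "schools": False, "sections": True}

default_tasks = plan_ingest_tasks("default", endpoints, has_deltas)
bulk_tasks = plan_ingest_tasks("bulk", endpoints, has_deltas)
dynamic_tasks = plan_ingest_tasks("dynamic", endpoints, has_deltas)
```

Under these assumed inputs, "default" schedules three tasks, "bulk" schedules one, and "dynamic" schedules two, since only two endpoints have deltas.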
PR Merge Priority:
[ ] Low
[x] Medium
[ ] High
Changes to existing files:
This is a complete refactor of EdFiResourceDAG and of every operator and callable it invokes. The refactor dramatically reduces the number of tasks that are created and run. Three Ed-Fi task-group options are provided; which one to use depends on the volume of data and the number of DAGs being run in a given implementation.
New files created:
Tests and QC done:
This has been tested in South Carolina and GSN dev. All combinations of DAG configs, run-types, and change-version toggling have been tested.
EDU Ed-Fi Airflow Release Candidate v0.3.0