Open lawal-hash opened 2 months ago
Having a separate DAG for each table's extract and load makes ingestion easier to manage. It also allows tables to have different ingestion schedules, since business requirements may call for loading some tables hourly, others daily or weekly.
Running every extract and load in parallel, especially when you have a lot of tables, may not be practical in a production environment, because managing your compute resources becomes difficult.
We can use task groups and dynamic task mapping in Airflow instead. This gives us a single DAG in which each table's extract-load task runs in parallel, which achieves the same objective as what you implemented. I also believe this approach better enforces DRY.