breedfides / airflow-etl


Implement an orchestrator DAG (Primary DAG) that both handles API response and triggers other DAGs based on parameters #7

Closed brightemahpixida closed 9 months ago

brightemahpixida commented 9 months ago

Expected behaviour

There is a ready-to-use DAG that is triggered by a POST request made to Airflow's API via the frontend, with geolocation attributes included in the request payload. Based on those payload attributes, this DAG then initiates subsequent ETL workflows by triggering other DAGs.

A quick run-through of the expected workflow can be seen on the diagram below:

[Diagram: new_issue — architectural overview of the expected workflow]
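The routing step described above (deciding which downstream DAGs to trigger from the payload attributes) could be sketched as plain Python. All names here (the attribute keys, the DAG IDs, the `conf` layout) are assumptions for illustration, not the actual implementation; inside the real primary DAG, the returned IDs would feed Airflow's `TriggerDagRunOperator`.

```python
# Hypothetical sketch of the primary DAG's routing logic: map geolocation
# payload attributes to the dependent ETL DAG IDs that should be triggered.
# Attribute keys and DAG IDs below are assumed names, not the real ones.
from typing import Dict, List

# Assumed mapping from payload attribute -> dependent ETL DAG ID
ATTRIBUTE_TO_DAG: Dict[str, str] = {
    "radiation": "etl_radiation",
    "humidity": "etl_humidity",
    "air_temp": "etl_air_temp",
}

def select_downstream_dags(conf: Dict) -> List[str]:
    """Return the DAG IDs the primary DAG should trigger for this payload.

    Unknown attributes are ignored rather than failing the run.
    """
    requested = conf.get("attributes", [])
    return [ATTRIBUTE_TO_DAG[a] for a in requested if a in ATTRIBUTE_TO_DAG]
```

In an Airflow task, `conf` would come from the triggering request via `dag_run.conf`, and each returned ID would be passed as `trigger_dag_id` to a `TriggerDagRunOperator` (or triggered dynamically via task mapping).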

brightemahpixida commented 9 months ago

Here's a draft sample of the curl POST request that will be sent from the frontend to Airflow's API, which in turn will trigger the "primary DAG":

curl -H "Content-type: application/json" \
     -H "Accept: application/json" \
     -X POST --user "**AIRFLOW-USERNAME**:**AIRFLOW-PASSWORD**" \
     "http://www.breedfides-airflow.bi.denbi.de/api/v1/dags/**{DAG-ID}**/dagRuns" \
     -d '**{GEOLOCATION-PAYLOAD}**'
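For reference, Airflow's stable REST API expects the request body of `POST /api/v1/dags/{dag_id}/dagRuns` to wrap run parameters in a `conf` object. A hypothetical geolocation payload (the field names are assumptions, not the agreed schema) might look like:

```json
{
  "conf": {
    "latitude": 52.28,
    "longitude": 8.05,
    "attributes": ["radiation", "humidity"]
  }
}
```

The primary DAG can then read these values inside its tasks via `dag_run.conf`.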
brightemahpixida commented 9 months ago

@gannebamm Hi, here's the ticket as discussed last week during our meeting. I was able to include the architectural overview of the ETL workflow as part of its description.

brightemahpixida commented 9 months ago

@vineetasharma105 @gannebamm Hi, I just created the PR for the primary DAG; the associated PR is linked to this issue. I was able to implement the workflow for how this DAG will receive the POST requests and trigger/initiate the execution of other dependent DAGs.

In the next PR, which I will raise later on, I will focus on the clipping workflow for the data sources you included last week (i.e., radiation, humidity, air_temp, etc.).

gannebamm commented 6 months ago

Just some documentation: