breedfides / airflow-etl


Implement Soil Density data Fetching/Clipping DAG #12

Closed: brightemahpixida closed this 6 months ago

brightemahpixida commented 8 months ago

Overview: This pull request adds a new DAG (fetch_gpkg_soil_data_DAG) to the ETL pipeline - the role of this DAG is centered on:

Changes Made:

brightemahpixida commented 8 months ago

New Commit Update - b82a8aa:

I included a new file, airflow.cfg-dev-test.txt, that contains all the configuration needed to execute a DAG run on dev. The current airflow.cfg file contains the configuration for the prod environment. I also updated the README.

brightemahpixida commented 7 months ago

@gannebamm The feedback given on #9 is now addressed in this commit (along with the feedback given here).

brightemahpixida commented 7 months ago

Hi @gannebamm

Just wanted to let you know that I've made a couple of adjustments to the DAGs (see my most recent commit). This update addresses the error you pointed out in your last comment, which I believe was caused by all three DAGs running simultaneously and exhausting the CPU resources of the Airflow instance.

I adjusted the execution flow so that the DAGs run sequentially: the non-active DAGs are kept in a backlog while they wait for the currently active DAG to finish. I also made sure each iteration starts with the most resource-intensive DAG, the soil DAG, followed by the radiation and air-temp DAGs.
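
For anyone reading along, the backlog idea described above can be sketched in plain Python. This is only an illustration of the ordering logic, not the code in the commit: `run_sequentially`, `run_dag`, and the two DAG ids `fetch_radiation_data_DAG` and `fetch_air_temp_data_DAG` are hypothetical names chosen for the example (only `fetch_gpkg_soil_data_DAG` appears in this PR). In Airflow itself, a similar effect might be achieved by chaining `TriggerDagRunOperator` tasks with `wait_for_completion=True` or by routing the tasks through a single-slot pool, but which mechanism the commit uses is not stated here.

```python
from collections import deque
from typing import Callable, Iterable, List

# DAG ids ordered from most to least resource-intensive.
# Only the soil DAG id comes from this PR; the other two are assumed names.
DAG_ORDER = [
    "fetch_gpkg_soil_data_DAG",   # soil (heaviest, runs first)
    "fetch_radiation_data_DAG",   # hypothetical id for the radiation DAG
    "fetch_air_temp_data_DAG",    # hypothetical id for the air-temp DAG
]


def run_sequentially(dag_ids: Iterable[str], run_dag: Callable[[str], None]) -> List[str]:
    """Run one DAG at a time; the rest wait in a backlog.

    `run_dag` is a caller-supplied function that is assumed to block
    until the given DAG run has finished.
    """
    backlog = deque(dag_ids)      # DAGs waiting for their turn
    finished: List[str] = []
    while backlog:
        active = backlog.popleft()  # next DAG becomes the only active one
        run_dag(active)             # blocks until this DAG completes
        finished.append(active)
    return finished


if __name__ == "__main__":
    # Stand-in for a real trigger: just report which DAG would run.
    run_sequentially(DAG_ORDER, lambda dag_id: print(f"running {dag_id}"))
```

Because only one DAG is ever active, peak CPU usage is bounded by the heaviest single DAG rather than by the sum of all three.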

gannebamm commented 6 months ago

@brightemahpixida By merging this PR the other PRs can be closed, right?