opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Refactor existing Airflow implementation #3104

Closed tskir closed 11 months ago

tskir commented 12 months ago

Feature ticket under https://github.com/opentargets/issues/issues/3028.

The existing Airflow code under src/airflow needs to be refactored to separate out the common functions (create cluster, submit PySpark job, delete cluster) and the shared configuration (version, project, paths). This will prevent code duplication and keep the existing Airflow code from diverging from the new code I'm writing for the Preprocess pipeline.
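
As a rough illustration of the split described above, here is a minimal sketch of what a shared module could look like. Everything in it is an assumption made for illustration: the names `PipelineConfig`, `create_cluster_spec`, `pyspark_job_spec`, `delete_cluster_spec` and `pipeline_steps`, the config fields, and all placeholder values are hypothetical and do not come from the actual code under src/airflow. It is written in plain Python so it runs without Airflow installed; in a real DAG each spec would presumably be handed to the corresponding Airflow operator.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class PipelineConfig:
    """Shared settings (version, project, paths). All fields are hypothetical."""

    version: str
    project: str
    region: str
    code_path: str  # prefix under which the PySpark entry points are stored

    def cluster_name(self, pipeline: str) -> str:
        # Derive one cluster name per pipeline and version, using only
        # lowercase characters and hyphens.
        return f"{pipeline}-{self.version}".lower().replace(".", "-")


def create_cluster_spec(cfg: PipelineConfig, pipeline: str, num_workers: int = 2) -> Dict:
    """Describe the 'create cluster' step."""
    return {
        "action": "create_cluster",
        "project": cfg.project,
        "region": cfg.region,
        "cluster": cfg.cluster_name(pipeline),
        "num_workers": num_workers,
    }


def pyspark_job_spec(
    cfg: PipelineConfig, pipeline: str, script: str, args: Sequence[str] = ()
) -> Dict:
    """Describe the 'submit PySpark job' step for one script."""
    return {
        "action": "submit_pyspark_job",
        "project": cfg.project,
        "region": cfg.region,
        "cluster": cfg.cluster_name(pipeline),
        "main_python_file": f"{cfg.code_path}/{script}",
        "args": list(args),
    }


def delete_cluster_spec(cfg: PipelineConfig, pipeline: str) -> Dict:
    """Describe the 'delete cluster' step."""
    return {
        "action": "delete_cluster",
        "project": cfg.project,
        "region": cfg.region,
        "cluster": cfg.cluster_name(pipeline),
    }


def pipeline_steps(cfg: PipelineConfig, pipeline: str, scripts: Sequence[str]) -> List[Dict]:
    """Ordered steps for one pipeline: create, submit each script, delete."""
    steps = [create_cluster_spec(cfg, pipeline)]
    steps.extend(pyspark_job_spec(cfg, pipeline, script) for script in scripts)
    steps.append(delete_cluster_spec(cfg, pipeline))
    return steps


# Placeholder values only; the real version, project and paths would live
# in the shared configuration.
cfg = PipelineConfig(
    version="1.0.0",
    project="example-project",
    region="example-region",
    code_path="gs://example-bucket/code",
)
steps = pipeline_steps(cfg, "preprocess", ["step_a.py", "step_b.py"])
```

With this layout, both the existing DAGs and the new Preprocess DAG would import the same helpers and the same `PipelineConfig`, so a change to the version or to the way clusters are created happens in one place.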