refactor: genetic dags

Context

This PR closes https://github.com/opentargets/orchestration/issues/27

The aim of this PR is to unify the existing DAGs (all except genetics_etl) so that they reuse the generate_dag logic already implemented for genetics_etl, which builds the topology of the DAG from a configuration file.

This streamlines dependency management and makes the dependencies between the DAG steps easier to understand.

The previous implementation had its configuration distributed across multiple files in the config directory. The configuration was therefore not isolated per DAG, and understanding the overall process required heavy lookups into the nested structures of both the configs and the DAG code.

Merging the configuration of multiple gentropy steps and extracting it as a single entity, called the dag config and stored under src/ot_orchestration/dags/config/*.yaml, should increase the readability and verbosity of each process. With the nodes and their prerequisites defined there, in most cases one can skip reading the logic of the DAG itself and focus on the process definition maintained in the dag config.
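To illustrate the idea of deriving a DAG's topology from a dag config, here is a minimal sketch. It is not the generate_dag implementation from this PR: the key names (`nodes`, `id`, `prerequisites`), the step names, and the helper functions `build_topology` and `execution_order` are all assumptions made for the example, and the config is written as the dict a YAML loader would return rather than read from one of the real files.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dag config (assumed layout, not the schema used in the repo).
dag_config = {
    "nodes": [
        {"id": "ingest", "prerequisites": []},
        {"id": "harmonise", "prerequisites": ["ingest"]},
        {"id": "qc", "prerequisites": ["harmonise"]},
        {"id": "report", "prerequisites": ["harmonise", "qc"]},
    ]
}


def build_topology(config):
    """Return a mapping of node id -> set of prerequisite ids.

    Raises ValueError if a node lists a prerequisite that is not
    defined as a node in the same config.
    """
    nodes = config["nodes"]
    known_ids = {node["id"] for node in nodes}
    graph = {}
    for node in nodes:
        prerequisites = set(node.get("prerequisites", []))
        unknown = prerequisites - known_ids
        if unknown:
            raise ValueError(
                f"node {node['id']!r} has unknown prerequisites: {sorted(unknown)}"
            )
        graph[node["id"]] = prerequisites
    return graph


def execution_order(config):
    """Return node ids in an order that respects every prerequisite."""
    sorter = TopologicalSorter(build_topology(config))
    # static_order() raises graphlib.CycleError if the config has a cycle.
    return list(sorter.static_order())


if __name__ == "__main__":
    print(execution_order(dag_config))
```

In an Airflow DAG the same mapping would be used to wire task dependencies rather than to print an order; the point of the sketch is only that the dependency structure can be read entirely from the config.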

Things implemented:

Refactoring of ukb_ppp_eur_harmonisation DAG
Refactoring of gwas_curation_update DAG
Refactoring of gwas_catalog_preprocess DAG
Refactoring of gnomad_ingestion DAG
Deprecation of gwas_catalog_harmonisation DAG -> its content is being developed in the gwas_catalog_pipeline DAG
Refactoring of finngen_ukb_meta_harmonisation DAG
Refactoring of finngen_ingestion DAG + addition of extra parameter sample_size
Refactoring of eqtl_ingestion DAG
New set of tests for utils
Refactoring of dataproc related functions
Fixes to development process
- Allow the shell to be inferred from an env variable, so that running make dev no longer populates the bashrc file with junk lines
- Remove the default sourcing of poetry shell from the setup script
- Fix the duplicate pre-commit call that caused pre-commit to run twice

opentargets / orchestration

refactor: genetic dags #26
