opentargets / orchestration

Open Targets pipeline orchestration layer
Apache License 2.0

chore(eqtl_catalogue): configure step to run on cluster mode #49

Closed ireneisdoomed closed 1 month ago

ireneisdoomed commented 1 month ago

We have new eQTL Catalogue data after applying the fixes in https://github.com/opentargets/gentropy/pull/849, which remove the duplication in the locus. The job took 30 min, a 33% increase compared to last time. New credible sets: gs://eqtl_catalogue_data/credible_set_datasets/eqtl_catalogue_susie

The only thing I had to change in the config was to tell Spark to use yarn. Other than that, generating the data was one click on Airflow.
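
For readers unfamiliar with the change, here is a minimal sketch of what "telling Spark to use yarn" can look like when a step's settings are held in a dict. The key names (`session`, `spark_master`) and the helper `with_spark_master` are hypothetical and chosen only for illustration; they are not the actual orchestration or gentropy config schema.

```python
from copy import deepcopy

# Hypothetical step configuration. The key names are illustrative
# assumptions, not the real orchestration config schema.
LOCAL_STEP_CONFIG = {
    "step": "eqtl_catalogue",
    "session": {"spark_master": "local[*]"},
}


def with_spark_master(config: dict, master: str = "yarn") -> dict:
    """Return a copy of `config` whose Spark master is set to `master`."""
    updated = deepcopy(config)  # leave the caller's config untouched
    updated.setdefault("session", {})["spark_master"] = master
    return updated


cluster_config = with_spark_master(LOCAL_STEP_CONFIG)
print(cluster_config["session"]["spark_master"])  # yarn
```

In PySpark, a value like this is typically what ends up in `SparkSession.builder.master(...)`, so switching from `local[*]` to `yarn` is what moves the job from a single machine to the cluster's resource manager.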

ireneisdoomed commented 1 month ago

@project-defiant I just saw https://github.com/opentargets/orchestration/pull/40. Does that make this change unnecessary?

project-defiant commented 1 month ago

> @project-defiant I just saw #40. Does that make this change unnecessary?

Hopefully not anymore :)