microbiomeDB / mbio_airflow_dags

A collection of DAGs for running MicrobiomeDB data pipelines using Apache Airflow
Apache License 2.0

draft mag dag #3

Closed d-callan closed 1 month ago

d-callan commented 3 months ago

This does things like download reference databases, copy config to pmacs, start fetchngs (if needed) plus mag, metatdenovo and taxprofiler running on pmacs once for each study in some list, then copy the results back for post-processing on the local machine. Someday this can grow to include funcscan and phageannotator.

It builds the DAG up manually, to avoid some difficulty encountered when using task groups and dynamic mapping together. If we later find a reliable, readable and more Airflow-y solution, we can refactor.
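To make "building the DAG up manually" concrete, here is a rough sketch of the wiring as a plain Python loop. It is not the code in this PR and deliberately uses no Airflow imports: the task ids, the `STUDIES` dict and `build_graph` are all hypothetical, and the ordering of the shared setup steps and the idea that the three pipelines fan out in parallel per study are assumptions. In a real DAG each entry would be an operator and each edge would be set with `>>`.

```python
from graphlib import TopologicalSorter

# Hypothetical study list; the flag says whether reads must be fetched first.
STUDIES = {"study_a": True, "study_b": False}
PIPELINES = ["mag", "metatdenovo", "taxprofiler"]


def build_graph(studies):
    """Return {task_id: set of upstream task_ids}, built with a plain loop."""
    # Shared setup steps (their relative order is an assumption).
    graph = {
        "download_reference_dbs": set(),
        "copy_config_to_cluster": {"download_reference_dbs"},
    }
    for study, needs_fetch in studies.items():
        upstream = "copy_config_to_cluster"
        if needs_fetch:
            # Only studies without local reads get a fetchngs step.
            fetch = f"{study}.fetchngs"
            graph[fetch] = {upstream}
            upstream = fetch
        runs = set()
        for pipeline in PIPELINES:
            # Assumed: the pipelines run side by side for a given study.
            run = f"{study}.{pipeline}"
            graph[run] = {upstream}
            runs.add(run)
        # Results come back only after every pipeline for the study is done.
        graph[f"{study}.copy_results_back"] = runs
        graph[f"{study}.post_process"] = {f"{study}.copy_results_back"}
    return graph


graph = build_graph(STUDIES)
# One valid execution order, upstream tasks first.
order = list(TopologicalSorter(graph).static_order())
```

The point of the loop is that every task and edge is created explicitly at parse time, so the per-study branching (fetchngs or not) is ordinary Python control flow rather than something mapped at runtime.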

It's tested and confirmed working with two test datasets, one that requires fetchngs and another that doesn't, and also with one real dataset (ResistomeI).