theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

DOC: Provide basic info on how to run a workflow locally with Cromwell or miniWDL #137

Open corneliusroemer opened 11 months ago

corneliusroemer commented 11 months ago

I'm trying to help @jrotieno debug the mpxv workflow but I don't know how to run a workflow locally.

The README currently states:

These workflows are written in WDL, a language for specifying data processing workflows with a human-readable and writeable syntax. They have been developed by Theiagen Genomics to primarily run on the Terra.bio platform but can be run locally or on an HPC system at the command-line with Cromwell or miniWDL.

I want to run the workflow locally, but couldn't find any information on how to run a workflow using Cromwell or miniWDL.

It would be amazing if you could add basic documentation on how to do this along the lines of:

# Make sure cromwell is installed
cromwell run workflows/phylogenetics/wf_augur.wdl

similar to, say, how we document Snakemake workflows:

Run analysis pipeline

Run pipeline to produce "overview" tree for /monkeypox/mpxv with:

nextstrain build --docker --cpus 1 . --configfile config/config_mpxv.yaml

https://github.com/nextstrain/monkeypox#run-analysis-pipeline

kapsakcj commented 11 months ago

Hi Cornelius,

Thanks for the suggestion. You are right that we need to bolster the documentation on running our workflows via miniwdl or Cromwell on the command line. We have some basic Cromwell instructions in Supplemental file 11 (PDF) here: https://www.microbiologyresearch.org/content/journal/mgen/10.1099/mgen.0.001051#supplementary_data

That said, it may be easier for you to install miniwdl and test the WDL workflows and/or tasks that way. It's easily installable via conda, and I find it a bit easier to use than Cromwell. I would only go the Cromwell route if you're planning to run things on an HPC; for local testing on a single machine, miniwdl is the better choice IMO.

I'll circle back in a bit with an example of how to get up and running with miniwdl and the wf_augur.wdl workflow that you mentioned.
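
Until then, here is a rough sketch of what a local miniwdl run might look like. It has not been tested against this repo. The `inputs.json` file name and the `run_wdl_locally` helper are hypothetical, and the sketch assumes Docker is running, since miniwdl executes tasks in containers by default. Most workflows will also need an inputs JSON that you write yourself.

```shell
# Sketch only, not official documentation for this repo.
# One-time setup (commented out so this file is safe to source):
#   conda install -c conda-forge miniwdl    # or: pip install miniwdl
#   miniwdl run_self_test                   # confirms miniwdl can launch containers

# Print the miniwdl command for a workflow ($1) and an optional inputs JSON ($2).
miniwdl_cmd() {
  if [ -n "${2:-}" ]; then
    printf 'miniwdl run %s -i %s\n' "$1" "$2"
  else
    printf 'miniwdl run %s\n' "$1"
  fi
}

# Hypothetical helper: validate the WDL, then run it.
# If miniwdl is not installed, it only reports the command it would have run.
run_wdl_locally() {
  if ! command -v miniwdl >/dev/null 2>&1; then
    echo "miniwdl not found; would run: $(miniwdl_cmd "$1" "${2:-}")"
    return 0
  fi
  if [ ! -f "$1" ]; then
    echo "WDL file not found: $1"
    return 1
  fi
  # 'miniwdl check' parses and type-checks the WDL before anything is executed.
  miniwdl check "$1" || return 1
  if [ -n "${2:-}" ]; then
    miniwdl run "$1" -i "$2"
  else
    miniwdl run "$1"
  fi
}
```

From the root of a clone of this repo, you would then call something like `run_wdl_locally workflows/phylogenetics/wf_augur.wdl inputs.json`, where `inputs.json` is your own file that supplies the workflow's required inputs.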