theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[TheiaCoV] Enable user to run TheiaCoV with an unsupported organism #501

Closed sage-wright closed 2 weeks ago

sage-wright commented 2 weeks ago

This PR closes #500.

πŸ—‘οΈ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

Sometimes you want to run TheiaCoV on an organism it doesn't natively support yet. This PR lets you do that.

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don't change any workflow inputs relative to the last version : No

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No

:clipboard: Workflow/Task Step Changes

🔄 Data Processing

If a standardized organism name is not found, the user-provided organism will be used instead, which will likely cause any organism-specific steps in the workflow to be skipped as the provided name is not recognized.
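The fallback described above can be sketched roughly as follows. This is a minimal Python illustration, not the workflow's actual WDL: the lookup table, the function names `resolve_organism` and `planned_tasks`, and the task labels are all hypothetical and chosen only to show the idea.

```python
from typing import Dict, List, Tuple

# Hypothetical table of recognized inputs -> standardized names.
# The names the real organism_parameters subworkflow accepts may differ.
STANDARDIZED_NAMES: Dict[str, str] = {
    "sars-cov-2": "sars-cov-2",
    "mpox": "mpox",
}


def resolve_organism(user_organism: str) -> Tuple[str, bool]:
    """Return (organism name to use, whether it was recognized)."""
    key = user_organism.strip().lower()
    if key in STANDARDIZED_NAMES:
        return STANDARDIZED_NAMES[key], True
    # No standardized name found: fall back to the user-provided value.
    return user_organism, False


def planned_tasks(user_organism: str) -> List[str]:
    """List illustrative task labels that would run for this organism."""
    organism, recognized = resolve_organism(user_organism)
    # Generic steps run regardless of organism (labels are illustrative).
    tasks = ["variant_calling_and_consensus", "consensus_qc"]
    if recognized:
        # Organism-specific steps are gated on a recognized name.
        tasks.append(organism + "_specific_tasks")
    return tasks
```

Under these assumptions, `planned_tasks("measles")` yields only the generic steps, while a recognized name such as `"mpox"` also picks up its organism-specific step.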

:test_tube: Testing

Terra Testing

Tested here; organism-specific tasks were skipped successfully.

Suggested Scenarios for Reviewer to Test

:microscope: Final Developer Checklist

🎯 Reviewer Checklist

πŸ—‚οΈ Associated Documentation (to be completed by Theiagen developer)

kapsakcj commented 2 weeks ago

tested successfully with organism set to "measles" here: https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/CDPH_Smith_Sandbox/job_history/739b6d35-7b58-43c7-91c3-7dd533d64edf

all organism-specific tasks were skipped as expected, just the ivar variant calling/consensus assembly steps & the consensus_qc steps were run 👍

Will test with sars-cov-2 and Mpox Illumina PE samples to ensure other organisms are unaffected by this change.

This code change will impact all workflows that use the organism_parameters subwf (augur, all theiacov workflows including fasta_batch, pangolin_update), but I don't think we need to do super extensive testing.

kapsakcj commented 2 weeks ago

✅ Mpox Illumina PE test: https://app.terra.bio/#workspaces/theiagen-validations/curtis-sandbox-theiagen-validations/job_history/ccf1ba77-16a3-47a3-89d8-bc3683b978b7

✅ sars-cov-2 illumina SE test: https://app.terra.bio/#workspaces/theiagen-validations/curtis-sandbox-theiagen-validations/job_history/104dfe08-3f39-4888-ae30-d5cf31154f2b