theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0
33 stars 15 forks source link

Add VADR to flu and RSV TheiaCoV #384

Closed cimendes closed 3 months ago

cimendes commented 3 months ago

This PR closes #295

๐Ÿ—‘๏ธ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

This PR adds VADR to two new organisms:

Due to the limitations of the RSV model the VADR memory parameter has been exposed in the organisms parameter sub-workflow.

Additionally, the VADR container has been upgraded to the latest version: us-docker.pkg.dev/general-theiagen/staphb/vadr:1.6.3

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users donโ€™t change any workflow inputs relative to the last version : Yes (Docker container has been updated)

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No

:clipboard: Workflow/Task Step Changes

๐Ÿ”„ Data Processing

Docker/software or software versions changed: Yes, changed to us-docker.pkg.dev/general-theiagen/staphb/vadr:1.6.3

Databases or database versions changed: N/A

Data processing/commands changed: N/A

File processing changed: N/A

Compute resources changed: Memory for VADR has been adjusted depending on the organism (8 for all organisms except for RSV-A and RSV-B which is set to 32)

โžก๏ธ Inputs

None have been changed

โฌ…๏ธ Outputs

None has been changed

:test_tube: Testing

Test Dataset

Commandline Testing with MiniWDL or Cromwell (optional)

Terra Testing

Flu
RSV

Suggested Scenarios for Reviewer to Test

Theiagen Version Release Testing (optional)

:microscope: Final Developer Checklist

๐ŸŽฏ Reviewer Checklist

๐Ÿ—‚๏ธ Associated Documentation (to be completed by Theiagen developer)

kapsakcj commented 3 months ago

Nice work on this! I need some more time looking at this, but so far I've looked at your VADR on Flu tests on TheiaCoV_Illumina_PE

Thoughts so far:

Curtis TODO:

kapsakcj commented 3 months ago

Hmmmm I think we should enable the theiacov_Illumina_pe and se workflows in the CI for miniwdl and ensure they are updated. If we don't we're going to have CI failures after merging.

Similar to changes I've made in https://github.com/theiagen/public_health_bioinformatics/pull/375/files

I'm happy to do this, but let me know if for some reason you think that is a bad idea.

EDIT: I updated the CI, see below commits.

kapsakcj commented 3 months ago

VADR on RSV appears to be running appropriately as well. There were a few samples that passed w/ 0 VADR alerts and some that failed with 2 VADR alerts (SRR19134713)

Here are some tests I ran: