theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

Adding `assembly_mean_coverage` metrics for flu in TheiaCoV_Illumina_PE_PHB #314

Closed jrotieno closed 6 months ago

jrotieno commented 6 months ago

This PR closes #313.

🗑️ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

This PR adds the assembly_mean_coverage metric for the influenza HA and NA segments to the TheiaCoV_Illumina_PE_PHB workflow. Previously, this metric was only available for other pathogens.

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : Yes (when organism is "flu", the assembly_mean_coverage output, which was previously empty, is now populated)

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : Yes (Though not as a result of this specific PR)

:clipboard: Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed: N/A

Databases or database versions changed: N/A

Data processing/commands changed: The workflow now calls the assembly metrics task when HA and/or NA BAM files from IRMA are available for the sample.

File processing changed: The IRMA task now also outputs the HA and NA BAM files.

Compute resources changed: N/A
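The conditional behavior described under "Data processing/commands changed" can be illustrated with a minimal Python sketch. This is not the workflow's WDL or the actual assembly metrics task. It assumes per-position depths are available as tab-separated (reference, position, depth) rows, the three-column layout that `samtools depth` prints, and the names `mean_depth` and `segment_coverage` are hypothetical. A segment with no BAM-derived rows is passed as `None` and skipped.

```python
from statistics import mean
from typing import Dict, Iterable, Optional


def mean_depth(depth_lines: Iterable[str]) -> float:
    """Average the third column of tab-separated (reference, position, depth) rows."""
    depths = [int(line.split("\t")[2]) for line in depth_lines if line.strip()]
    # An empty input yields 0.0 rather than raising (an assumption of this sketch).
    return mean(depths) if depths else 0.0


def segment_coverage(
    segments: Dict[str, Optional[Iterable[str]]]
) -> Dict[str, float]:
    """Compute mean coverage only for segments whose depth rows are available."""
    return {
        name: mean_depth(rows)
        for name, rows in segments.items()
        if rows is not None  # mirrors "call the task only when the BAM exists"
    }
```

For example, calling `segment_coverage({"HA": ha_rows, "NA": None})` returns a dictionary with an `HA` entry only, which corresponds to test sample 4 in reverse (a sample where one of the two segments is unavailable).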

➡️ Inputs

Not changed

⬅️ Outputs

When organism is "flu", the workflow no longer returns an empty assembly_mean_coverage output. It now returns a value of the form `HA:<coverage>, NA:<coverage>`.
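For anyone consuming this output downstream, a string in the `HA:<coverage>, NA:<coverage>` form can be split into per-segment numbers. The sketch below is a hypothetical helper, not part of the workflow. The function name and the sample values in the usage note are illustrative assumptions, and the PR does not state what is emitted for a segment that fails, so the parser simply accepts whichever segments are present.

```python
from typing import Dict


def parse_assembly_mean_coverage(value: str) -> Dict[str, float]:
    """Split 'HA:<coverage>, NA:<coverage>' into {'HA': ..., 'NA': ...}."""
    coverage: Dict[str, float] = {}
    for field in value.split(","):
        field = field.strip()
        if not field:
            continue  # tolerate an empty string or a trailing comma
        segment, _, depth = field.partition(":")
        # float() raises ValueError if the coverage part is missing or non-numeric
        coverage[segment.strip()] = float(depth)
    return coverage
```

With made-up values, `parse_assembly_mean_coverage("HA:152.3, NA:98.7")` gives `{"HA": 152.3, "NA": 98.7}`, and a string containing only one segment gives a one-entry dictionary.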

:test_tube: Testing

Test Dataset

Four samples were used:

  1. Influenza B Yamagata
  2. Influenza A H3N2
  3. Influenza A H1N1
  4. Influenza A H1N1 with only NA subtyped, i.e. expecting HA to fail

Commandline Testing with MiniWDL or Cromwell (optional)

The IRMA task was tested locally and works as expected.

Terra Testing

Testing was done here: https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/Global_tree_testing/job_history/7189276f-f2b7-48be-b64d-3426353d087e

Suggested Scenarios for Reviewer to Test

Subtypes other than those listed here.

Theiagen Version Release Testing (optional)

:microscope: Final Developer Checklist

🎯 Reviewer Checklist

🗂️ Associated Documentation (to be completed by Theiagen developer)