theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[TheiaProk_ONT and TheiaCoV_ONT] Expose additional QC metrics from nanoplot for both raw and clean reads #452

Closed: cimendes closed this 1 month ago

cimendes commented 2 months ago

This PR partially closes #355

🗑️ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

This PR exposes a series of read quality metrics, computed by nanoplot on both raw and clean reads, for the TheiaCoV_ONT and TheiaProk_ONT workflows.

Taxon tables have been adjusted to include the new output metrics.

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : Yes/No

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : Yes/No

:clipboard: Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed: None

Databases or database versions changed: None

Data processing/commands changed: None

File processing changed: None

Compute resources changed: None

➡️ Inputs

None changed

⬅️ Outputs

New outputs:

    String? nanoplot_docker
    File? nanoplot_html_raw
    File? nanoplot_tsv_raw
    Int? nanoplot_num_reads_raw1
    Float? nanoplot_r1_median_readlength_raw (new)
    Float? nanoplot_r1_mean_readlength_raw
    Float? nanoplot_r1_stdev_readlength_raw (new)
    Float? nanoplot_r1_n50_raw (new)
    Float? nanoplot_r1_mean_q_raw
    Float? nanoplot_r1_median_q_raw (new)
    Float? nanoplot_r1_est_coverage_raw (new)
    File? nanoplot_html_clean
    File? nanoplot_tsv_clean
    Int? nanoplot_num_reads_clean1
    Float? nanoplot_r1_median_readlength_clean (new)
    Float? nanoplot_r1_mean_readlength_clean
    Float? nanoplot_r1_stdev_readlength_clean (new)
    Float? nanoplot_r1_n50_clean (new)
    Float? nanoplot_r1_mean_q_clean
    Float? nanoplot_r1_median_q_clean (new)
    Float? nanoplot_r1_est_coverage_clean (new)
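
For reviewers who want to sanity-check these values outside the workflow, here is a minimal Python sketch of how the metrics could be read from a nanoplot stats TSV. It is not the code in this PR. The row labels (`number_of_reads`, `median_read_length`, and so on), the helper name `parse_nanostats`, and the coverage formula (total bases divided by an estimated genome length) are assumptions that should be checked against a real `nanoplot_tsv_raw` file and the task itself.

```python
import csv
import io

# ASSUMPTION: row labels as they might appear in a nanoplot stats TSV.
# Each label maps to (output-name template, cast function).
ASSUMED_ROWS = {
    "number_of_reads": ("nanoplot_num_reads_{s}1", lambda v: int(float(v))),
    "median_read_length": ("nanoplot_r1_median_readlength_{s}", float),
    "mean_read_length": ("nanoplot_r1_mean_readlength_{s}", float),
    "read_length_stdev": ("nanoplot_r1_stdev_readlength_{s}", float),
    "n50": ("nanoplot_r1_n50_{s}", float),
    "mean_qual": ("nanoplot_r1_mean_q_{s}", float),
    "median_qual": ("nanoplot_r1_median_q_{s}", float),
}


def parse_nanostats(tsv_text, suffix, est_genome_length=None):
    """Hypothetical helper: map a two-column stats TSV to the output names.

    suffix is "raw" or "clean". Missing rows yield None, mirroring the
    optional (?) types of the workflow outputs.
    """
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    stats = {row[0]: row[1] for row in rows if len(row) >= 2}

    metrics = {}
    for label, (template, cast) in ASSUMED_ROWS.items():
        value = stats.get(label)
        metrics[template.format(s=suffix)] = cast(value) if value is not None else None

    # ASSUMPTION: estimated coverage = total bases / estimated genome length.
    coverage = None
    if est_genome_length and "number_of_bases" in stats:
        coverage = float(stats["number_of_bases"]) / est_genome_length
    metrics["nanoplot_r1_est_coverage_" + suffix] = coverage
    return metrics
```

Calling `parse_nanostats(text, "raw", est_genome_length=...)` and `parse_nanostats(text, "clean", ...)` on the two TSV outputs gives one dictionary per read set, keyed by the output names listed above.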

:test_tube: Testing

Test Dataset

Commandline Testing with MiniWDL or Cromwell (optional)

Terra Testing

Suggested Scenarios for Reviewer to Test

Theiagen Version Release Testing (optional)

:microscope: Final Developer Checklist

🎯 Reviewer Checklist

🗂️ Associated Documentation (to be completed by Theiagen developer)

sage-wright commented 2 months ago

Testing TheiaProk here and TheiaCoV here

sage-wright commented 2 months ago

Outputs are produced as expected! Waiting to approve until we confirm that we actually want all of these outputs exposed.