Exposing r1 and r2 mean_q_clean and mean_readlength_clean

This PR closes #439.

🗑️ This dev branch should be deleted after merging to main.

:brain: Aim, Context and Functionality

This PR exposes the mean quality scores for reads 1 and 2 (r1_mean_q_clean and r2_mean_q_clean) and the mean clean read lengths for reads 1 and 2 (r1_mean_readlength_clean and r2_mean_readlength_clean). These outputs were already computed by the TheiaProk Illumina PE workflow but were not exposed on Terra.

For the TheiaProk ONT workflow, it is an open question whether outputs such as nanoplot_r1_mean_q_clean should be renamed to r1_mean_q_clean for consistency with the Illumina PE and SE workflows. The rationale is that the PE and SE outputs are not prefixed with cg_pipeline, the tool that generates these metrics, whereas the ONT outputs are prefixed with nanoplot.

:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made

This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : Yes, new outputs

Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No

:clipboard: Workflow/Task Step Changes

🔄 Data Processing

Docker/software or software versions changed: No

Databases or database versions changed: No

Data processing/commands changed: No

File processing changed: No

Compute resources changed: No

➡️ Inputs

⬅️ Outputs

r1_mean_q_clean, r2_mean_q_clean, r1_mean_readlength_clean, r2_mean_readlength_clean

:test_tube: Testing

Test Dataset

A random set of two V. cholerae samples

Commandline Testing with MiniWDL or Cromwell (optional)

Terra Testing

TheiaProk Illumina PE: https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Otieno_Sandbox/job_history/0ea0da1f-926d-459c-88d7-e90084f86a92

Suggested Scenarios for Reviewer to Test

This change is straightforward and does not need extensive testing. The reviewer may still want to test a scenario in which the clean read screen is expected to fail: cg_pipeline_clean is then not run, and these four outputs should have no values.
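
One way to check the failed-screen scenario is a small script run against a downloaded outputs JSON. The sketch below is illustrative only: the four output names come from this PR, but the function name `populated_outputs`, the `wf.` key prefix, and the layout of the outputs JSON are assumptions and may not match what Terra or Cromwell actually writes.

```python
import json

# The four output names come from this PR; everything else here is illustrative.
EXPOSED_OUTPUTS = (
    "r1_mean_q_clean",
    "r2_mean_q_clean",
    "r1_mean_readlength_clean",
    "r2_mean_readlength_clean",
)


def populated_outputs(outputs):
    """Return the exposed outputs that carry a value, keyed by short name.

    `outputs` is a dict of workflow outputs. Keys may be bare names or
    prefixed with a workflow name (e.g. "<workflow>.r1_mean_q_clean"); the
    prefix is an assumption about how the outputs JSON is laid out.
    """
    found = {}
    for key, value in outputs.items():
        # Drop any "<workflow>." prefix so only the output name is compared.
        short_name = key.rsplit(".", 1)[-1]
        if short_name in EXPOSED_OUTPUTS and value is not None:
            found[short_name] = value
    return found


if __name__ == "__main__":
    # Hypothetical outputs for a sample whose clean read screen failed.
    failed_screen = json.loads(
        '{"wf.read_screen_clean": "FAIL", "wf.r1_mean_q_clean": null}'
    )
    # An empty dict means none of the four outputs were populated.
    print(populated_outputs(failed_screen))
```

For a sample that fails the clean read screen, the function should return an empty dict; any entry it returns names an output that was populated unexpectedly.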

Theiagen Version Release Testing (optional)

:microscope: Final Developer Checklist

[ ] The workflow/task has been tested locally and results, including file contents, are as anticipated
[x] The workflow/task has been tested on Terra and results, including file contents, are as anticipated
[x] The CI/CD has been adjusted and tests are passing (to be completed by Theiagen developer)
[x] Code changes follow the style guide

🎯 Reviewer Checklist

[ ] All impacted workflows/tasks have been tested on Terra with a different dataset than used for development
[ ] All reviewer-suggested scenarios have been tested and any additional
[ ] All changed results have been confirmed to be accurate
[ ] All workflows/tasks impacted by change/s have been tested using a standard validation dataset to ensure no unintended change of functionality
[ ] All code adheres to the style guide
[ ] MD5 sums have been updated
[ ] The PR author has addressed all comments

🗂️ Associated Documentation (to be completed by Theiagen developer)

[x] Relevant documentation on the Public Health Resources "PHB Main" has been updated
[ ] Workflow diagrams have been updated to reflect changes
