theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[TheiaProk Workflows] Add Kraken2 as optional module #286

Closed: cimendes closed this 9 months ago

cimendes commented 10 months ago

Closes #279

:hammer_and_wrench: Changes Being Made

This PR adds Kraken2 as an optional module in the TheiaProk suite of workflows (Illumina PE and SE). Once PR #240 is merged, Kraken2 can also be added as an optional module for TheiaProk_ONT.

Impacted Workflows/Tasks

:brain: Context and Rationale

Requested to facilitate contamination detection. A database must be provided to the module.

:clipboard: Workflow/Task Steps

Kraken2 was integrated into the Trim SE and Trim PE sub-workflows.

Inputs

New optional inputs:

Boolean call_kraken = false
File? kraken_db

Outputs

New optional outputs for TheiaProk:

String? kraken2_version = read_QC_trim.kraken_version
File? kraken2_report = read_QC_trim.kraken_report
String? kraken2_docker = read_QC_trim.kraken_docker
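
To illustrate how the new inputs and optional outputs could fit together, here is a minimal WDL sketch, not the code merged in this PR. The task `kraken2_sketch`, the workflow `read_qc_sketch`, the `kraken2_image` input, and the report file name are hypothetical names chosen for illustration. The sketch also assumes the database is a gzipped tarball with its `.k2d` files at the top level of the archive.

```wdl
version 1.0

# Hypothetical stand-in task: its name, inputs, and command are illustrative
# assumptions, not the task added in this PR.
task kraken2_sketch {
  input {
    File read1
    File? read2
    File kraken_db      # gzipped tarball of a Kraken2 database (assumed layout: .k2d files at top level)
    String samplename
    String docker       # container image that provides kraken2; no default is assumed here
  }
  command <<<
    set -e
    # Record the tool version (first line of `kraken2 --version`)
    kraken2 --version | head -n 1 > VERSION

    # Unpack the database tarball into a local directory
    mkdir db
    tar -C db -xzf ~{kraken_db}

    # Classify reads; `--output -` suppresses per-read output so only the report is kept
    kraken2 \
      --db db \
      ~{if defined(read2) then "--paired" else ""} \
      --report ~{samplename}_kraken2_report.txt \
      --output - \
      ~{read1} ~{read2}
  >>>
  output {
    String version = read_string("VERSION")
    File report = "~{samplename}_kraken2_report.txt"
    String docker_image = docker
  }
  runtime {
    docker: docker
  }
}

# Hypothetical sub-workflow showing how the optional call could be gated.
workflow read_qc_sketch {
  input {
    File read1
    File? read2
    String samplename
    String kraken2_image
    Boolean call_kraken = false
    File? kraken_db
  }
  # Run Kraken2 only when requested and a database was supplied
  if (call_kraken && defined(kraken_db)) {
    call kraken2_sketch {
      input:
        read1 = read1,
        read2 = read2,
        kraken_db = select_first([kraken_db]),
        samplename = samplename,
        docker = kraken2_image
    }
  }
  output {
    # Values produced inside an `if` block are optional, hence the `?` types
    String? kraken_version = kraken2_sketch.version
    File? kraken_report = kraken2_sketch.report
    String? kraken_docker = kraken2_sketch.docker_image
  }
}
```

In this sketch, setting `call_kraken = true` without a `kraken_db` skips the call silently and leaves all three outputs undefined. A real implementation might prefer to fail loudly in that case, since the module requires a database.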

Impacted Outputs

:test_tube: Testing

Underway

Locally

Not tested locally

Terra

Kraken2 Database used: gs://theiagen-public-files-rp/terra/theiaprok-files/k2_standard_08gb_20230605.tar.gz

Scenarios for Reviewer to Test

Check for new functionality:

Check that no conflicts were created:

Pull Request (PR) checklist:

michellescribner commented 9 months ago

Launched TheiaProk tests:
PE: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/e669dc0d-d130-48f3-868c-3907c1ee431a
SE: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/6cfa9dcf-6189-42ad-be32-6335a1795cdb

Launched TheiaCoV function tests:
TheiaCoV_Illumina_PE_PHB: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/7749f154-531d-43d9-ae11-3ae8275dc6ed
TheiaCoV_Illumina_SE_PHB: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/6676a91d-d806-4e15-9c53-36500e86e0ed

Also gave TheiaProk_ONT_PHB a function test: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/8669d358-ad9d-48fc-912f-ee67b23a9488

kevinlibuit commented 9 months ago

This seems to be an issue with a required input rather than with the changes made in this PR. Relaunching some of @michellescribner's tests with inputs defined to verify:

michellescribner commented 9 months ago

At the last minute, I also launched a quick function test for TheiaProk without enabling the Kraken2 module: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/2f7adb95-9df7-4ca6-9ffe-ee0efbd04711

cimendes commented 9 months ago

The documentation has been updated to reflect this merge!