theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

HOSTILE: For TheiaCoV_Illumina_PE_PHB and TheiaCoV_ONT_PHB #256

Closed jrotieno closed 9 months ago

jrotieno commented 9 months ago

NOTE: Merge https://github.com/theiagen/public_health_bioinformatics/pull/202 before merging this branch!!

:hammer_and_wrench: Changes Being Made

Adding a new hostile task as the human read removal (dehosting) tool for TheiaCoV_ONT_PHB and as an optional dehosting tool for TheiaCoV_Illumina_PE_PHB.

Impacted Workflows/Tasks

task_hostile, wf_read_QC_trim_pe, TheiaCoV_Illumina_PE_PHB, wf_read_QC_trim_ont, TheiaCoV_ONT_PHB

:brain: Context and Rationale

The current dehosting tool, ncbi-scrub, only works with Illumina paired-end reads, whereas hostile works with both Illumina and Oxford Nanopore (ONT) reads. We are therefore implementing hostile to dehost reads in the TheiaCoV_ONT_PHB workflow.

For TheiaCoV_Illumina_PE_PHB, we are giving users the option to use hostile instead of ncbi-scrub if they prefer.

:clipboard: Workflow/Task Steps

The hostile task takes the variable seq_method as input to determine whether it is working with Illumina or ONT reads. By default, seq_method is set to ILLUMINA in the TheiaCoV_Illumina_PE_PHB workflow and to OXFORD_NANOPORE in the TheiaCoV_ONT_PHB workflow.
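The branching on seq_method can be sketched as follows. This is a minimal Python illustration, not the task's actual WDL: the helper name `build_hostile_command` is hypothetical, the mapping of ILLUMINA to bowtie2 and OXFORD_NANOPORE to minimap2 is an assumption based on hostile's short-read and long-read modes, and the `hostile clean` flags shown (`--fastq1`, `--fastq2`, `--aligner`) are assumed from hostile's command-line help and may differ between versions.

```python
from typing import List, Optional


def build_hostile_command(
    seq_method: str, read1: str, read2: Optional[str] = None
) -> List[str]:
    """Hypothetical helper: assemble a hostile invocation from seq_method."""
    if seq_method == "ILLUMINA":
        # Assumption: the Illumina path is paired-end only, matching
        # TheiaCoV_Illumina_PE_PHB, and uses the short-read aligner.
        if read2 is None:
            raise ValueError("ILLUMINA in this sketch expects paired-end reads")
        return [
            "hostile", "clean",
            "--fastq1", read1,
            "--fastq2", read2,
            "--aligner", "bowtie2",
        ]
    if seq_method == "OXFORD_NANOPORE":
        # Assumption: ONT reads are single-file and use the long-read aligner.
        return ["hostile", "clean", "--fastq1", read1, "--aligner", "minimap2"]
    # Any other value is rejected rather than silently guessed.
    raise ValueError(f"Unsupported seq_method: {seq_method}")
```

Rejecting unknown seq_method values is a choice made for this sketch; the real task may handle them differently.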

Inputs

TheiaCoV_ONT_PHB: Inputs remain the same

TheiaCoV_Illumina_PE_PHB: Added an optional input dehosting_tool, which accepts ncbi_scrub (default) or hostile
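How the optional dehosting_tool input might resolve to a tool can be sketched in Python. The function name `resolve_dehosting_tool` is hypothetical and this is not the workflow's WDL; falling back to ncbi_scrub when the input is blank follows the PR description, while raising an error on any other value is an assumption of this sketch.

```python
from typing import Optional

# The two options named in this PR.
VALID_DEHOSTING_TOOLS = ("ncbi_scrub", "hostile")


def resolve_dehosting_tool(dehosting_tool: Optional[str] = None) -> str:
    """Hypothetical helper: return the dehosting tool to run."""
    # A blank or missing input falls back to the default, ncbi_scrub.
    if dehosting_tool is None or dehosting_tool == "":
        return "ncbi_scrub"
    # Assumption: unrecognized values are rejected explicitly.
    if dehosting_tool not in VALID_DEHOSTING_TOOLS:
        raise ValueError(
            f"dehosting_tool must be one of {VALID_DEHOSTING_TOOLS}, "
            f"got {dehosting_tool!r}"
        )
    return dehosting_tool
```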

Outputs

Remain the same

Impacted Outputs

:test_tube: Testing

Locally

Works as expected locally for the TheiaCoV_ONT_PHB and TheiaCoV_Illumina_PE_PHB workflows

Terra

Works as expected for TheiaCoV_ONT_PHB and TheiaCoV_Illumina_PE_PHB:

TheiaCoV_ONT_PHB (SARS-CoV-2): https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/Global_tree_testing/job_history/f84b2815-6ef5-4ba2-9da6-ba27e49b0a9f

TheiaCoV_Illumina_PE_PHB with dehosting_tool set to hostile (influenza): https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/Global_tree_testing/job_history/d731282a-07c6-4aca-8015-f15c5c7dd2c8

TheiaCoV_Illumina_PE_PHB with dehosting_tool left blank or set to the default ncbi_scrub (influenza): https://app.terra.bio/#workspaces/cdph-terrabio-taborda-manual/Global_tree_testing/job_history/8c8e1e82-cf14-4ad2-a872-011d03248bc2

Scenarios for Reviewer to Test

Test with pathogens other than SARS-CoV-2 and influenza, and with samples rich in human reads.

:microscope: Quality checks

Pull Request (PR) checklist: