Read file names should end with .fastq or .fq, with the optional addition of .gz
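The naming rule above can be sketched as a small shell check. This is only an illustration: `is_valid_read_name` is a hypothetical helper, not something that exists in PHB, and it looks at the file name only, not at the file contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not part of PHB):
# succeeds when a read file name ends with .fastq or .fq,
# optionally followed by .gz.
is_valid_read_name() {
  case "$1" in
    *.fastq | *.fq | *.fastq.gz | *.fq.gz) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_valid_read_name sample_R1.fq.gz && echo ok` prints `ok`, while a name such as `sample_R1.bam` makes the function return a non-zero status.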
Test Dataset
Locally:
1 uncompressed metagenomic HAV sample
On Terra:
Uncompressed SRR19880611
This PR closes #224
🗑️ This dev branch should be deleted after merging to main.
:brain: Aim, Context and Functionality
Most of our workflows in the PHB universe ™️ are meant to be compatible with both compressed and uncompressed FASTQ files. TheiaProk is one example (https://theiagen.notion.site/TheiaProk-Workflow-Series-2c710e386ea74dbc828a910b6fb77fac).
Unfortunately, this is no longer true since #286, as the addition of the Kraken2 module made the workflow incompatible with uncompressed files.
TheiaMeta is another workflow that is currently incompatible with uncompressed FASTQs, because the Kraken2 standalone task does not accept them.
This PR addresses this issue by allowing uncompressed FASTQ files to be processed by Kraken2.
Additionally, the Kraken2 standalone workflows now allow for uncompressed input files.
:hammer_and_wrench: Impacted Workflows/Tasks & Changes Being Made
This will affect the behavior of the workflow(s) even if users don’t change any workflow inputs relative to the last version : No
Running this workflow on different occasions could result in different results, e.g. due to use of a live database, "latest" docker image, or stochastic data processing : No
The Kraken2_standalone task has been adjusted to process uncompressed FASTQ files.
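One way a task could accept both compressed and uncompressed reads is to pass Kraken2's `--gzip-compressed` option only when the input really is gzipped. The sketch below illustrates that idea and is not necessarily how this PR implements it. `is_gzipped` and `kraken2_compression_flag` are hypothetical helper names, and detection relies on the two gzip magic bytes rather than on the file extension.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (assumptions for illustration, not the task's actual code).

# Succeeds when the file starts with the gzip magic bytes 0x1f 0x8b.
is_gzipped() {
  [ "$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')" = "1f8b" ]
}

# Prints --gzip-compressed for gzipped input and nothing otherwise,
# so the result can be spliced into a kraken2 command line.
kraken2_compression_flag() {
  if is_gzipped "$1"; then
    echo "--gzip-compressed"
  fi
}

# Example use (db_dir, read1 and the report/output file names are placeholders):
#   flag=$(kraken2_compression_flag "$read1")
#   kraken2 $flag --db "$db_dir" --report report.txt --output classified.txt "$read1"
```

Checking the file contents instead of the name means a gzipped file with a misleading extension is still handled correctly, at the cost of one extra read of the first two bytes.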
:clipboard: Workflow/Task Step Changes
🔄 Data Processing
Docker/software or software versions changed: N/A
Databases or database versions changed: N/A
Data processing/commands changed: N/A
File processing changed: N/A
Compute resources changed: N/A
➡️ Inputs
Nothing was altered
⬅️ Outputs
Nothing was altered
:test_tube: Testing
Test Dataset
Locally:
On Terra:
Commandline Testing with MiniWDL or Cromwell (optional)
Kraken2 Standalone task:
miniwdl run --task kraken2_standalone /home/ines_mendes/Git/public_health_bioinformatics/tasks/taxon_id/task_kraken2.wdl read1= ~/Test/ont/ERR3772599.fastq kraken2_db="gs://theiagen-public-files-rp/terra/theiaprok-files/k2_viral_20230605.tar.gz" samplename=ERR3772599
TheiaMeta workflow:
miniwdl run /home/ines_mendes/Git/public_health_bioinformatics/workflows/metagenomics/wf_theiameta_illumina_pe.wdl read1= ~/Test/HAV_Metagenomics/HAV0001_S1_L001_R1_001.fastq read2= ~/Test/HAV_Metagenomics/HAV0001_S1_L001_R2_001.fastq samplename=TEST kraken2_db="gs://theiagen-public-files-rp/terra/theiaprok-files/k2_viral_20230605.tar.gz"
Kraken2 standalone workflow:
miniwdl run /home/ines_mendes/Git/public_health_bioinformatics/workflows/standalone_modules/wf_kraken2_pe.wdl samplename="TEST" kraken2_db="gs://theiagen-public-files-rp/terra/theiaprok-files/k2_viral_20230605.tar.gz" read1= /home/ines_mendes/Test/HAV_Metagenomics/HAV0001_S1_L001_R1_001.fastq read2= /home/ines_mendes/Test/HAV_Metagenomics/HAV0001_S1_L001_R2_001.fastq
TheiaProk_illumina_pe workflow: Unable to test locally
Terra Testing
Suggested Scenarios for Reviewer to Test
Theiagen Version Release Testing (optional)
:microscope: Final Developer Checklist
🎯 Reviewer Checklist
🗂️ Associated Documentation (to be completed by Theiagen developer)