theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[TheiaCoV docs] Missing section on optional qc_check task #572

Open kapsakcj opened 1 month ago

kapsakcj commented 1 month ago

:cool:

:pushpin: Explain the Request

The TheiaCoV workflow documentation page is missing a section on the optional qc_check task. Here's the v2.1.0 TheiaCoV page: https://theiagen.notion.site/TheiaCoV-Workflow-Series-47e3269a932343be982507d72fcb0fbe

The section exists on the TheiaProk docs page, but not on the TheiaCoV docs page.

The qc_check task is implemented in TheiaCoV ONT, ClearLabs, ILMN PE, ILMN SE, and FASTA.

It would be good to have a section describing the task, and also to host a template TSV with QC parameters specific to different organisms (sars-cov-2, Flu, Mpox, etc.) that users can use with TheiaCoV.

kapsakcj commented 1 month ago

Here's an example qc_check input TSV that I used successfully with sars-cov-2 samples and the TheiaCoV_Illumina_PE workflow. The QC values are pretty lax and could be tightened.

qc_check_table_theiacov_illumina_pe_template.tsv.txt

kapsakcj commented 1 month ago

One other important note: the value in the first column, taxon, must match the syntax/spelling used for the theiacov organism input parameter.

So for example, when setting up the qc_check table for West Nile Virus, the user should use "WNV" both in the first column of the qc_check table and for the organism input parameter.
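A quick way to catch a mismatch before launching a run is to compare the organism value against the first column of the table. The following is only a minimal sketch, not part of the workflow: the helper names are made up for illustration, the `min_example_metric` column and its values are placeholders rather than real qc_check parameters, and it assumes an exact, case-sensitive match is required, which may be stricter than what the task actually does.

```python
import csv
import io


def taxa_in_qc_table(tsv_text: str) -> list[str]:
    """Return the first-column (taxon) values of a qc_check TSV, skipping the header row."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row
    return [row[0] for row in reader if row and row[0].strip()]


def organism_has_qc_row(tsv_text: str, organism: str) -> bool:
    """Check for an exact, case-sensitive match (an assumption, see the note above)."""
    return organism in taxa_in_qc_table(tsv_text)


# Hypothetical table: only the "taxon" column name comes from this thread;
# "min_example_metric" and its values are placeholders.
example_tsv = (
    "taxon\tmin_example_metric\n"
    "WNV\t10\n"
    "sars-cov-2\t20\n"
)

print(organism_has_qc_row(example_tsv, "WNV"))  # organism spelled as in the table
print(organism_has_qc_row(example_tsv, "wnv"))  # different casing, treated as a mismatch here
```

To check a real file, read its contents with `open(path).read()` and pass the text to `organism_has_qc_row` together with the value you plan to use for the organism input.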

jrotieno commented 3 weeks ago

Docs updated: https://www.notion.so/theiagen/TheiaCoV-Workflow-Series-e62e74b44bd048dba3ae49115a9998ba?pvs=4#d3f73e5c124a40da935112e8cbd23845

Terra test with various TheiaCoV organism inputs and the template qc_check table here: https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Otieno_Sandbox/job_history/19ffbacf-90f8-4069-8ffc-ad744d03b94d

Some samples are expected to fail the read screen; others are expected to run to the end.

jrotieno commented 3 weeks ago

Holding onto the issue while the docs get migrated to GitHub.