Closed cimendes closed 1 year ago
FYI let's wait to fix the CI until we've made all the changes we discussed to the SRST2 task & workflows
File changes look good! I ran things in a sandbox with the updated defaults as well. Functionally everything is looking great. Well done, all!
Motivation
An Abricate database of target genes for Vibrio characterization was constructed, with the corresponding PR being open at https://github.com/StaPH-B/docker-builds/pull/618.
This docker image includes a Vibrio cholerae-specific database of gene targets (traditionally used in PCR methods) for detecting O1 & O139 serotypes, toxin-production markers, and Biotype markers within the O1 serogroup ("El Tor" or "Classical" biotypes). These sequences were shared via personal communication with Dr. Christine Lee, of the National Listeria, Yersinia, Vibrio and Enterobacterales Reference Laboratory within the Enteric Diseases Laboratory Branch at CDC.
The genes included in the database (and their purpose) are as follows:
- ctxA - Cholera toxin, an indication of toxigenic V. cholerae
- ompW - Outer membrane protein, a V. cholerae species marker (allele distinguishes V. cholerae from V. parahaemolyticus and V. vulnificus)
- tcpA - Toxin co-regulated pilus A, used to infer Biotype, either "El Tor" or "Classical" (tcpA_classical and tcpA_ElTor)
- toxR - Transcriptional activator (controls cholera toxin, pilus, and outer-membrane protein expression); species marker (allele distinguishes V. cholerae from V. parahaemolyticus and V. vulnificus)
- wbeN - O antigen encoding region, used to identify the O1 serogroup
- wbfR - O antigen encoding region, used to identify the O139 serogroup

Until further testing, the current container included in the workflow is quay.io/kapsakcj/srst2:0.2.0-vcholerae
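To make the purpose of each target concrete, here is a minimal Python sketch of how a set of detected gene names could be translated into the calls described in the gene list above. It is illustrative only: the function name `interpret_vibrio_hits` and the output keys are hypothetical and are not taken from the task in this PR, and the input is assumed to be a plain list of gene names already parsed from the srst2 results.

```python
from typing import Dict, Iterable


def interpret_vibrio_hits(detected_genes: Iterable[str]) -> Dict[str, str]:
    """Map detected gene names to the interpretation given in the PR description.

    Hypothetical helper for illustration; not part of task_srst2_vibrio.wdl.
    """
    genes = set(detected_genes)

    # wbeN marks the O1 serogroup, wbfR marks the O139 serogroup.
    serogroups = []
    if "wbeN" in genes:
        serogroups.append("O1")
    if "wbfR" in genes:
        serogroups.append("O139")

    # tcpA alleles are used to infer the biotype (relevant within O1).
    biotypes = []
    if "tcpA_classical" in genes:
        biotypes.append("Classical")
    if "tcpA_ElTor" in genes:
        biotypes.append("El Tor")

    # ompW and toxR are the V. cholerae species markers.
    species_markers = sorted(genes & {"ompW", "toxR"})

    return {
        "species_markers": ",".join(species_markers) or "not detected",
        # ctxA indicates a toxigenic strain.
        "toxigenic": "yes" if "ctxA" in genes else "not detected",
        "serogroup": ",".join(serogroups) or "not detected",
        "biotype": ",".join(biotypes) or "not detected",
    }


if __name__ == "__main__":
    example = ["ctxA", "ompW", "toxR", "wbeN", "tcpA_ElTor"]
    for key, value in interpret_vibrio_hits(example).items():
        print(f"{key}: {value}")
```

Reporting "not detected" rather than a negative call keeps the sketch from over-interpreting an absent hit, which could also reflect low coverage.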
A new task, task_srst2_vibrio.wdl, was added that runs srst2 with the custom Vibrio database and reports the resulting hits on the gene sequences. The task is called in merlin_magic_workflow.wdl for any sample identified as belonging to the genus Vibrio. This has been implemented in both TheiaProk_Illumina_PE and TheiaProk_Illumina_SE.
The following outputs are retrieved:
Testing
The workflow has been tested on 152 V. cholerae sequence runs on Terra using TheiaProk_Illumina_PE.
TheiaProk_Illumina_SE has been tested locally with sample SRR7062492, as importing the workflow with the correct branch was not possible in Terra (@kapsakcj have you seen this issue before?)