theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[TheiaProk] Add emmtyper tool to GAS TheiaProk track #413

Closed cimendes closed 4 days ago

cimendes commented 2 months ago

:cool:

:pushpin: Explain the Request

https://github.com/MDU-PHL/emmtyper

emmtyper is BLAST-based and runs on assemblies.


sam-baird commented 1 week ago

Hi, I'd like to work on a pull request for this if no one is already working on it. When running a 32-sample validation dataset through the GAS_identification workflow, I noticed a couple of discrepancies between the emm-typing-tool results and the emmtyper results, so I think it would be nice if TheiaProk ran both tools. My plan is to take the emmtyper task from GAS_identification, add it to tasks/species_typing/streptococcus/, and then call the task in wf_merlin_magic.wdl immediately below if (merlin_tag == "Streptococcus pyogenes").
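
For illustration, here is a minimal sketch of how a comparison between the two tools' calls could be scripted. It assumes the per-sample emm types from each tool have already been collected into dicts keyed by sample name. The function name, sample names, and emm values are hypothetical placeholders, not the actual validation data, and any normalization of formatting differences between the tools is left out.

```python
def find_discrepancies(calls_a, calls_b):
    """Return {sample: (call_a, call_b)} for samples where the two calls differ.

    A sample missing from one dict is reported with None for that tool.
    """
    discrepancies = {}
    for sample in sorted(set(calls_a) | set(calls_b)):
        a = calls_a.get(sample)
        b = calls_b.get(sample)
        if a != b:
            discrepancies[sample] = (a, b)
    return discrepancies


# Placeholder values for illustration only (not real results).
emm_typing_tool_calls = {"sample_01": "emm1", "sample_02": "emm12", "sample_03": "emm89"}
emmtyper_calls = {"sample_01": "emm1", "sample_02": "emm28", "sample_03": "emm89"}

for sample, (a, b) in find_discrepancies(emm_typing_tool_calls, emmtyper_calls).items():
    print(f"{sample}: emm-typing-tool={a} emmtyper={b}")
```

Reporting both calls side by side, rather than picking a winner, matches the idea of running both tools and leaving the interpretation to the user.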

kapsakcj commented 1 week ago

We would ❤️ that. No one is actively working on this.

BTW, we already have a WDL task file at tasks/species_typing/streptococcus/task_emmtyper.wdl; it's just not used in any workflows.

Though I'll note that Neranjan's WDL does have some additional code to parse out the final emm type, so it's probably best to use his copy as a starting point.

sam-baird commented 1 week ago

Sounds good, thanks!