theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

[Snippy_Streamline] enhancements #141

Open michellescribner opened 11 months ago

michellescribner commented 11 months ago

Potential enhancements of Snippy Streamline to be discussed:

sam-baird commented 10 months ago

Hello @michellescribner,

Could you please explain why setting use_gubbins = true with a reference genome that has multiple contigs could be problematic? I have two N. meningitidis reference genomes: one is complete and the other has nine contigs. I would prefer to use the one with multiple contigs because it appears much more closely related to my isolates, and I would still like to mask recombination.


andrewjpage commented 10 months ago

Hi, I wrote Gubbins and also work for Theiagen. The intended input is a multi-FASTA alignment against a chromosome in one piece. You could force a draft genome into this shape; however, the scanning statistic may give inaccurate results around the artificial joins between contigs, so it's best not to do this. A reference that's a bit further away works fine (it also acts as an outgroup) and lets you see recombination much more clearly. Regards, Andrew
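
For anyone who wants to check a reference before deciding on use_gubbins, here is a minimal sketch that counts the sequences in a FASTA file and flags references that are not in one piece. It is not part of Snippy Streamline or Gubbins; the function names `count_contigs` and `gubbins_advisable` are hypothetical, and the toy sequences are made up for illustration. It also assumes that "one piece" means exactly one FASTA record, so a complete genome that ships with plasmid records would be flagged as well and would need a manual look.

```python
import io


def count_contigs(fasta_handle):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in fasta_handle if line.startswith(">"))


def gubbins_advisable(fasta_handle):
    """Return True only when the reference is a single sequence.

    Assumption: a single record stands in for "a chromosome in one piece".
    """
    return count_contigs(fasta_handle) == 1


# Toy in-memory references (hypothetical sequences, for illustration only)
complete = io.StringIO(">chr\nACGTACGT\n")
draft = io.StringIO(">contig1\nACGT\n>contig2\nGGCC\n")

print(gubbins_advisable(complete))  # True
print(gubbins_advisable(draft))     # False
```

With a real file, the same functions can be called on `open("reference.fasta")` in place of the in-memory handles.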