sanger-pathogens / Roary

Rapid large-scale prokaryote pan genome analysis
http://sanger-pathogens.github.io/Roary

Does Gubbins work with core_gene_alignment.aln? #445

Open liaochenlanruo opened 5 years ago

liaochenlanruo commented 5 years ago

The Roary web page (https://sanger-pathogens.github.io/Roary/) says that the output of Roary cannot be used as the input to Gubbins. Does that mean the "core_gene_alignment.aln" file cannot be used as the input to Gubbins?

I would also like to know whether the output of snp-sites can be used as the input to Gubbins.

Thanks a lot!

pneumowidow commented 5 years ago

From what I understand, Gubbins requires a whole-genome alignment because it uses a sliding-window approach, relative to the length of the whole genome, to estimate the rates of recombination and mutation. So if you give it only a subset of the genome (i.e., the core genome), it might make false-positive inferences of recombination. The developers have answered a similar question at https://github.com/sanger-pathogens/gubbins/issues/169 if you want to take a look.
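To make that concern concrete, here is a toy sketch in Python. It is not Gubbins's actual algorithm, just a plain sliding-window SNP count, and the genome length, gene intervals, and SNP positions are all made up for illustration. It shows how concatenating core genes can pull SNPs that are far apart in the genome into one dense-looking window.

```python
from typing import List, Tuple


def window_counts(sites: List[int], length: int, window: int, step: int) -> List[int]:
    """Count variable sites in each window [start, start + window)."""
    counts = []
    for start in range(0, max(length - window, 0) + 1, step):
        counts.append(sum(1 for s in sites if start <= s < start + window))
    return counts


def to_concatenated(sites: List[int], genes: List[Tuple[int, int]]) -> List[int]:
    """Map genome coordinates onto a concatenation of half-open gene intervals.

    Sites that fall outside every gene are dropped, as they would be in a
    core-gene alignment.
    """
    mapped = []
    offset = 0
    for start, end in genes:
        mapped.extend(offset + (s - start) for s in sites if start <= s < end)
        offset += end - start
    return mapped


# Hypothetical 100 bp genome with two core genes separated by 30 bp.
genome_length = 100
core_genes = [(0, 20), (50, 70)]
snps = [17, 19, 51, 53]  # two SNPs near the end of gene 1, two near the start of gene 2

genome_counts = window_counts(snps, genome_length, window=10, step=5)

core_sites = to_concatenated(snps, core_genes)
core_length = sum(end - start for start, end in core_genes)
core_counts = window_counts(core_sites, core_length, window=10, step=5)

print("max SNPs per window, genome coordinates:", max(genome_counts))
print("max SNPs per window, concatenated core: ", max(core_counts))
```

In genome coordinates no 10 bp window holds more than two of these SNPs, but in the concatenated coordinates all four land in a single window, which is the kind of artificial clustering that could be mistaken for a recombination signal.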

Recommendation: you could use Torsten Seemann's snippy (https://github.com/tseemann/snippy) to find high-quality core SNPs, then run snippy-clean_full_aln to get the full whole-genome alignment, and use that as your input for Gubbins.
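A minimal sketch of the last two steps, written as a small Python wrapper. The file names (core.full.aln, clean.full.aln) and the "gubbins" prefix are assumptions based on the usage shown in the snippy README, and the exact flags may differ between versions, so check each tool's help output before relying on this.

```python
import subprocess
from pathlib import Path
from typing import List


def clean_command(full_aln: Path) -> List[str]:
    """Command that cleans a snippy full alignment; the result goes to stdout."""
    return ["snippy-clean_full_aln", str(full_aln)]


def gubbins_command(cleaned_aln: Path, prefix: str) -> List[str]:
    """Command that runs Gubbins on the cleaned alignment (-p sets the output prefix)."""
    return ["run_gubbins.py", "-p", prefix, str(cleaned_aln)]


def run_pipeline(full_aln: Path, cleaned_aln: Path, prefix: str) -> None:
    """Run both steps; requires snippy and Gubbins to be installed and on PATH."""
    with cleaned_aln.open("w") as out:
        subprocess.run(clean_command(full_aln), stdout=out, check=True)
    subprocess.run(gubbins_command(cleaned_aln, prefix), check=True)


# Example call (assumed file names, not run here):
# run_pipeline(Path("core.full.aln"), Path("clean.full.aln"), "gubbins")
```

Building the commands in separate functions keeps them easy to inspect or print before anything is executed.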