Open liaochenlanruo opened 5 years ago
From what I understand, Gubbins requires a whole-genome alignment because it uses a sliding-window approach, scaled to the length of the whole genome, to estimate the rates of recombination and mutation. If you give it only a subset of the genome (e.g., the core genome), it might make false-positive inferences of recombination. The developers have answered a similar question here [https://github.com/sanger-pathogens/gubbins/issues/169] if you want to take a look.
Recommendation: You could use Torsten Seemann's Snippy (https://github.com/tseemann/snippy) to find high-quality core SNPs, then use snippy-clean_full_aln to get the full WGS alignment, and afterwards use that as your input for Gubbins.
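To make that recommendation concrete, here is a rough sketch of the command sequence, written as a small Python helper that only prints the commands and does not run them. It assumes Snippy has already been run once per sample, and the names ref.fa, S1, S2, S3, and the output prefixes are placeholders I made up, not anything from your data. The commands follow the usage shown in the Snippy README, so please check them against the versions you have installed.

```python
import shlex
from typing import List, Optional, Tuple

# One step = (argv, file that stdout is redirected to, or None).
Step = Tuple[List[str], Optional[str]]


def build_pipeline(
    reference: str,
    snippy_dirs: List[str],
    prefix: str = "core",
    gubbins_prefix: str = "gubbins",
) -> List[Step]:
    """Build the Snippy -> Gubbins command sequence (nothing is executed)."""
    full_aln = f"{prefix}.full.aln"  # whole-genome alignment written by snippy-core
    clean_aln = "clean.full.aln"     # hypothetical name for the cleaned alignment
    return [
        # Combine the per-sample Snippy output folders into one alignment.
        (["snippy-core", "--ref", reference, "--prefix", prefix, *snippy_dirs], None),
        # Clean the full alignment; the script writes to stdout.
        (["snippy-clean_full_aln", full_aln], clean_aln),
        # Run Gubbins on the cleaned whole-genome alignment.
        (["run_gubbins.py", "-p", gubbins_prefix, clean_aln], None),
    ]


def as_shell(steps: List[Step]) -> List[str]:
    """Render each step as a shell command line, adding '>' redirects."""
    lines = []
    for argv, stdout_path in steps:
        line = shlex.join(argv)
        if stdout_path is not None:
            line += " > " + shlex.quote(stdout_path)
        lines.append(line)
    return lines


if __name__ == "__main__":
    for line in as_shell(build_pipeline("ref.fa", ["S1", "S2", "S3"])):
        print(line)
```

The point to notice is that Gubbins is given the cleaned full alignment (core.full.aln after cleaning), not the SNP-only core alignment.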
The Roary web page (https://sanger-pathogens.github.io/Roary/) says that the output of Roary cannot be used as the input to Gubbins. Does that mean the "core_gene_alignment.aln" file cannot be used as the input to Gubbins?
I would also like to know whether the output of snp-sites can be used as the input to Gubbins.
Thanks a lot!