Closed by maesaar 7 years ago
It's certainly strongly discouraged! I would say they are missing SNPs/data from the alignment that they used to build their tree. The real question is whether these missing SNPs actually make any real difference to the final result.
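One way to get a feel for how many SNPs are lost is to count the variable columns in the full alignment and in the reduced one. The snippet below is only a minimal sketch: the function name `count_snp_sites` and the toy sequences are made up for illustration, and it treats only unambiguous A/C/G/T differences as SNPs.

```python
def count_snp_sites(sequences):
    """Count alignment columns containing more than one distinct A/C/G/T base."""
    valid = set("ACGT")
    count = 0
    for column in zip(*sequences):
        # Ignore gaps and ambiguity codes; only compare unambiguous bases.
        bases = {base.upper() for base in column} & valid
        if len(bases) > 1:
            count += 1
    return count


# Toy data (hypothetical): three aligned sequences of equal length.
whole_alignment = ["ACGTACGTAC", "ACGTTCGTAC", "ACCTACGTAA"]
# Pretend the reduced alignment only kept the first six columns.
reduced_alignment = [seq[:6] for seq in whole_alignment]

lost = count_snp_sites(whole_alignment) - count_snp_sites(reduced_alignment)
print("SNP sites lost:", lost)
```

Counting the lost sites does not by itself say whether the tree changes, but it gives a rough idea of how much signal was dropped.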
So it depends on the lost SNPs - thanks.
Hi Andrew,
I had a question about those comments. Do you think it would be suitable to do the following steps to overcome these issues with a Roary-based alignment:
1. Run Gubbins on each core-gene alignment independently (i.e. if 2000 core genes ==> 2000 independent alignments ==> 2000 Gubbins runs)
Thanks!
Guilhem
Hi Guilhem, sorry, I'm afraid that's not going to work. Gubbins needs to use the whole genome to detect regions of increased SNP density and doesn't work on a small scale (like the gene level). In a pan-genome context, recombination will probably be represented as different clusters in the accessory genome rather than being in the core. Regards, Andrew
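To illustrate the point about SNP density, here is a toy sketch that counts SNPs in fixed windows along a genome. This is not how Gubbins itself is implemented; the function name `window_snp_counts`, the window size, and the SNP positions are all assumptions made up for the example. It only shows that a dense window stands out when compared against the rest of the genome, and that this background is missing when a single gene is looked at in isolation.

```python
def window_snp_counts(snp_positions, genome_length, window, step):
    """Return (window_start, snp_count) for windows along the genome."""
    counts = []
    last_start = max(genome_length - window, 0)
    for start in range(0, last_start + 1, step):
        end = start + window
        # Count SNPs whose position falls in the half-open window [start, end).
        n = sum(start <= pos < end for pos in snp_positions)
        counts.append((start, n))
    return counts


# Hypothetical SNP positions on a 100 bp "genome": sparse, apart from a cluster near 50.
positions = [5, 50, 51, 53, 55, 58, 95]
counts = window_snp_counts(positions, genome_length=100, window=20, step=20)

for start, n in counts:
    print(f"{start:>3}-{start + 20:<3} {n} SNPs")
```

Here the window starting at 40 holds 5 SNPs while the others hold 0 or 1, so it looks unusual only relative to the rest. A single 20 bp "gene" taken on its own would offer nothing to compare against.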
That's also what I was afraid of, but I was not sure. Thank you for your answer and the useful comment on core gene recombination!
I have read the Issues page (issue #267), where it is said that the core genome alignment from Roary is not suitable for Gubbins to detect recombination. But from time to time I find publications where the workflow "Roary -> core alignment (PRANK) -> Gubbins" is used.
For example, a PhD thesis (http://www.bib.fcien.edu.uy/files/etd/biol/uy24-18262.pdf), page 254, sections 6.3.4 and 6.3.5, says the following:
As a second example, the supplementary material (http://aem.asm.org/content/suppl/2016/05/19/AEM.00362-16.DCSupplemental/zam999117195so1.pdf) of a publication (http://aem.asm.org/content/early/2016/04/04/AEM.00362-16.full.pdf+html), page 2, Figure S1, says the following:
Are these publications methodologically sound?