xavierdidelot / ClonalFrameML

ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes
GNU General Public License v3.0

R script plot colours are not explained #141

Closed by sam-spence 1 year ago

sam-spence commented 1 year ago

The user guide and paper do not describe the colours used in the PDF plot produced by the R script. From the paper, it seems that white means substitutions, dark blue means recombination, and grey means non-core regions. Is that right? I haven't been able to find a description anywhere of what sky blue, orange, and yellow mean.

Edit: I can see that this was asked several times in the closed issues but was never added to the manual. However, it's still not clear to me. For example, if sky blue means there are no substitutions, then what does it mean when one genome has sky blue at site X and another genome has yellow at the same site? It would seem that the first genome can't have 'no substitutions' if there is a SNP there. Or does sky blue mean that there was no change on that leaf since the previous ancestral node when walking back up the tree?

xavierdidelot commented 1 year ago

Yes, sky blue means there is no substitution for a site on a given branch. There might still be substitutions for the same site on other branches, in which case the whole column will not be sky blue.

sam-spence commented 1 year ago

Thanks a lot for your reply. So is a 'substitution' defined relative to the parent node, one step up the tree? For example, if a site in the ancestral node is a C, and this node branches into genomes X and Y, where X also has a C at that site but Y has a T, then the site will be sky blue on genome X and another colour (white, yellow, or red) on Y? And looking further back, the site on that ancestral node's branch would only be sky blue if it matched the previous node up the tree?

xavierdidelot commented 1 year ago

Yes, that's exactly it!
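
For later readers, the branch-wise rule confirmed above can be sketched in a few lines of Python. This is only an illustration of the comparison between each node and its parent, not the code of the ClonalFrameML R script: the tree, the sequences, and the names `parent`, `seqs`, and `branch_changes` are all hypothetical, and the sketch does not attempt to reproduce how the plot chooses between white, yellow, and red, or how it marks recombination.

```python
# Hypothetical toy tree, stored as child -> parent.
# "A" is the ancestral node of leaves "X" and "Y".
parent = {"X": "A", "Y": "A", "A": "root"}

# Hypothetical sequences (observed for leaves, reconstructed for internal nodes).
seqs = {
    "root": "ACGT",
    "A":    "ACCT",  # differs from root at site index 2 (G -> C)
    "X":    "ACCT",  # identical to its parent A
    "Y":    "ATCT",  # differs from its parent A at site index 1 (C -> T)
}

def branch_changes(parent, seqs):
    """For each branch (named by its child node), flag the sites where the
    child's state differs from its parent's state.

    False -> no substitution on this branch (sky blue in the plot, per the thread)
    True  -> a substitution on this branch (some other colour)
    """
    return {
        child: [c != p for c, p in zip(seqs[child], seqs[par])]
        for child, par in parent.items()
    }

changes = branch_changes(parent, seqs)

for branch, flags in changes.items():
    row = ["changed" if f else "unchanged" for f in flags]
    print(branch, row)

# A column (site) is uniformly "unchanged" only if no branch changed there.
column_1_all_unchanged = not any(flags[1] for flags in changes.values())
```

In this toy example, site index 1 is unchanged on the branch leading to X and changed on the branch leading to Y, which is why the same column can be sky blue in one row and another colour in a different row.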