theiagen / public_health_bacterial_genomics

Re-root phylogenies #200

Closed emmadoughty closed 1 year ago

emmadoughty commented 1 year ago

For all phylogenetics workflows, it would be great to re-root the final trees so they are easier for users to visualize and interpret. We could midpoint root the trees by default, or root on an outgroup if the user specifies one (as an optional workflow input).

As an example of the impact, the two trees below (Salmonella outbreak) are identical apart from their rooting, and the same SNP matrix is shown alongside each. In the tree as produced by kSNP3 by default (top image), the outbreak cluster is split. In the bottom image, I have midpoint rooted the tree, which groups all of the outbreak isolates together and makes the tree easier to interpret.

[image: tree as produced by kSNP3 by default]

[image: the same tree after midpoint rooting]

Trees can be re-rooted with the Bio.Phylo module of the Biopython package, which was previously used for re-ordering the SNP matrices. Midpoint rooting can be done with the root_at_midpoint() method, and re-rooting on an outgroup can be done with root_with_outgroup().
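
A minimal sketch of how this could look with Bio.Phylo, assuming the tree is available as a Newick string. The helper name `reroot_tree` and its `outgroup` parameter are hypothetical and only illustrate the proposed behaviour (midpoint root by default, outgroup root when one is given); they are not taken from the existing workflows.

```python
from io import StringIO
from typing import Optional

from Bio import Phylo


def reroot_tree(newick: str, outgroup: Optional[str] = None) -> str:
    """Return a re-rooted copy of a Newick tree (hypothetical helper).

    If `outgroup` names a tip, the tree is rooted on that tip;
    otherwise the tree is midpoint rooted.
    """
    tree = Phylo.read(StringIO(newick), "newick")

    if outgroup:
        # Raises ValueError if no node with this name is in the tree.
        tree.root_with_outgroup({"name": outgroup})
    else:
        tree.root_at_midpoint()

    # Serialize the re-rooted tree back to Newick.
    out = StringIO()
    Phylo.write(tree, out, "newick")
    return out.getvalue().strip()


if __name__ == "__main__":
    example = "((A:1,B:1):1,(C:1,D:5):1);"
    print(reroot_tree(example))                # midpoint rooted
    print(reroot_tree(example, outgroup="D"))  # rooted on tip D
```

In a workflow task, the same logic would presumably read the tree file produced by the phylogenetics step and write the re-rooted tree alongside it, with the outgroup name passed through from the optional workflow input.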