evolbioinfo / gotree

Gotree is a set of command line tools and an API to manipulate phylogenetic trees. It is implemented in the Go language.
GNU General Public License v2.0

Adding tip annotation to `gotree draw` #19

Closed lucblassel closed 2 years ago

lucblassel commented 2 years ago

Goal of the PR

This is an attempt at allowing users to highlight certain tips with colored circles like this:

annotated tree

When using the gotree draw png/svg command, users can specify a tab-separated annotation file that lists, one per line, each tip to highlight: the tip name followed by its red, green and blue values.

The following file was used to generate the figure above.

Tip1    255 0   0
Tip2    0   255 0
Tip3    0   0   255

I think this could be useful when dealing with large trees, since most tree visualisation tools (e.g. iTOL or some R libraries) are not very well suited to them. gotree generates image files pretty fast, which is very useful as a cursory check to get an idea of what the tree looks like and whether anything looks weird. The ability to highlight certain tips (by lineage, sampling region, etc.) can be very useful to get a rough idea of whether the tree seems OK or whether something is wrong (e.g. a given lineage that is spread out all over the tree).
I have built a large HIV tree (90k tips, 2 subtypes) as an example. I highlighted tips from one subtype in red and the others in blue. The subtypes are well separated, which suggests that there is some merit to the tree:

big

Implementation

I added a persistent flag to the draw command (it is ignored for ASCII and cytoscape outputs).

I also added a DrawColoredCircle method to the TreeDrawer interface, which takes additional uint8 parameters to specify an RGBA color. I think it could easily be merged with the existing DrawCircle method; I just did not want to mess with the existing code.
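To illustrate the idea, here is a minimal sketch of what a colored-circle drawer could look like on a raster image. The circleDrawer interface, the rasterDrawer type, the newRasterDrawer helper and the exact DrawColoredCircle signature are assumptions made for this example; gotree's actual TreeDrawer interface may differ.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// circleDrawer is a hypothetical, cut-down stand-in for gotree's TreeDrawer
// interface; the real method set and signatures may differ.
type circleDrawer interface {
	DrawColoredCircle(x, y float64, r, g, b, a uint8)
}

// rasterDrawer paints onto an in-memory RGBA image (assumed type).
type rasterDrawer struct {
	img    *image.RGBA
	radius float64 // circle radius in pixels, fixed per drawer in this sketch
}

// newRasterDrawer allocates a transparent w x h canvas.
func newRasterDrawer(w, h int, radius float64) *rasterDrawer {
	return &rasterDrawer{img: image.NewRGBA(image.Rect(0, 0, w, h)), radius: radius}
}

// DrawColoredCircle fills every pixel whose integer coordinates lie within
// radius of (x, y). Pixels outside the canvas are silently skipped by Set.
func (d *rasterDrawer) DrawColoredCircle(x, y float64, r, g, b, a uint8) {
	c := color.NRGBA{R: r, G: g, B: b, A: a}
	// Scan a bounding box that contains the disc; the distance test decides.
	minX, maxX := int(x-d.radius), int(x+d.radius)
	minY, maxY := int(y-d.radius), int(y+d.radius)
	for py := minY; py <= maxY; py++ {
		for px := minX; px <= maxX; px++ {
			dx, dy := float64(px)-x, float64(py)-y
			if dx*dx+dy*dy <= d.radius*d.radius {
				d.img.Set(px, py, c)
			}
		}
	}
}

func main() {
	rd := newRasterDrawer(20, 20, 3)
	var d circleDrawer = rd
	d.DrawColoredCircle(10, 10, 255, 0, 0, 255) // opaque red tip marker
	fmt.Println("center:", rd.img.RGBAAt(10, 10))
	fmt.Println("corner:", rd.img.RGBAAt(0, 0))
}
```

Keeping the color as plain uint8 parameters mirrors the description above; merging this with an existing DrawCircle method would mostly be a matter of giving it a default color.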

Similarly I added a SetTipColors method to the TreeLayout interface. When implemented (in png and svg) it can add a map[string][]uint8 to the chosen layout struct, that is parsed from the annotation file.
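As a rough sketch of the parsing step, the following reads a tab-separated RGB annotation file into a map[string][]uint8. The function names parseTipColors and parseTipColorsString are hypothetical and not taken from the PR, and the error handling is only one possible choice.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// parseTipColors reads lines of the form "name<TAB>R<TAB>G<TAB>B" and returns
// a map from tip name to its three color components. Hypothetical helper,
// not the PR's actual code.
func parseTipColors(r io.Reader) (map[string][]uint8, error) {
	colors := make(map[string][]uint8)
	scanner := bufio.NewScanner(r)
	for lineNo := 1; scanner.Scan(); lineNo++ {
		line := strings.TrimSpace(scanner.Text())
		if line == "" {
			continue // tolerate blank lines
		}
		fields := strings.Split(line, "\t")
		if len(fields) != 4 {
			return nil, fmt.Errorf("line %d: expected 4 tab-separated fields, got %d", lineNo, len(fields))
		}
		rgb := make([]uint8, 3)
		for i, f := range fields[1:] {
			// bitSize 8 makes ParseUint reject values above 255.
			v, err := strconv.ParseUint(strings.TrimSpace(f), 10, 8)
			if err != nil {
				return nil, fmt.Errorf("line %d: invalid color value %q: %w", lineNo, f, err)
			}
			rgb[i] = uint8(v)
		}
		colors[fields[0]] = rgb
	}
	if err := scanner.Err(); err != nil {
		return nil, err
	}
	return colors, nil
}

// parseTipColorsString is a convenience wrapper for in-memory input.
func parseTipColorsString(s string) (map[string][]uint8, error) {
	return parseTipColors(strings.NewReader(s))
}

func main() {
	colors, err := parseTipColorsString("Tip1\t255\t0\t0\nTip2\t0\t255\t0\nTip3\t0\t0\t255\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(colors["Tip1"], colors["Tip2"], colors["Tip3"])
}
```

A layout could then look up each tip name in this map while drawing and only add a circle for tips that have an entry.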

Issues

The main issue I have is the interaction with tip labels: since the labels are drawn first, they sometimes disappear underneath the annotation, as shown here:

tree with overlapping labels and annotations

As of now I have not understood exactly how to offset the labels, so I have gone with the very hacky solution of adding spaces before and after the tip label text if the tip node is drawn. This does not work in every scenario (it doesn't work with the normal layout, for example), but I have not figured out a better way to do this. Maybe you know, @fredericlemoine?

Conclusion

I think this is a useful addition, but if there is anything wrong with it or anything you want me to change, I'm all ears.

lucblassel commented 2 years ago

I added the ability to specify an annotation file with hex color codes as well as RGB values. The following two tab-separated files should yield identical figures:

Tip1    255 0   0
Tip2    0   255 0
Tip3    0   0   255

and

Tip1    #ff0000
Tip2    #00ff00
Tip3    #0000ff

You can also specify hex color codes with an alpha channel, which will be ignored.
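A minimal sketch of how such a hex code could be decoded, assuming the alpha channel comes last ("#RRGGBBAA"); the position of the alpha byte and the function name parseHexColor are assumptions, not something stated in this PR.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// parseHexColor converts "#RRGGBB" or "#RRGGBBAA" into three RGB components.
// A trailing alpha byte is validated as hex but then dropped. Hypothetical
// helper: the PR's actual parsing code may differ.
func parseHexColor(s string) ([]uint8, error) {
	if !strings.HasPrefix(s, "#") {
		return nil, fmt.Errorf("color %q does not start with '#'", s)
	}
	b, err := hex.DecodeString(s[1:]) // accepts upper- and lower-case digits
	if err != nil {
		return nil, fmt.Errorf("color %q: %w", s, err)
	}
	if len(b) != 3 && len(b) != 4 {
		return nil, fmt.Errorf("color %q: expected 6 or 8 hex digits", s)
	}
	return b[:3], nil // ignore alpha if present
}

func main() {
	for _, s := range []string{"#ff0000", "#00ff0080", "0000ff"} {
		rgb, err := parseHexColor(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "->", rgb)
	}
}
```

With a decoder like this, both annotation formats can be normalised to the same three-byte RGB slice before they reach the layout.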

fredericlemoine commented 2 years ago

Thanks @lucblassel, it's very useful! We can discuss potential improvements later.