eaton-lab / toytree

A minimalist tree plotting library using toyplot graphs
http://eaton-lab.org/toytree
BSD 3-Clause "New" or "Revised" License

Please add an easy way of labelling clades #26

Closed SimonGreenhill closed 4 years ago

SimonGreenhill commented 4 years ago

The current way of highlighting a clade (finding x and y) is cumbersome. It would be really helpful to just be able to list a set of tips and generate the label. Ideally, I'd love something like these labels for Cercopithecidae, Hominidae etc (random tree found on the internet):

[Image: Primate-Phylogeny-Chart]

aalsabag commented 4 years ago

@SimonGreenhill I have created a solution in my fork for something like this. It's a new method called generate_rectangle(). It takes in a firstname and lastname and it would be called like this:

import toytree
import toyplot

# Random yeast phylome I pulled from phylomedb
tre = toytree.tree("./Phy000CVJN_YEAST.JTT.nw")
rtre = tre.root(wildcard="Phy000O9AG_SACPA")

canvas = toyplot.Canvas(width=750, height=800)
axes = canvas.cartesian()
axes.show = True

firstname = "Phy000H8FQ_BOTFU"
lastname = "Phy000FPL9_ASPFU"

# This is how you would call the method
axes = rtre.generate_rectangle(firstname=firstname, lastname=lastname, color=toytree.colors[1], axes=axes)

rtre.draw(tip_labels_align=True, axes=axes, tip_labels='idx');

The result is something like this: [image]

I'll raise a PR and see if these guys accept it or if they have any feedback (or if you have any feedback) that I can easily fix. But if you want to use it now, you can just pull down my code:

git clone git@github.com:aalsabag/toytree.git
cd toytree
git checkout issue-26
python setup.py install

SimonGreenhill commented 4 years ago

cool, thanks :)

eaton-lab commented 4 years ago

Hi @aalsabag @SimonGreenhill , Thanks for the pull request! Great contribution. I have been thinking about how best to fit "annotation" tools like this into the codebase. I like your suggestion to make it a separate function call from .draw() that accepts axes as an argument. Currently, an example like @SimonGreenhill showed could be made through a combination of axes.rectangle and axes.text calls, but I can see why automating this process into a single function call would be convenient.

I'm thinking I'd like to organize such functions into an 'annotate' submodule. The syntax might look like this:

# get random tree
rtre = toytree.rtree.unittree(8)

# draw tree
canvas, axes, mark1 = rtre.draw();

# add block around tips x-z
mark2 = rtre.annotate.clade_rectangle(
    axes=axes, 
    tips=['r3', 'r0'],                              # gets y-positions of rect from name matching
    offset=20,                                      # how far from farthest tip to draw rect
    width=25,                                       # width of rectangle from offset to offset+width
    style={"fill": "red", "fill-opacity": 0.5},     # rect style
    text="Clade X",                                 # name placed near rectangle
    text_style={"font-size": "14px"},               # text style (and positioning maybe?)
)

A few tricky things and notes:

  1. I think users will often want the rectangle to surround the tip names, or to be offset past the tipnames, like in Simon's example, rather than on the tip edges.
  2. If so, then getting the width of the tip names text before it is rendered can be very difficult (since the font, font-size, etc affect this size in pixels). This can make it hard to get the rectangle width just right.
  3. The vertical extent is easy (assuming layout='r') since we can just add 0.25 or so to the top and bottom, because the space between tips is always 1 by default.
  4. The horizontal width of the rectangle is where the text size matters. It is easy to just draw once then adjust, but better to automate this. If you were drawing multiple boxes to show several different clades you would probably want to just use the width of the longest name so all boxes are the same width. Calculating the 'extent' of text is surprisingly tricky, but it is done in the code at several points.
  5. My theoretical example above would allow using tipnames to select the positioning and add a name and rectangle with styling, and adjust the x-position and width as an argument in pixels. I'm not sure if in the end this is easier than just using separate rectangle and text calls from toyplot. I tend to lean towards minimalism and think the pure toyplot annotation is cleaner, with documented examples provided in the toytree cookbook.
  6. I think this would be a good first function for a new annotate submodule though, and I'm sure we could refine the syntax as we develop it and come up with other similar functions.
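
The geometry in points 3 and 4 can be sketched in a few lines. This is only a rough illustration, not the toytree implementation: the function name clade_rect_bounds and the units_per_char value are hypothetical, and the label width is a crude per-character guess in data units rather than a real text-extent calculation. It assumes layout='r' with tips spaced 1 apart on the y-axis.

```python
def clade_rect_bounds(tip_ys, labels, x_tips=0.0, pad=0.25, units_per_char=0.05):
    """Return (x0, x1, y0, y1) for a box around the selected tips.

    tip_ys: dict mapping every tip name to its y position (spacing of 1 assumed).
    labels: names of the tips that bound the clade.
    units_per_char: hypothetical guess at label width per character (data units).
    """
    ys = [tip_ys[name] for name in labels]
    # Vertical extent: pad a little above and below the outermost tips.
    y0, y1 = min(ys) - pad, max(ys) + pad
    # Horizontal extent: use the longest name in the whole tree so that
    # boxes drawn for different clades all share the same width.
    longest = max(len(name) for name in tip_ys)
    x0 = x_tips
    x1 = x_tips + longest * units_per_char
    return x0, x1, y0, y1


tip_ys = {"r0": 0, "r1": 1, "r2": 2, "r3": 3}
bounds = clade_rect_bounds(tip_ys, ["r1", "r3"])
```

The four returned values are meant to be passed on to a rectangle-drawing call on the axes (something like axes.rectangle(x0, x1, y0, y1, style=...) in toyplot, if I have the argument order right). The hard part described in point 2, converting a rendered text width in pixels into data units, is exactly what the per-character guess papers over.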

aalsabag commented 4 years ago

Thanks for the great feedback! I am definitely a fan of this proposal @eaton-lab . I'm gonna create a new module for annotations and try to get the boxes to surround the tip names as well.

There is a lot of potential for the annotation module. I want to add some graceful error handling too.

I'll get to it this weekend or next week.

eaton-lab commented 4 years ago

Hi @aalsabag ,

Thanks for the contribution! I moved and modified your code to put it into an annotation module in toytree.utils. I think this is a good place for it for now. Here is a notebook with an example implementation:

https://github.com/eaton-lab/toytree/blob/master/sandbox/26-highlight-clades.ipynb

aalsabag commented 4 years ago

@eaton-lab this implementation is awesome! Your solution using get_mrca_idx_from_tip_labels is very elegant; I must have missed that method. I didn't get what you meant when you said: "This makes it difficult to create automated functions like the .draw_clade_box() annotation function above that will work generally for any tip names". I've tried it with a bunch of different sizes and it seems to work beautifully.
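
For anyone else reading who wonders what an MRCA lookup from tip labels does conceptually, here is a minimal sketch. It is not toytree's code: the mrca function and the child-to-parent dictionary are hypothetical stand-ins for the tree object, used only to show the idea of walking up from each tip and taking the first shared ancestor.

```python
def mrca(parent, tips):
    """Return the most recent common ancestor of tips.

    parent: dict mapping each node name to its parent (the root has no entry).
    tips: non-empty list of node names.
    """
    def path_to_root(node):
        # Collect the node and all of its ancestors, nearest first.
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    # Keep only the nodes that lie on every tip's path to the root.
    common = set(path_to_root(tips[0]))
    for tip in tips[1:]:
        common &= set(path_to_root(tip))

    # Walking up from one tip, the first shared node is the MRCA.
    for node in path_to_root(tips[0]):
        if node in common:
            return node
    raise ValueError("tips do not share an ancestor")


# Toy tree in Newick form: ((a,b)x,(c,d)y)root
parent = {"a": "x", "b": "x", "c": "y", "d": "y", "x": "root", "y": "root"}
```

With that node in hand, the clade box only needs the y positions of the tips descending from it, which is why naming two tips is enough to select the whole clade.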