matsengrp / ecgtheow

Ancestral lineage reconstruction using BEAST or RevBayes

Improving display of graph #2

Closed: matsen closed this issue 7 years ago

matsen commented 7 years ago

Here's how things look on a 10,000-sample run, after discarding 1,000 burn-in samples:

[image: bf520 1-h-igh family_0 healthy tre seedpruned 100 ids aa_lineage_graph]

It seems to me that the next step, and perhaps a good place for @dunleavy005 to step in, is to shade in proportion to confidence. Note that GraphViz supports RGB colors. I'd suggest just using RGBA (red-green-blue-alpha) colors, where alpha varies according to confidence. If we wanted to drag in extra dependencies, we could use Matplotlib's color schemes, but that seems like overkill here.

It would be nice to have this shading for both the edges and the nodes. For the nodes we should consider how to do shading while still being able to see the text.
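
A minimal sketch of the alpha-shading idea, not the code in this repository: GraphViz accepts colors written as `#RRGGBBAA`, so a confidence in [0, 1] can be mapped onto the alpha byte. The helper names, the default colors, and the hand-written DOT strings are all assumptions made for illustration.

```python
def confidence_to_rgba(confidence, rgb="000000"):
    """Return a GraphViz '#RRGGBBAA' color whose alpha tracks confidence in [0, 1]."""
    clamped = min(max(confidence, 0.0), 1.0)  # guard against out-of-range input
    alpha = int(round(clamped * 255))
    return "#{}{:02x}".format(rgb, alpha)


def shaded_node(name, label, confidence, rgb="6699cc"):
    """DOT statement for a node whose fill fades with confidence.

    Only the fill is shaded; the font color stays opaque so the label
    remains readable. The fill color is an arbitrary illustrative choice.
    """
    return '"{}" [label="{}", style=filled, fillcolor="{}", fontcolor="#000000"];'.format(
        name, label, confidence_to_rgba(confidence, rgb))


def shaded_edge(src, dst, confidence):
    """DOT statement for an edge whose line color fades with confidence."""
    return '"{}" -> "{}" [color="{}"];'.format(src, dst, confidence_to_rgba(confidence))


print(shaded_node("n1", "seq1", 0.9))
print(shaded_edge("n1", "n2", 0.25))
```

Keeping `fontcolor` opaque while fading only `fillcolor` is one possible answer to the readability concern for nodes; fading the node outline instead would be another.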

I also think it might be nice to label the edges of the graph with the AA mutations. There is code to do this sort of thing in tabulate_mutations.py:

def find_muts(orig, mutated):
    return [
        "{}{}{}".format(o, idx+1, m)
        for idx, (o, m) in enumerate(zip(orig, mutated)) if o != m]
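
To illustrate how that function could feed edge labels, here is a small sketch. `find_muts` is copied from the snippet above; `edge_label` and the example sequences are hypothetical and only show the shape of the output. It assumes both sequences are aligned and of equal length, since `zip` silently truncates to the shorter one.

```python
def find_muts(orig, mutated):
    # One entry per differing position, written as <original><1-based index><mutated>.
    return [
        "{}{}{}".format(o, idx + 1, m)
        for idx, (o, m) in enumerate(zip(orig, mutated)) if o != m]


def edge_label(parent_aa, child_aa):
    """Hypothetical helper: join the mutations into a single edge label."""
    return ", ".join(find_muts(parent_aa, child_aa))


print(find_muts("QVQLV", "QVKLA"))   # ['Q3K', 'V5A']
print(edge_label("QVQLV", "QVKLA"))  # Q3K, V5A
```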

dunleavy005 commented 7 years ago

Via https://github.com/matsengrp/ecgtheow/commit/093b84e26768183d12169fcf1ed7c090c90914a0, we have an easier-to-read GraphViz plot with confidence shading for edges and nodes. (Read the mutation labels at your own risk!)

[image: bf520 1-h-igh family_0 healthy tre seedpruned 100 ids aa_lineage_graph]

matsen commented 7 years ago

This is sooooo awesome!

After you address my little comments, time to close this issue, unless we have further aesthetic ideas. Perhaps we can make the edge labels one point smaller than the node labels? Do we have any interest in making the arrows #000080 or something?

dunleavy005 commented 7 years ago

Yeah totally, the edge fonts could get smaller.

I did think about color for edges, but figured that shading represents the only variable (namely, confidence), so color would just be there for color's sake. I assumed black is the easiest to perceive because it's a plain color and everyone's used to it, but I'll try switching to #000080 (navy blue) and see if it looks better.

matsen commented 7 years ago

The only argument for color is that then the edge text and edges intersect less.
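
A rough sketch of the styling discussed here: navy edges, black edge text, and edge labels one point smaller than node labels. The function names and the node font size of 9 are assumptions; `color`, `fontcolor`, and `fontsize` are standard GraphViz edge attributes.

```python
def edge_style(node_fontsize=9, color="#000080"):
    """Edge attributes: colored line, black label text, label one point smaller than nodes."""
    return {
        "color": color,            # the arrow itself
        "fontcolor": "#000000",    # label stays black so it stands apart from the line
        "fontsize": str(node_fontsize - 1),
    }


def dot_attrs(attrs):
    """Render an attribute dict as the inside of a DOT '[...]' list, keys sorted."""
    return ", ".join('{}="{}"'.format(k, v) for k, v in sorted(attrs.items()))


print('"a" -> "b" [{}];'.format(dot_attrs(edge_style())))
```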

dunleavy005 commented 7 years ago

Behold! [image]

matsen commented 7 years ago

Gorgeous. I vote to close.

dunleavy005 commented 7 years ago

OK, I'm good with that.

matsen commented 7 years ago

Amrit, I'm getting this error:

(cft) stoat matsen/ecgtheow ‹master› » python/trees_to_counted_ancestors.py --burnin 1000 --seed BF520.1-igh --filter 100 runs/2017-07-10/BF520.1-h-IgH.family_0.healthy.tre.seedpruned.100.ids.trees data/2017-07-10/BF520.1-h-IgH.family_0.healthy.tre.seedpruned.100.ids.fasta
Traceback (most recent call last):
  File "python/trees_to_counted_ancestors.py", line 137, in <module>
    dot.attr(size='15,15', ratio='fill', fontsize='9')
TypeError: attr() takes at least 2 arguments (1 given)

What version were you running?

>>> import graphviz
>>> graphviz.__version__
'0.5.2'

dunleavy005 commented 7 years ago

OK, so it seems conda installs version 0.5.2, whereas pip installed the latest version, 0.8, on my local machine. According to the changelog (https://graphviz.readthedocs.io/en/latest/changelog.html#version-0-7), calling .attr() with only keyword arguments, as in the traceback above, isn't supported until version 0.7. I've modified the graphing code to use the pre-0.7 API (see https://github.com/matsengrp/ecgtheow/commit/bd650a187b7d52416dae06bdde4aeccfacb743e5).
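
For anyone hitting the same mismatch, here is a hedged sketch of one way to stay compatible with both versions. It is not necessarily what the commit above does. It assumes that passing the target ('graph', 'node', or 'edge') as the first positional argument to `.attr()` works on both sides of 0.7, which matches my reading of the traceback and changelog. The helper names are made up for illustration.

```python
def version_tuple(version):
    """Parse '0.5.2' into (0, 5, 2); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def has_keyword_attr(version):
    """True if this graphviz package version should accept dot.attr(key=value)."""
    return version_tuple(version) >= (0, 7)


def set_graph_attrs(dot, **attrs):
    """Set graph-level attributes via the positional form, avoiding the keyword-only call."""
    dot.attr("graph", **attrs)


# Intended usage (assumes the graphviz package is installed and `dot` is a Digraph):
#   set_graph_attrs(dot, size="15,15", ratio="fill", fontsize="9")
print(has_keyword_attr("0.5.2"), has_keyword_attr("0.8"))
```

Pinning graphviz to a single version in the conda environment would be an alternative that avoids the compatibility shim altogether.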