matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

guppy fat sets many branch lengths to 0.01 #234

joelb123 closed this issue 12 years ago

joelb123 commented 12 years ago

Hi,

Thanks for writing pplacer. I'm enjoying using it thus far.

My version is pplacer-v1.1.alpha11rc1-Linux-2.6.32, running under Gentoo Linux with everything up to date. I am using the prebuilt binary; I did not build from source.

It seems that guppy fat messes with branch lengths. It's my understanding from the pplacer paper that "pplacer placements all sit on a single reference tree with associated branch lengths fixed". Yet the branch lengths from guppy fat are not the same as in the input tree. In particular, the really fat branches get set to exactly 0.01 for my tree:

$ xml_grep branch_length guppy_fat_tree.phyloxml | grep '0.01<'

<branch_length>0.01</branch_length>

...20 more examples suppressed..

Presumably this was done for some reason involving readability, but it has the effect of obscuring the effects one is looking for. Is it easily switched off?

matsen commented 12 years ago

Yes, this is easily turned off. See the online documentation:

--min-fat       The minimum branch length for fattened edges (to increase their visibility). To turn off set to 0. Default: 0.01

If you could make some suggestions for clearer documentation we will incorporate them.
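To see whether a `--min-fat` floor is showing up in an output tree, the branch lengths in the phyloXML file can be tallied directly. The following is a minimal sketch using only the Python standard library, not part of pplacer; the function name `branch_length_counts` and the sample tree are made up for illustration, and it assumes branch lengths are stored as `<branch_length>` elements, as in the grep above.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def branch_length_counts(phyloxml_text):
    """Count how often each branch length value occurs in a phyloXML string."""
    root = ET.fromstring(phyloxml_text)
    counts = Counter()
    for elem in root.iter():
        # Drop any "{namespace}" prefix so the check works with or
        # without the phyloXML namespace declaration.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "branch_length" and elem.text:
            counts[float(elem.text)] += 1
    return counts


# Invented sample tree: two edges sit at 0.01, one is longer.
SAMPLE = """<phyloxml xmlns="http://www.phyloxml.org">
  <phylogeny rooted="true">
    <clade>
      <clade><name>A</name><branch_length>0.01</branch_length></clade>
      <clade><name>B</name><branch_length>0.01</branch_length></clade>
      <clade><name>C</name><branch_length>0.25</branch_length></clade>
    </clade>
  </phylogeny>
</phyloxml>"""

counts = branch_length_counts(SAMPLE)
print(counts[0.01])  # number of edges with length exactly 0.01
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`. A large count at exactly the floor value, which disappears after rerunning with `--min-fat 0`, would suggest those lengths came from the option rather than from the reference tree.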

joelb123 commented 12 years ago

In reply to Erick Matsen's message of Wednesday, February 29, 2012, 11:20 AM:

Oh, my apologies for failure to RTFM. Please close this issue.

I think the documentation is pretty good. A glossary would be helpful (e.g. for pquery), and example outputs from each of the guppy commands would also be useful. I was surprised, for example, when guppy tog produced a Newick file rather than a phyloXML file. I would be a bit happier if the data path were phyloXML all the way through, since it seems I have to lose information (names, colors) in making the refpkg.

The other mystery I'm trying to figure out is how to get a vector of placements and counts. From what I can tell, guppy info doesn't do this. Such a vector would be great for visualizations that I can think of but that guppy doesn't implement. I suppose I could also get this from the jplace files directly, but in the absence of example scripts this seems a bit daunting.

matsen commented 12 years ago

Re XML, you can get that if you want using the --xml flag.

The placement format is designed to be trivial to parse using any language with a JSON parser. For example, see the check_placements.py script that is included in the distribution and described in the documentation.
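As an illustration of how little parsing is involved, here is a minimal sketch (not the check_placements.py script) that builds a per-edge count vector from jplace data. It assumes the file has a `fields` list containing `edge_num` and `like_weight_ratio`, and that each pquery lists its sequences under `n` (names) or `nm` (name and multiplicity pairs); the function name `edge_counts` and the sample data are invented for the example. Counting each pquery only on its best-scoring edge is one possible definition of "counts", chosen here for simplicity.

```python
import json
from collections import Counter


def edge_counts(jplace):
    """Map edge_num -> number of sequences whose best placement is on that edge."""
    fields = jplace["fields"]
    edge_i = fields.index("edge_num")
    lwr_i = fields.index("like_weight_ratio")
    counts = Counter()
    for pquery in jplace["placements"]:
        # Best placement = row with the highest likelihood weight ratio.
        best = max(pquery["p"], key=lambda row: row[lwr_i])
        if "nm" in pquery:
            # "nm" holds [name, multiplicity] pairs.
            mult = sum(m for _name, m in pquery["nm"])
        else:
            names = pquery["n"]
            mult = 1 if isinstance(names, str) else len(names)
        counts[best[edge_i]] += mult
    return counts


# Invented sample data in the shape described above.
EXAMPLE = json.loads("""
{
  "version": 3,
  "tree": "((A:0.2{0},B:0.1{1}):0.05{2},C:0.3{3}){4};",
  "fields": ["edge_num", "likelihood", "like_weight_ratio",
             "distal_length", "pendant_length"],
  "placements": [
    {"p": [[1, -100.0, 0.8, 0.01, 0.02], [3, -101.4, 0.2, 0.01, 0.02]],
     "nm": [["read_a", 2]]},
    {"p": [[1, -90.0, 1.0, 0.0, 0.05]], "n": ["read_b"]},
    {"p": [[3, -95.0, 0.6, 0.0, 0.05], [1, -95.5, 0.4, 0.0, 0.05]],
     "n": ["read_c"]}
  ]
}
""")

result = edge_counts(EXAMPLE)
print(sorted(result.items()))
```

To run this on a real file, load it with `json.load(open(path))` and pass the result to `edge_counts`. Spreading each pquery's mass across all of its placements, weighted by `like_weight_ratio`, would be a reasonable alternative to the best-edge rule used here.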

I suggest using the mailing list for future correspondence.

Thanks,

Erick