lmaurits / phyltr

Unix filters for manipulating and analysing (samples of) phylogenetic trees represented in the Newick format
GNU General Public License v3.0
3 stars 4 forks source link

Somewhere in phyltr, all tips are aligned #8

Open Anaphory opened 6 years ago

Anaphory commented 6 years ago

I ran

phyltr cat calibrated_tree_both.nex -b 50 -s 5 | phyltr consensus | phyltr cat --no-annotations > consensus.nex

to generate a consensus tree from a beast run. The resulting tree has all tips at height zero, even though the original file contains tips at positive heights.

lmaurits commented 6 years ago

Nice catch! If I had to guess I would place the problem in the consensus command. Of course, BEASTling had no support for tip date sampling at the time this command was first written, so this is not too surprising. Hopefully the fix is straightforward, I'll have a go at this tomorrow.

Anaphory commented 6 years ago

Given that cat doesn't actually touch the trees, that's the only place. I hope the fix is not too hard.

lmaurits commented 6 years ago

Just starting to look at this now, and yikes, I already found a bad bug in how the branch lengths leading to leaves are set. Well, bad in terms of how much I cringed when I saw the code (wherein I overwrote a variable value before using it as if it still had the original value), I don't know how bad it is in terms of the outcome on the tree...

lmaurits commented 6 years ago

Ugh, this is going to require a change to the CladeProbabilities object to keep track of node heights.

lmaurits commented 6 years ago

I've pushed some code which attempts to deal with this issue. I'm not 100% sure it's correct yet. It does introduce a test failure, but I'm not sure yet whether this is due to a bug in the code or a bug in the intuitive expectations encoded by the test (or both!). Can you try this on your calibrated_tree_both.nex file and see if the results are either as you expect or at least closer to what you expect than they previously were?

lmaurits commented 6 years ago

Okay, with a little more thought it was a code bug. Tests should be passing now and I am moderately confident consensus is doing the right thing? Your consensus tree should now have leafs whose heights are the mean of their sampled heights, which should correspond to the means of your calibration intervals if you have any.