matsengrp / gctree

GCtree: phylogenetic inference of genotype-collapsed trees
https://matsengrp.github.io/gctree
GNU General Public License v3.0
16 stars 2 forks source link

Local branching metrics fixes #96

Closed wsdewitt closed 2 years ago

wsdewitt commented 2 years ago

Summary

Correct message passing

The following excerpt from Neher et al. (2014) describes the LBI message passing scheme. gctree was, in effect, incorrectly computing the sum $\sum{k\ne j}$ in Eqn (18) as $\sum{k}$ (i.e. not excluding $j$ in the summation).

image

This has been corrected in gctree.CollapsedTree.local_branching(), and an additional keyword option infinite_root_branch (default True) has been added to match the partis implementation.

Unit test

The new unit test tests/test_local_branching.py validates the LB messages, and LBI and LBR metrics, for the following simple collapsed tree:

image

The branch lengths are 1, 2, and 3 (matching the abundance indicated on the nodes), and the root node has abundance zero. @psathyrella can you produce the partis LB metrics for this tree for comparison?

matsen commented 2 years ago

Thanks for this, @wsdewitt. I hope this resolves the differences between your and Duncan's implementations. I do think that dissecting things with the unit test you added is great. However, before merging I'd want to make sure that you return the same results as Duncan on some of the bigger inferred trees. Have you had a chance to do that?

wsdewitt commented 2 years ago

Per discussion with @matsen today, we will go ahead with merging this so pipeline rerun and analyses can proceed, but will follow up with @psathyrella for partis comparisons on this unit test and on larger trees.